Details
Title | Optimizing the Process of Subtitling Educational Videos with Machine Translation Algorithms: master's graduation qualification work: field of study 45.04.04 "Intelligent Systems in the Humanities"; educational program 45.04.04_01 "Digital Linguistics (International Educational Program)" |
---|---|
Creators | Лаврентьева Екатерина Петровна |
Scientific adviser | Коган Марина Самуиловна |
Organization | Peter the Great St. Petersburg Polytechnic University. Institute of Humanities |
Imprint | Saint Petersburg, 2024 |
Collection | Graduation qualification works; General collection |
Subjects | subtitling; audiovisual texts; educational videos; machine translation; audiovisual translation |
Document type | Master graduation qualification work |
Language | Russian |
Level of education | Master |
Speciality code (FGOS) | 45.04.04 |
Speciality group (FGOS) | 450000 - Linguistics and Literary Studies |
DOI | 10.18720/SPBPU/3/2024/vr/vr24-5800 |
Rights | Password-protected access from the Internet (reading, printing, copying) |
Additionally | New arrival |
Record key | ru\spstu\vkr\33248 |
Record create date | 8/29/2024 |
This graduate qualification work investigates whether machine translation algorithms can be integrated into the subtitling of educational video content and, in particular, into the creation of high-quality English subtitles for Russian video lectures on linguistics from the YouTube channel Postnauka. The method chosen to improve the quality of machine-translated subtitles for the lectures is fine-tuning a machine translation model on a large genre-specific corpus of video lecture subtitles. To better understand the problem of automatically subtitling educational videos, a number of theoretical issues were studied: the nature of the audiovisual text and one of its genres, the educational video; the specifics of audiovisual translation (AVT) and subtitling; and the evolution of and the state of the art in machine translation (MT) algorithms. Previous work on applying MT to AVT was analyzed and drawn upon. The experimental study determined that machine translation models translate subtitles coherently when the subtitles are reformatted so that one line contains one sentence. The translation is adequate and fluent except for terminology, which the models often fail to recognize and translate correctly. After fine-tuning, the model recognized more terms. These conclusions show that machine translation algorithms can be applied to subtitling educational videos, provided that some pre-editing is done, such as changing the format of the text. The quality of term translation can be improved by augmenting the training corpus with domain-specific data, which is often done with the help of back-translation.
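The pre-editing step mentioned in the abstract (one line = one sentence) can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name `one_sentence_per_line` and the sentence-boundary rule (a `.`, `!`, `?` or `…` followed by whitespace) are assumptions made for illustration, and the rule will mis-split abbreviations such as "e.g. this".

```python
import re

# Assumed sentence boundary: whitespace that follows ., !, ? or …
# (a deliberately simple rule; abbreviations and initials are not handled).
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def one_sentence_per_line(subtitle_lines):
    """Join subtitle fragments and re-split them so each line holds one sentence.

    `subtitle_lines` is a list of subtitle text fragments in playback order,
    with timestamps already stripped (hypothetical input format).
    """
    # Merge all non-empty fragments into a single stream of text.
    text = " ".join(line.strip() for line in subtitle_lines if line.strip())
    # Normalize any remaining runs of whitespace to single spaces.
    text = re.sub(r"\s+", " ", text)
    # Split at the assumed sentence boundaries and drop empty pieces.
    return [sentence for sentence in _SENTENCE_END.split(text) if sentence]


if __name__ == "__main__":
    fragments = ["Language is a system", "of signs. It changes", "over time."]
    for sentence in one_sentence_per_line(fragments):
        print(sentence)
```

Reformatting of this kind gives the translation model complete sentences instead of fragments cut at subtitle boundaries; after translation, the output would still need to be re-segmented and re-timed to fit the subtitle display constraints.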