Details
Title | Automatic Detection of German Loanwords in English Versions of European Universities’ Websites: master’s graduation qualification work: field of study 45.04.04 “Intelligent Systems in the Humanities”; educational program 45.04.04_01 “Digital Linguistics (International Educational Program)” |
---|---|
Creators | Крупнова Елена Сергеевна |
Scientific adviser | Коган Марина Самуиловна |
Organization | Peter the Great St. Petersburg Polytechnic University. Institute of Humanities |
Imprint | Saint Petersburg, 2024 |
Collection | Graduation qualification works; General collection |
Subjects | borrowing; German loanwords; Germanisms; foreign words; assimilation; cultural contacts; European universities websites; “Web-as-Corpus” approach; corpus of texts; parsing; corpus annotation; multilingual BERT model |
Document type | Master graduation qualification work |
File type | |
Language | Russian |
Level of education | Master |
Speciality code (FGOS) | 45.04.04 |
Speciality group (FGOS) | 450000 - Linguistics and Literary Studies |
DOI | 10.18720/SPBPU/3/2024/vr/vr24-5801 |
Rights | Password-protected access from the Internet (reading) |
Additionally | New arrival |
Record key | ru\spstu\vkr\33249 |
Record create date | 8/29/2024 |
Allowed Actions
The 'Read' action becomes available after you log in or access the site from another network
Group | Anonymous |
---|---|
Network | Internet |
This master’s thesis addresses the automatic detection of German loanwords in the English versions of European universities’ websites. It defines the concepts of “borrowing” and “assimilation”; identifies the causes of borrowing, the signs of assimilation, and the stages by which borrowed vocabulary is adapted; reviews the main classifications of foreign words and the ways they pass from a donor language to a recipient language; and outlines the historical periods in which lexical units of different origins entered English. Methods for the automatic extraction of loanwords from various languages are reviewed, along with the “Web-as-Corpus” approach. In the practical part, a corpus of 22,465 sentences drawn from two sections, “History” and “News”, was built automatically from the English-version websites of 11 European universities. The corpus was manually labeled with two tags: “no German loanword” and “German loanword”. The multilingual BERT language model was then fine-tuned on this corpus, and a method for automatically extracting Germanisms from texts was developed. In addition, the 42 Germanisms found were analyzed in detail with respect to orthography, pronunciation, morphology, and semantics. All loanwords were divided into three groups according to their degree of assimilation: fully assimilated words, partially assimilated words, and unassimilated words denoting concepts peculiar to the source language.
Network | User group | Action |
---|---|---|
ILC SPbPU Local Network | All | |
Internet | Authorized users SPbPU | |
Internet | Anonymous | |
Access count: 2
Last 30 days: 2