Details

Title Automatic Detection of German Loanwords in English Versions of European Universities’ Websites: master's graduation qualification work: field of study 45.04.04 “Intelligent Systems in the Humanities”; educational program 45.04.04_01 “Digital Linguistics (International Educational Program)”
Creators Крупнова Елена Сергеевна
Scientific adviser Коган Марина Самуиловна
Organization Peter the Great St. Petersburg Polytechnic University. Institute of Humanities
Imprint Saint Petersburg, 2024
Collection Graduation qualification works; General collection
Subjects borrowing; German loanwords; Germanisms; foreign words; assimilation; cultural contacts; European universities websites; “Web-as-Corpus” approach; corpus of texts; parsing; corpus annotation; multilingual BERT model
Document type Master graduation qualification work
File type PDF
Language Russian
Level of education Master
Speciality code (FGOS) 45.04.04
Speciality group (FGOS) 450000 - Linguistics and Literary Studies
DOI 10.18720/SPBPU/3/2024/vr/vr24-5801
Rights Password-protected access from the Internet (read only)
Additionally New arrival
Record key ru\spstu\vkr\33249
Record create date 8/29/2024

This master’s thesis addresses the automatic detection of German loanwords in the English versions of European universities’ websites. It defines the concepts of “borrowing” and “assimilation”; identifies the causes of this linguistic process, the signs of assimilation, and the stages by which borrowed vocabulary is adapted; reviews the main classifications of borrowings and the ways foreign words pass from a donor language to a recipient language; and outlines the historical periods in which foreign lexical units of various origins entered English. Methods for the automatic extraction of loanwords from different languages are reviewed, along with the “Web-as-Corpus” approach. In the practical part, a corpus of 22,465 sentences was built automatically from two sections, “History” and “News”, of 11 English-version websites of European universities. The corpus was manually labeled with two tags, “no German loanword” and “German loanword”. The multilingual BERT language model was then fine-tuned on this corpus, and a method for the automatic extraction of Germanisms from texts was developed. In addition, 42 identified Germanisms were analyzed in detail with respect to orthography, pronunciation, morphology, and semantics. All loanwords were divided into three groups according to their degree of assimilation: fully assimilated words, partially assimilated words, and unassimilated words denoting concepts peculiar to the source language.
