Understanding cross-lingual abilities in large multilingual language models
Diploma thesis (defended)
Permanent link
http://hdl.handle.net/20.500.11956/184175
Identifiers
Study Information System: 257456
Collections
- Kvalifikační práce [10932]
Author
Advisor
Referee
Limisiewicz, Tomasz
Faculty / Institute
Faculty of Mathematics and Physics
Discipline
Computer Science - Language Technologies and Computational Linguistics
Department
Institute of Formal and Applied Linguistics
Date of defense
6. 9. 2023
Publisher
Univerzita Karlova, Matematicko-fyzikální fakulta
Language
English
Grade
Excellent
Keywords (Czech)
transfer learning|cross-lingual learning|low-resource|language models
Keywords (English)
transfer learning|cross-lingual learning|low-resource|language models
Abstract
Cross-lingual abilities have become evident in large multilingual language models over the past few years. However, why and under what circumstances they work is not entirely clear. In this work, we move towards a better understanding of these aspects in a specific subset of multilingual models, namely modular multilingual models with cross-lingual transfer learning abilities. We try to quantify the claims made in Pfeiffer et al. [2022] regarding their proposed model, X-MOD, since it was tested in a very specific setting that may not align with common low-resource settings. Specifically, we evaluate how the following factors affect downstream performance: the amount of available pre-training data; training hyperparameters such as the number of training steps and the checkpoint selection criterion; and the amount of overlapping lexicon between languages. Based on our findings, we also aim to provide guidelines on how to best use X-MOD, especially from a low-resource perspective.