Show simple item record

dc.contributor.author: Kvapilíková, Ivana
dc.date.accessioned: 2025-06-16T13:56:13Z
dc.date.available: 2025-06-16T13:56:13Z
dc.date.issued: 2025-06
dc.identifier.isbn: 9788024660783
dc.identifier.uri: http://hdl.handle.net/20.500.11956/198746
dc.description.abstract: For decades, machine translation between natural languages fundamentally relied on human-translated documents known as parallel texts, which provide direct correspondences between source and target sentences. The notion that translation systems could be trained on non-parallel texts, written independently in different languages, was long considered unrealistic. Fast forward to the era of large language models (LLMs), and we now know that, given sufficient computational resources, LLMs exploit incidental parallelism in their vast training data, i.e., they identify parallel messages across languages and learn to translate without explicit supervision. LLMs have since demonstrated the ability to perform translation tasks with impressive quality, rivaling systems trained specifically for translation. This monograph explores the fascinating journey that led to this point, focusing on the development of unsupervised machine translation. Long before the rise of LLMs, researchers were exploring the idea that translation could be achieved without parallel data. Their efforts centered on motivating models to discover cross-lingual correspondences through techniques such as the mapping of word embedding spaces, back-translation, and parallel sentence mining. Although much of the research described in this monograph predates the mainstream adoption of LLMs, the insights gained remain highly relevant: they offer a foundation for understanding how and why LLMs are able to translate. [en]
dc.language.iso: en
dc.publisher: Nakladatelství Karolinum [cs]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: linguistics [en]
dc.subject: translation [en]
dc.subject: language [en]
dc.subject: LLM [en]
dc.subject: machine translation [en]
dc.title: Unsupervised Machine Translation: How Machines Learn to Understand Across Languages [en]
dc.type: kniha [cs_CZ]
dc.type: book [en_US]
dcterms.accessRights: openAccess
dcterms.extent: 176
dc.publisher.publicationPlace: Praha [cs]
uk.internal-type: uk_publication
oaire.fundingReference.awardNumber: 19-26934X
oaire.fundingReference.funderName: Grantová agentura České republiky [cs]
oaire.fundingReference.fundingStream: Neural Representations in Multi-modal and Multi-lingual Modeling [en]
dc.identifier.isbnPDF: 9788024660844



Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by/4.0/
