Vzdálené čtení současné české beletrie

Panušková, Charlotte

Distant reading of contemporary Czech fiction

dc.contributor.advisor	Pýcha, Čeněk
dc.creator	Panušková, Charlotte
dc.date.accessioned	2024-04-08T11:35:44Z
dc.date.available	2024-04-08T11:35:44Z
dc.date.issued	2024
dc.identifier.uri	http://hdl.handle.net/20.500.11956/188349
dc.description.abstract	This thesis explores the topic modelling of contemporary Czech prose using LDA and Top2Vec algorithms. It examines how the results of topic modelling correspond to existing knowledge in literary history and further analyses how these findings relate to classical literary theory. The study emphasizes the connection between digital methods of text analysis and traditional literary-historical and theoretical perspectives, offering a new interpretation of modern methods within the literary context. For modelling purposes, the corpus from the Czech National Corpus was used. The corpus was cleaned and divided into three subcorpora based on the publication date of the works. Models of both LDA and Top2Vec algorithms were created from all three subcorpora. To select the most accurate model, the thesis employs the coherence score metric Cv. The results of the models are then compared with present knowledge in literary history. The conclusion underscores that topic modelling serves as an approximation of the literary system rather than a direct means of revealing themes.	en_US
dc.description.abstract	Tato práce se zabývá tematickým modelováním současné české prózy pomocí algoritmů LDA a Top2Vec. Zkoumá, jak výsledky tematického modelování korespondují s dosavadními poznatky literární historie. Dále pak analyzuje, jak se tyto výsledky promítají do klasické literární teorie. Práce tak klade důraz na propojení mezi digitálními metodami analýzy textů a klasickými literárněhistorickými a teoretickými pohledy, čímž přináší nový pohled na interpretaci moderních metod v literárním kontextu. K modelování byl využit veřejně dostupný korpus Českého národního korpusu. Korpus byl pro účely práce očištěn a rozdělen do tří subkorpusů podle data prvního vydání děl. Modely algoritmů LDA a Top2Vec byly vytvořeny ze všech tří subkorpusů. Pro výběr nejpřesnějšího modelu práce využívá metriku skóre koherence Cv. Výsledky modelů jsou následně porovnány s dosavadními poznatky literární historie. Práce na závěr zdůrazňuje, že tematické modelování představuje spíše aproximaci literárního systému než prostředek k přímému odhalování témat.	cs_CZ
dc.language	Čeština	cs_CZ
dc.language.iso	cs_CZ
dc.publisher	Univerzita Karlova, Filozofická fakulta	cs_CZ
dc.subject	vzdálené čtení\|digitální literární věda\|tematické modelování\|digitální humanitní vědy\|současná česká próza	cs_CZ
dc.subject	digital literary studies\|digital humanities\|topic modelling\|contemporary Czech fiction\|distant reading	en_US
dc.title	Vzdálené čtení současné české beletrie	cs_CZ
dc.type	diplomová práce	cs_CZ
dcterms.created	2024
dcterms.dateAccepted	2024-01-29
dc.description.department	Institute of Information Studies and Librarianship - New Media Studies	en_US
dc.description.department	Ústav informačních studií - studia nových médií	cs_CZ
dc.description.faculty	Faculty of Arts	en_US
dc.description.faculty	Filozofická fakulta	cs_CZ
dc.identifier.repId	260321
dc.title.translated	Distant reading of contemporary Czech fiction	en_US
dc.contributor.referee	Šlerka, Josef
thesis.degree.name	Mgr.
thesis.degree.level	navazující magisterské	cs_CZ
thesis.degree.discipline	New Media Studies	en_US
thesis.degree.discipline	Studia nových médií	cs_CZ
thesis.degree.program	New Media Studies	en_US
thesis.degree.program	Studia nových médií	cs_CZ
uk.thesis.type	diplomová práce	cs_CZ
uk.taxonomy.organization-cs	Filozofická fakulta::Ústav informačních studií - studia nových médií	cs_CZ
uk.taxonomy.organization-en	Faculty of Arts::Institute of Information Studies and Librarianship - New Media Studies	en_US
uk.faculty-name.cs	Filozofická fakulta	cs_CZ
uk.faculty-name.en	Faculty of Arts	en_US
uk.faculty-abbr.cs	FF	cs_CZ
uk.degree-discipline.cs	Studia nových médií	cs_CZ
uk.degree-discipline.en	New Media Studies	en_US
uk.degree-program.cs	Studia nových médií	cs_CZ
uk.degree-program.en	New Media Studies	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.cs	Tato práce se zabývá tematickým modelováním současné české prózy pomocí algoritmů LDA a Top2Vec. Zkoumá, jak výsledky tematického modelování korespondují s dosavadními poznatky literární historie. Dále pak analyzuje, jak se tyto výsledky promítají do klasické literární teorie. Práce tak klade důraz na propojení mezi digitálními metodami analýzy textů a klasickými literárněhistorickými a teoretickými pohledy, čímž přináší nový pohled na interpretaci moderních metod v literárním kontextu. K modelování byl využit veřejně dostupný korpus Českého národního korpusu. Korpus byl pro účely práce očištěn a rozdělen do tří subkorpusů podle data prvního vydání děl. Modely algoritmů LDA a Top2Vec byly vytvořeny ze všech tří subkorpusů. Pro výběr nejpřesnějšího modelu práce využívá metriku skóre koherence Cv. Výsledky modelů jsou následně porovnány s dosavadními poznatky literární historie. Práce na závěr zdůrazňuje, že tematické modelování představuje spíše aproximaci literárního systému než prostředek k přímému odhalování témat.	cs_CZ
uk.abstract.en	This thesis explores the topic modelling of contemporary Czech prose using LDA and Top2Vec algorithms. It examines how the results of topic modelling correspond to existing knowledge in literary history and further analyses how these findings relate to classical literary theory. The study emphasizes the connection between digital methods of text analysis and traditional literary-historical and theoretical perspectives, offering a new interpretation of modern methods within the literary context. For modelling purposes, the corpus from the Czech National Corpus was used. The corpus was cleaned and divided into three subcorpora based on the publication date of the works. Models of both LDA and Top2Vec algorithms were created from all three subcorpora. To select the most accurate model, the thesis employs the coherence score metric Cv. The results of the models are then compared with present knowledge in literary history. The conclusion underscores that topic modelling serves as an approximation of the literary system rather than a direct means of revealing themes.	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Filozofická fakulta, Ústav informačních studií - studia nových médií	cs_CZ
thesis.grade.code	1
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O