
Corpus of the Czech language of the 2nd half of the 19th century
dc.contributor.author: Kučera, Karel
dc.contributor.author: Najbrtová, Kateřina
dc.contributor.author: Pivoňková, Klára
dc.contributor.author: Řehořková, Anna
dc.contributor.author: Stluka, Martin
dc.date.accessioned: 2019-07-11T08:27:42Z
dc.date.available: 2019-07-11T08:27:42Z
dc.date.issued: 2019
dc.identifier.uri: http://hdl.handle.net/20.500.11956/107808
dc.description.abstract: The paper describes the principles and structure of the one-million-word DIA1900 Corpus built at the Institute of the Czech National Corpus (CNC) in Prague, which covers the language of Czech texts published between 1851 and 1900. The DIA1900 is planned for publication by June 2020 and is to be followed by the DIA1850, a corpus built on the same principles but focused on the first half of the 19th century. The DIA1900 adopts both the balanced representation of the three major text types (belles lettres, journalistic texts, technical/scientific texts) and the system of morphological tagging used in the synchronic corpora of the CNC project, which facilitates the diachronic comparison of two stages in the development of Czech. The paper briefly describes the structure of the morphological terminology used in the lemmatisation and tagging of the corpus, and two tools designed to help search 19th-century texts, whose orthographic consistency fluctuates and which show the phonological and morphological variation characteristic of the language of the period: (1) a multiple select/suggest feature, which reminds the user, before the lemma search is started, that non-standard orthographic and phonological variants of the lemma occur in the corpus, and (2) the position attribute, which informs the user of the ambiguous status of a word in the text resulting from a misprint or misspelling, a damaged page, etc. [en]
dc.format: pdf
dc.language.iso: cs
dc.publisher: Univerzita Karlova, Filozofická fakulta
dc.source: Časopis pro moderní filologii (Journal for Modern Philology), 2019, 101, 1, 92-98
dc.title: Korpus českého jazyka 2. poloviny 19. století [cs]
dc.type: Vědecký článek (scientific article) [cs]
dcterms.accessRights: openAccess
dcterms.license: http://creativecommons.org/licenses/by-nc-nd/2.0/
dc.title.translated: Corpus of the Czech language of the 2nd half of the 19th century [en]
dc.publisher.publicationPlace: Praha
uk.internal-type: uk_publication
dc.identifier.doi: 10.14712/23366591.2019.1.6
dc.description.startPage: 92
dc.description.endPage: 98
dcterms.isPartOf.name: Časopis pro moderní filologii (Journal for Modern Philology) [cs]
dcterms.isPartOf.journalYear: 2019
dcterms.isPartOf.journalVolume: 2019
dcterms.isPartOf.journalIssue: 1
dcterms.isPartOf.issn: 2336-6591
dc.relation.isPartOfUrl: https://casopispromodernifilologii.ff.cuni.cz
dc.subject.keyword: diachronní korpus [cs]
dc.subject.keyword: lemmatizace [cs]
dc.subject.keyword: morfologické značkování [cs]
dc.subject.keyword: poobrozenská čeština [cs]
dc.subject.keyword: čeština 19. století [cs]
dc.subject.keyword: hlásková variabilita [cs]
dc.subject.keyword: pravopisná variabilita [cs]
dc.subject.keyword: morfologická variabilita [cs]
dc.subject.keyword: diachronic corpus [en]
dc.subject.keyword: lemmatisation [en]
dc.subject.keyword: morphological tagging [en]
dc.subject.keyword: post-national revival Czech [en]
dc.subject.keyword: 19th century Czech [en]
dc.subject.keyword: phonological variability [en]
dc.subject.keyword: orthographic variability [en]
dc.subject.keyword: morphological variability [en]




© 2017 Charles University, Central Library, Ovocný trh 560/5, 116 36 Praha 1; email: admin-repozitar [at] cuni.cz

Each constituent part of Charles University is responsible for adherence to all provisions of the copyright law.

Notice: Any retrieved information shall not be used for any commercial purposes or claimed as results of studying, scientific or any other creative activities of any person other than the author.
