Gender Associations in the Czech Lexicon and Their Impact on Language Processing

Preininger, Mikuláš

Genderové asociace v české slovní zásobě a jejich vliv na zpracování jazyka

dissertation thesis (DEFENDED)

View/Open

Record of the defense proceedings (334.7Kb)

Permanent link

http://hdl.handle.net/20.500.11956/199705

Identifiers

Study Information System: 228326

Referee

Šimík, Radek

Keuleers, Emmanuel

Faculty / Institute

Faculty of Arts

Discipline

Czech Language

Department

Institute of Czech Language and Theory of Communication

Date of defense

3. 6. 2025

Publisher

Univerzita Karlova, Filozofická fakulta

Language

English

Grade

Pass

Keywords (Czech)

lexikální normy|genderové asociace|jmenný rod|generické maskulinum

Keywords (English)

lexical norms|gender associations|grammatical gender|masculine generics

Abstract (Czech, translated)

The main aim of this thesis is to explore how lexical norms can be used in research on language and gender. In recent years, several datasets containing ratings of thousands of word meanings with respect to gender associations have been published (e.g., Scott et al., 2019; Vankrunkelsven et al., 2024). These data have proven valuable both as a source of controlled stimuli in experimental studies and as a validated source of information in observational research. However, the gender norms themselves have not yet been examined thoroughly enough to identify potentially confounding factors that influence the ratings. A deeper exploration could also yield insights into the semantic structure of gender associations and lead to new research questions. This thesis aims to fill this gap by examining ratings from the Sociolex dataset, which captures gender associations for 3,000 Czech word meanings (Preininger et al., in prep). Chapter 2 introduces the Sociolex data and describes the methodology of the study and the distribution of the ratings. Although the dataset proved to be of high quality, the analyses show that gender-association ratings are systematically influenced by various factors, above all grammatical gender, which affected associations even for inanimate nouns. Chapter 3 compares the gender-association ratings with other...

Abstract (English)

The overarching aim of this thesis is to examine how human ratings of word meanings can be used in research on language and gender. In recent years, multiple datasets containing human judgments on gender associations for thousands of word meanings have been published (e.g., Scott et al., 2019; Vankrunkelsven et al., 2024). These norming datasets have proven valuable both as sources of controlled stimuli in experimental studies and as information resources in observational research. However, gender norms themselves have yet to be thoroughly examined to identify potential confounding factors that may influence the actual ratings. A deeper exploration could also reveal insights into the semantic structure of gender associations and inspire new research questions. This thesis addresses this gap by analyzing gender ratings from the Sociolex dataset, which captures gender associations for 3,000 Czech word meanings (Preininger et al., submitted). Chapter 2 introduces the Sociolex norms, describing their methodology and the distribution of the ratings. While the dataset was shown to be of high quality, analyses reveal that gender ratings are systematically influenced by various factors, most notably grammatical gender, which is highly pervasive in the grammar of Czech and affects associations even for inanimate nouns...
