Detekcia intenzity v postojovej analýze češtiny

Dargaj, Jakub

Detection of Intensity in Sentiment Analysis of Czech
Detekce intenzity v postojové analýze češtiny

dc.contributor.advisor	Tamchyna, Aleš
dc.creator	Dargaj, Jakub
dc.date.accessioned	2024-08-09T14:10:57Z
dc.date.available	2024-08-09T14:10:57Z
dc.date.issued	2017
dc.identifier.uri	http://hdl.handle.net/20.500.11956/86211
dc.description.abstract	Postojová analýza sa zaoberá automatickou extrakciou subjektívnych informácií z textu. Cieľom práce je predpovedať intenzitu postoja v českých textoch. Na riešenie tejto úlohy sme pripravili dataset filmových hodnotení užívateľov Česko-Slovenskej filmovej databázy. Porovnávame niekoľko metód strojového učenia, pričom sa zameriavame na extrakciu číselných atribútov z textových dát. S využitím konvolučných neurónových sietí a korpusovo závislého trénovania vektorových reprezentácií slov sa nám podarilo prekonať základné modely a dosiahnuť presnosť podobnú najnovším výsledkom v tejto oblasti. V práci taktiež analyzujeme model logistickej regresie na porovnanie použitých jazykových prostriedkov medzi recenziami s rôznymi stupňami hodnotenia.	cs_CZ
dc.description.abstract	Sentiment analysis is concerned with the automatic extraction of subjective information from text. The goal of this thesis is to predict the intensity of attitude in Czech texts. To solve this task, we prepared a dataset of movie reviews written by users of the Czech-Slovak Film Database. We compare several machine learning methods, focusing on feature extraction from text data. Using convolutional neural networks and corpus-dependent training of word embeddings, we surpassed the baseline models and achieved accuracy similar to the most recent results in this field. We also analyze the logistic regression model in order to compare the vocabulary used in reviews with different ratings.	en_US
dc.language	Slovenčina	cs_CZ
dc.language.iso	sk_SK
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	postojová analýza	cs_CZ
dc.subject	strojové učení	cs_CZ
dc.subject	počítačová lingvistika	cs_CZ
dc.subject	sentiment analysis	en_US
dc.subject	machine learning	en_US
dc.subject	computational linguistics	en_US
dc.title	Detekcia intenzity v postojovej analýze češtiny	sk_SK
dc.type	bakalářská práce	cs_CZ
dcterms.created	2017
dcterms.dateAccepted	2017-06-20
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.identifier.repId	188691
dc.title.translated	Detection of Intensity in Sentiment Analysis of Czech	en_US
dc.title.translated	Detekce intenzity v postojové analýze češtiny	cs_CZ
dc.contributor.referee	Mareček, David
thesis.degree.name	Bc.
thesis.degree.level	bakalářské	cs_CZ
thesis.degree.discipline	General Computer Science	en_US
thesis.degree.discipline	Obecná informatika	cs_CZ
thesis.degree.program	Computer Science	en_US
thesis.degree.program	Informatika	cs_CZ
uk.thesis.type	bakalářská práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Obecná informatika	cs_CZ
uk.degree-discipline.en	General Computer Science	en_US
uk.degree-program.cs	Informatika	cs_CZ
uk.degree-program.en	Computer Science	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.cs	Postojová analýza sa zaoberá automatickou extrakciou subjektívnych informácií z textu. Cieľom práce je predpovedať intenzitu postoja v českých textoch. Na riešenie tejto úlohy sme pripravili dataset filmových hodnotení užívateľov Česko-Slovenskej filmovej databázy. Porovnávame niekoľko metód strojového učenia, pričom sa zameriavame na extrakciu číselných atribútov z textových dát. S využitím konvolučných neurónových sietí a korpusovo závislého trénovania vektorových reprezentácií slov sa nám podarilo prekonať základné modely a dosiahnuť presnosť podobnú najnovším výsledkom v tejto oblasti. V práci taktiež analyzujeme model logistickej regresie na porovnanie použitých jazykových prostriedkov medzi recenziami s rôznymi stupňami hodnotenia.	cs_CZ
uk.abstract.en	Sentiment analysis is concerned with the automatic extraction of subjective information from text. The goal of this thesis is to predict the intensity of attitude in Czech texts. To solve this task, we prepared a dataset of movie reviews written by users of the Czech-Slovak Film Database. We compare several machine learning methods, focusing on feature extraction from text data. Using convolutional neural networks and corpus-dependent training of word embeddings, we surpassed the baseline models and achieved accuracy similar to the most recent results in this field. We also analyze the logistic regression model in order to compare the vocabulary used in reviews with different ratings.	en_US
uk.file-availability	P
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	1
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O
dc.identifier.lisID	990021442800106986