Tackling Hallucinations in Chart Summarization

Obaid ul Islam, Saad

Odstraňování halucinací při sumarizaci grafů

dc.contributor.advisor	Dušek, Ondřej
dc.creator	Obaid ul Islam, Saad
dc.date.accessioned	2023-03-22T09:30:12Z
dc.date.available	2023-03-22T09:30:12Z
dc.date.issued	2023
dc.identifier.uri	http://hdl.handle.net/20.500.11956/179356
dc.description.abstract	Saad Obaid ul Islam (Charles University, Saarland University). Tackling Hallucinations in Chart Summarization. Abstract: Information visualizations such as bar charts, line charts, and pie charts are a common way of communicating quantitative data. They are used to gain important insights and make well-informed decisions. Automatic chart summarization is the task of explaining and summarizing the key takeaways from a chart. Like other natural language generation (NLG) systems, chart summarization systems suffer from a phenomenon called hallucination. Hallucinations occur when the system generates text that is not grounded in the input. In this work, we tackle the problem of hallucinations in chart summarization. Our analysis shows that the training data contains a lot of additional information that leads to hallucinations during inference. We also found that reducing long-distance dependencies and adding chart-related information such as titles and legends improves the overall performance of the system. Furthermore, we propose a natural language inference (NLI) based method to clean the training data and show that our method produces faithful summaries.	en_US
dc.language	English	cs_CZ
dc.language.iso	en_US
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	chart-to-text generation|natural language generation|data-to-text generation|neural generative models|natural language processing|deep learning	en_US
dc.subject	generování popisu grafu|generování přirozeného jazyka|generování textu z dat|neuronové generativní modely|zpracování přirozeného jazyka|hluboké učení	cs_CZ
dc.title	Tackling Hallucinations in Chart Summarization	en_US
dc.type	diplomová práce	cs_CZ
dcterms.created	2023
dcterms.dateAccepted	2023-01-31
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.identifier.repId	247574
dc.title.translated	Odstraňování halucinací při sumarizaci grafů	cs_CZ
dc.contributor.referee	Rosa, Rudolf
thesis.degree.name	Mgr.
thesis.degree.level	navazující magisterské	cs_CZ
thesis.degree.discipline	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
thesis.degree.discipline	Computer Science - Language Technologies and Computational Linguistics	en_US
thesis.degree.program	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
thesis.degree.program	Computer Science - Language Technologies and Computational Linguistics	en_US
uk.thesis.type	diplomová práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
uk.degree-discipline.en	Computer Science - Language Technologies and Computational Linguistics	en_US
uk.degree-program.cs	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
uk.degree-program.en	Computer Science - Language Technologies and Computational Linguistics	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.en	Saad Obaid ul Islam (Charles University, Saarland University). Tackling Hallucinations in Chart Summarization. Abstract: Information visualizations such as bar charts, line charts, and pie charts are a common way of communicating quantitative data. They are used to gain important insights and make well-informed decisions. Automatic chart summarization is the task of explaining and summarizing the key takeaways from a chart. Like other natural language generation (NLG) systems, chart summarization systems suffer from a phenomenon called hallucination. Hallucinations occur when the system generates text that is not grounded in the input. In this work, we tackle the problem of hallucinations in chart summarization. Our analysis shows that the training data contains a lot of additional information that leads to hallucinations during inference. We also found that reducing long-distance dependencies and adding chart-related information such as titles and legends improves the overall performance of the system. Furthermore, we propose a natural language inference (NLI) based method to clean the training data and show that our method produces faithful summaries.	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	1
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O

Files in this record

Name: 120437011.pdf
Size: 2.246 MB
Format: application/pdf
Description: Thesis text

Name: 120437012.pdf
Size: 52.33 KB
Format: application/pdf
Description: Abstract (English)

Name: 120439918.pdf
Size: 525.2 KB
Format: application/pdf
Description: Supervisor's review

Name: 120439690.pdf
Size: 79.59 KB
Format: application/pdf
Description: Opponent's review

Name: 120440350.pdf
Size: 347.3 KB
Format: application/pdf
Description: Record of the defence proceedings

This record appears in the following collections

Qualification theses [11199]
Theses
