Detekce chyb v rozpoznávání mluvené řeči

Tobolíková, Petra

Error detection in speech recognition

diploma thesis (DEFENDED)

View/Open

Record of the defense proceedings (299.3 KB)

Permanent link

http://hdl.handle.net/20.500.11956/14862

Identifiers

Study Information System: 44108

CU Catalogue: 990011220130106986

Referee

Peterek, Nino

Faculty / Institute

Faculty of Mathematics and Physics

Discipline

Computational Linguistics

Department

Institute of Formal and Applied Linguistics

Date of defense

26. 5. 2008

Publisher

Univerzita Karlova, Matematicko-fyzikální fakulta

Language

Czech

Grade

Excellent

This diploma thesis deals with error detection in speech recognition. First, the principles of current speech recognition are briefly introduced. The problems that speech recognition faces, and that keep it from working flawlessly, are outlined. Existing known methods for computing the so-called confidence score are then presented. The following part describes three machine learning methods that were used for the implemented error detection: logistic regression, neural networks, and decision trees. Attributes of words in the recognized sentences are then proposed; these serve as the input variables of the machine learning methods. The output variable is an estimate of the confidence score. The thesis demonstrates how the implementations of the machine learning methods in the R software were used. The methods were tested on recordings of Czech radio and television. The results of the individual methods are compared using ROC curves, the standard error of detection, and the possible reduction of WER in the recognized sentences. A description of the program that is part of the thesis is also included. Finally, the word properties that proved to be effective attributes for error detection are summarized.

Abstract (English)

This thesis tackles the problem of error detection in speech recognition. First, the principles of recent approaches to automatic speech recognition are introduced. Various deficiencies of speech recognition that cause imperfect recognition results are outlined. Currently known methods of "confidence score" computation are then described. The next chapter introduces three machine learning algorithms which were employed in the error detection methods implemented in this thesis: logistic regression, artificial neural networks and decision trees. These machine learning methods use certain attributes of the recognized words as input variables and predict an estimated confidence score value. The implementations of these methods in the open source software "R" have been used, and their usage is shown. These methods have been tested on Czech radio and TV broadcasts. The results obtained by those methods are compared using ROC curves, standard errors and the possible (oracle) reduction of the word error rate (WER). Programming documentation of the code used in the implementation is enclosed as well. Finally, efficient word attributes for error detection are summarized.
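
The approach summarized in the abstract (word attributes as input, an estimated confidence score as output, comparison by ROC curves) can be illustrated with a minimal sketch. The thesis itself used implementations in R; the Python below is only an illustration and is not the author's code. The two word attributes, the toy data, and all function names are hypothetical, and the area under the ROC curve is used here as one simple way to summarize a ROC comparison.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                   lr: float = 0.5, epochs: int = 3000) -> Tuple[List[float], float]:
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def confidence(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Estimated probability that a recognized word is correct."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve: the fraction of (correct, incorrect) word
    pairs in which the correct word gets the higher score (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each row is one recognized word described by two
# made-up attributes scaled to [0, 1]; label 1 = recognized correctly, 0 = error.
X = [[0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.85, 0.5],
     [0.2, 0.3], [0.3, 0.1], [0.1, 0.4], [0.4, 0.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_logistic(X, y)
scores = [confidence(w, b, x) for x in X]
auc = roc_auc(scores, y)
print("confidence scores:", [round(s, 3) for s in scores])
print("AUC on the toy data:", auc)
```

Words whose estimated confidence falls below a chosen threshold would be flagged as likely recognition errors; sweeping that threshold traces out the ROC curve. A real evaluation would score held-out recordings rather than the training data used in this toy example.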
