Language Modeling for Amharic Automatic Speech Recognition Systems

Mekonnen, Mulugeta

Files

Mulugeta Mekonnen.pdf (1.31 MB)

Date

2013-03

Authors

Mekonnen, Mulugeta

Publisher

Addis Ababa University

Abstract

A language model (LM) plays a critical role in automatic speech recognition and other NLP tasks by assigning a probability to each hypothesized word sequence. Considerable research has gone into acoustic modelling to improve the performance of Amharic speech recognition systems (SRS), but little effort has been made to supplement it with a proper LM. The aim of this research is to build LMs for Amharic, the official language of the federal government of Ethiopia, and to study how they improve the performance of an Amharic SRS. Accordingly, a text corpus of 9,079,766 tokens was prepared, and various word n-gram, class-based and interpolated LMs were built using the SRILM toolkit. Both perplexity and word recognition accuracy (WRA) were used to evaluate the LMs. Although LMs of order 2 to 7 were built, the 4-gram LM proved to be the best n-gram LM. The relative performance of different smoothing algorithms was also compared, and unmodified Kneser-Ney smoothing outperformed all others. Moreover, interpolated models performed better than back-off models. To tackle the data sparsity problem, different class-based LMs were also developed, using IBM clustering algorithms to group words into clusters automatically. On their own, the class-based LMs performed worse than the word-based LMs because of their generic nature. However, interpolating class-based with word-based models led to a considerable perplexity reduction over both the pure word-based and the pure class-based LMs. The word n-gram, class-based and interpolated LMs were finally integrated, in a lattice rescoring framework, with the baseline speech recognizer, which has a WRA of 74.52%. WRA results of 80.9%, 66.0% and 82.7% were achieved using the word-based n-gram, class-based and interpolated LMs respectively. Overall, an absolute WRA gain of 8.18% (from 74.52% to 82.7%) was achieved by applying the interpolated class-based LMs to the baseline recognizer, which clearly shows that the LM is an indispensable part of the speech recognition task.
Class-based language models improved perplexity and WRA only when combined with word-based models. Using class-based language models as a complement to word-based models is therefore rewarding.
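
The effect described in the abstract, where a class-based model is weaker on its own yet lowers perplexity once interpolated with a word-based model, can be illustrated with a minimal Python sketch. This is not the thesis's SRILM setup: the function names, the interpolation weight lam and the per-token probabilities below are all hypothetical values chosen for illustration, and the sketch assumes simple linear interpolation of the two models' probabilities.

```python
import math


def interpolate(p_word, p_class, lam):
    """Linearly mix two model probabilities for the same token.

    lam is the weight on the word-based model (hypothetical parameter).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_word + (1.0 - lam) * p_class


def perplexity(probs):
    """Perplexity of a token sequence: exp of the mean negative log-probability."""
    if not probs:
        raise ValueError("probs must not be empty")
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


# Made-up per-token probabilities for a four-token test sequence.
# The word-based model is confident on seen contexts but assigns a very
# small probability to the third token (a sparse-data case).
p_word = [0.30, 0.20, 0.001, 0.25]
# The class-based model is flatter: never very confident, never very wrong.
p_class = [0.03, 0.03, 0.03, 0.03]

lam = 0.5  # assumed weight; in practice it would be tuned on held-out data
p_mix = [interpolate(w, c, lam) for w, c in zip(p_word, p_class)]

for name, probs in [("word", p_word), ("class", p_class), ("mixed", p_mix)]:
    print(f"{name:>5} perplexity: {perplexity(probs):.2f}")
```

On this toy data the class-based model has the highest perplexity and the mixture the lowest, because the flat class-based estimate rescues the one token the word-based model handles badly. The numbers say nothing about the magnitudes reported in the thesis.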

Keywords

Amharic Language Modeling; Amharic Class-Based Language Modeling; Amharic Text Corpus.

URI

http://etd.aau.edu.et/handle/123456789/2781

Collections

Computer Science
