AAU Institutional Repository

Synthetic Speech Trained - Large Vocabulary Amharic Speech Recognition System (SST-LVASR)

dc.contributor.advisor Mamo, Mengesha(PhD)
dc.contributor.author Birile, Mesfin
dc.date.accessioned 2018-06-26T07:18:03Z
dc.date.available 2018-06-26T07:18:03Z
dc.date.issued 2008-07
dc.identifier.uri http://etd.aau.edu.et/handle/123456789/3510
dc.description.abstract Amharic, the official language of Ethiopia, is characterized by a very large number of morphological word forms. This thesis investigates whether synthesized Amharic speech, generated by concatenating prerecorded morphemes, can be used to train a hidden Markov model (HMM) based automatic speech recognition (ASR) system for Amharic. Developing an HMM-based ASR system requires identifying all possible words and constructing text and speech corpora that contain multiple samples of each word to be recognized. These data are then used as training sets, the final objective being the construction of an HMM for each recognition unit. Because Amharic words have so many morphological forms, collecting the words for the text corpus and then recording and labeling the same words for the speech corpus is extremely difficult. This thesis demonstrates that an automatic morphological expander reduces the effort of developing the text corpus to a manageable level. Additionally, a significant reduction in the effort of developing the speech corpus is achieved by using machine-generated speech to train the HMMs of the ASR system. Together, these reductions greatly lower the most prominent obstacles to developing a general-purpose Amharic speech recognizer. The 62.37% word accuracy obtained on naturally recorded speech indicates that, when synthetic speech is used for training, at least 62% of the words are correctly identified. This suggests that some level of recognition is possible with synthetic speech and gives impetus for further research into ways of increasing this accuracy. en_US
dc.language.iso en en_US
dc.publisher Addis Ababa University en_US
dc.subject Recognition System en_US
dc.subject Amharic Speech en_US
dc.title Synthetic Speech Trained - Large Vocabulary Amharic Speech Recognition System (SST-LVASR) en_US
dc.type Thesis en_US
