
Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/2412

Title: Synthetic Speech Trained - Large Vocabulary Amharic Speech Recognition System (SST-LVASR)
Authors: Mesfin, Birile Woldetsadik
Advisors: Dr. V.N.V Manoj
Mr. Molalgne Girmaw
Keywords: Trained - Large Vocabulary
Speech Recognition
Copyright: Jul-2008
Date Added: 3-May-2012
Publisher: AAU
Abstract: Amharic, the official language of Ethiopia, is characterized by a very large number of morphological word forms. This thesis investigates whether synthesized Amharic speech, generated by concatenating prerecorded morphemes, can be used to train a hidden Markov model (HMM) based automatic speech recognition (ASR) system for Amharic. Developing an HMM-based ASR system requires identifying all possible words and constructing text and speech corpora that contain multiple samples of the words to be recognized. These data are then used as training sets for model development, the final objective being an HMM for each recognition unit. Because Amharic words have so many morphological forms, collecting the words for the text corpus, and recording and labeling the same words for the speech corpus, is extremely difficult. This thesis demonstrates that an automatic morphological expander reduces the effort of developing the text corpus to a manageable level. Additionally, using machine-generated speech to train the HMMs significantly reduces the effort of developing the speech corpus. Together, these reductions greatly lower the most prominent obstacles to developing a general-purpose Amharic speech recognizer. The system achieved 62.37% word accuracy on naturally recorded speech, which shows that a recognizer trained on synthetic speech correctly identifies at least 62% of the words. This suggests that some level of recognition is possible with synthetic training speech and gives impetus for further research into ways of increasing this accuracy.
URI: http://hdl.handle.net/123456789/2412
Appears in: Thesis - Communication Engineering

Files in This Item:

File Description Size Format
144078363207925896123331607846476302578840.06 kB Adobe PDF

Items in the AAUL Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.


  Last updated: May 2010. Copyright © Addis Ababa University Libraries