
Title: Speaker Independent, Continuous Speech Recognizer For Kafi Noonoo
Contributors: Dr. Solomon Teferra
Dr. Binyam Sisay
Zelalem, Asfaw
Keywords: recognizer for Kafi Noonoo
Issue Date: Mar-2013
Publisher: Addis Ababa University
Abstract: In this research we investigated the possibility of developing a speaker-independent, continuous speech recognizer for Kafi Noonoo using hidden Markov modeling. The portable, open-source Hidden Markov Model Toolkit (HTK) was used to perform the experiments. Developing a Hidden Markov Model (HMM) based Automatic Speech Recognition (ASR) system requires both text and speech corpora for training and testing the HMMs. To obtain a model that incorporates different features of the language, we included the different dialects of Kafi Noonoo in the corpus. We prepared the training and test corpora from scratch and, after preprocessing, sampled the speech and extracted features using Mel-frequency cepstral coefficients (MFCC). Using the text corpus and the extracted feature vectors, we developed speaker-independent and speaker-dependent recognizers with two kinds of recognition units: context-independent monophones and context-dependent triphones. We analyzed the performance of the recognizers in terms of word correct rate, memory requirement, and speed. The models were tested with two groups of speakers: one group that took part in both training and testing, and another that took part only in testing. We achieved word recognition accuracy of 60.88% and 46.09% for the context-dependent triphone-based and context-independent monophone-based speaker-independent models, respectively, and 75.96% and 62.71% for the context-dependent triphone-based and context-independent monophone-based speaker-dependent models, respectively. The two systems are similar with regard to speed and memory requirement.
Among the factors that reduce recognition performance, we found that pauses longer than 0.5 microseconds at the beginning and end of an utterance have their own negative impact on recognition and should be carefully controlled. Finally, based on the results of the research, recommendations are drawn and forwarded for future research in the area.
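The abstract reports word-level scores but does not state how they were computed. As a minimal sketch, assuming the usual definitions in which percent correct is H/N and accuracy is (H - I)/N (H = correctly recognized words, I = insertions, N = words in the reference transcription), the following Python aligns a reference and a recognized word sequence with a uniform-cost edit distance. The function names `align_counts` and `word_scores` and the placeholder tokens are hypothetical, and the toolkit's own scoring tool may align sequences with different costs.

```python
from typing import List, Tuple


def align_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int, int]:
    """Return (hits, substitutions, deletions, insertions) for one
    minimum-edit alignment of ref against hyp (uniform edit costs, an assumption)."""
    n, m = len(ref), len(hyp)
    # table[i][j] = (edits, hits, subs, dels, ins) for ref[:i] versus hyp[:j]
    table = [[(0, 0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        table[i][0] = (i, 0, 0, i, 0)  # every reference word deleted
    for j in range(1, m + 1):
        table[0][j] = (j, 0, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, h, s, d, ins = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, h + 1, s, d, ins)      # match
            else:
                diag = (e + 1, h, s + 1, d, ins)  # substitution
            e, h, s, d, ins = table[i - 1][j]
            dele = (e + 1, h, s, d + 1, ins)      # deletion
            e, h, s, d, ins = table[i][j - 1]
            inse = (e + 1, h, s, d, ins + 1)      # insertion
            # Keep the candidate with the fewest edits.
            table[i][j] = min(diag, dele, inse, key=lambda t: t[0])
    _, h, s, d, ins = table[n][m]
    return h, s, d, ins


def word_scores(ref: List[str], hyp: List[str]) -> Tuple[float, float]:
    """Return (percent correct, percent accuracy) for one utterance."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    h, _, _, ins = align_counts(ref, hyp)
    n = len(ref)
    return 100.0 * h / n, 100.0 * (h - ins) / n


if __name__ == "__main__":
    # Placeholder tokens, not real Kafi Noonoo words.
    reference = "w1 w2 w3 w4".split()
    recognized = "w1 wX w3 w4 w5".split()
    print(word_scores(reference, recognized))
```

Accuracy penalizes inserted words while percent correct does not, so the two figures can differ noticeably for a recognizer that tends to insert extra words around long pauses.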
Appears in Collections: Thesis-Computational Science

Files in This Item:
Zelalem Asfaw.pdf (856.93 kB, Adobe PDF)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.