Speaker Dependent Speech Recognition for Sidaama Language

Kemal, Abdella

Files

Abdella Kemal.pdf (817.55 KB)

Date

2010-07

Authors

Kemal, Abdella

Publisher

Addis Ababa University

Abstract

Speech recognition systems have found application in a wide range of areas as various speech recognition methodologies, techniques and tools have been developed and implemented. The main objective of this thesis is to explore the possibility of developing a prototype speaker-dependent speech recognizer for the Sidaama language using Hidden Markov Models (HMMs). To arrive at a working prototype model, an extensive study of the language was conducted to identify the language features needed to build the model. Additionally, the components and techniques used in HMM-based speech recognition design were studied and analyzed to identify those that depend on the characteristics of the language. The most commonly used speech recognition tools were also critically reviewed; as a result, the most widely used Java-based speech recognition tool, the Sphinx System, was used to build the acoustic models and to test recognition performance. This research attempted to build both a context-dependent triphone-based isolated word recognizer and a context-independent monophone-based isolated word recognizer for the Sidaama language. A total of 450 unique words were selected and recorded in consultation with a linguistic domain expert. Of these, 300 recorded words were used for training the acoustic models, and the remaining 150 words were used for testing the performance of the constructed acoustic models. In addition, 100 of the 300 training words were randomly selected and used for testing the constructed models. The context-dependent triphone-based model achieved 73% accuracy on the 100 words selected from the 300 training words and 68% accuracy on the 150 words that were not included in building the recognizer model.
Similarly, the context-independent monophone-based model achieved 69% accuracy on the 100 words selected from the 300 training words and 58% accuracy on the 150 words that were not used for building the acoustic model. As a result, the context-dependent triphone-based model is suggested as the more appropriate choice for building a speech recognizer for the Sidaama language. In conclusion, the results obtained were encouraging, and further optimization should be done in the future to improve recognition performance.
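
The evaluation protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function names, the placeholder word labels, and the use of a random shuffle for the 300/150 partition are assumptions made for illustration (the abstract states only that the 100-word subset was drawn at random from the training words). Accuracy is taken here to mean the percentage of isolated words recognized exactly.

```python
import random


def split_words(words, n_train=300, n_seen_test=100, seed=0):
    """Partition a word list into training, seen-test and unseen-test sets.

    Assumption: the training/held-out partition is made by random shuffle;
    the abstract does not say how the 300 training words were chosen.
    """
    rng = random.Random(seed)
    shuffled = list(words)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]                  # words used to build the models
    unseen_test = shuffled[n_train:]            # words never seen in training
    seen_test = rng.sample(train, n_seen_test)  # random subset of training words
    return train, seen_test, unseen_test


def accuracy(references, hypotheses):
    """Return the percentage of words whose recognized label matches exactly."""
    if not references or len(references) != len(hypotheses):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(ref == hyp for ref, hyp in zip(references, hypotheses))
    return 100.0 * correct / len(references)


# Placeholder labels standing in for the 450 recorded words (hypothetical).
words = [f"word{i:03d}" for i in range(450)]
train, seen_test, unseen_test = split_words(words)
```

Under this sketch, each model would be scored twice: once on `seen_test` and once on `unseen_test`, mirroring the two accuracy figures reported per model.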

Keywords

Automatic Speech Recognition, Sidaama Language, Sphinx System, HMM

URI

http://etd.aau.edu.et/handle/123456789/14453

Collections

Health Informatics
