Amharic Connected Word Speech Recognition System for Mobile Phones
Date
2017-10-02
Publisher
Addis Ababa University
Abstract
This study investigated ways of integrating an Amharic automatic speech recognition (ASR) system on mobile phones. It demonstrated that a connected-word Amharic ASR system can be used to command and control mobile phones, devices that have become not only a valuable means of communication but also a common computing device that we carry everywhere.
To this end, CMUSphinx's Pocketsphinx is used to build a connected-word, offline, small-vocabulary, speaker-dependent Amharic ASR system. Because CMUSphinx does not support whole-word models but does support word-dependent phone models, two distinct acoustic models are built in this work: a word-dependent phone (WDP) model and a word-dependent CV syllable (WDCVS) model. A total of 36 words are used in both models to recognize the Amharic numbers from 0 to 100 and a limited set of command phrases, spoken in a connected manner. A prototype Android application is also developed and used to integrate the acoustic models with an Android phone. To model the word sequences accepted by the prototype application, three different context-free grammar files are created and used.
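The idea of a word-dependent unit can be illustrated with a minimal sketch. It assumes that each sub-word unit (a phone for WDP, a CV syllable for WDCVS) is simply tagged with the word it occurs in, so that the same sound in two different words gets two separate acoustic models. The romanized words, their unit breakdowns, the `unit_word` naming scheme, and the function names are all hypothetical and are not taken from the thesis. The output lines follow the usual "word unit unit ..." layout of a Pocketsphinx pronunciation dictionary.

```python
from typing import Dict, List


def make_word_dependent(lexicon: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Tag every sub-word unit with the word it belongs to.

    The units may be phones (WDP) or CV syllables (WDCVS); the tagging
    scheme "<unit>_<word>" is an illustrative assumption.
    """
    tagged = {}
    for word, units in lexicon.items():
        tagged[word] = [f"{unit}_{word}" for unit in units]
    return tagged


def to_dictionary_lines(tagged: Dict[str, List[str]]) -> List[str]:
    """Render entries as 'word unit1 unit2 ...' lines, sorted by word."""
    return [f"{word} {' '.join(units)}" for word, units in sorted(tagged.items())]


# Hypothetical romanized entries for two Amharic number words.
phone_lexicon = {
    "and": ["a", "n", "d"],               # "one", split into phones
    "hulet": ["h", "u", "l", "e", "t"],   # "two", split into phones
}
syllable_lexicon = {
    "hulet": ["hu", "le", "t"],           # same word, split into CV syllables
}

wdp_lines = to_dictionary_lines(make_word_dependent(phone_lexicon))
wdcvs_lines = to_dictionary_lines(make_word_dependent(syllable_lexicon))

for line in wdp_lines + wdcvs_lines:
    print(line)
```

With a vocabulary of only 36 words, tagging every unit by word keeps the total number of models small while letting each model capture the coarticulation specific to its word.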
The recognizer is implemented as an embedded speech recognition (ESR) system using Pocketsphinx, an open-source speech recognition toolkit optimized to run on embedded devices. It is built on the Hidden Markov Model, the most popular statistical model for speech recognition.
In this study, two categories of performance evaluation are carried out. Category I determines the effect that the three model types (continuous, semi-continuous, and phonetically tied mixture, or PTM) have on the two models (WDP and WDCVS). Category II determines which of the two models performs better on a mobile phone, using the prototype application. In the Category II evaluations, the best recognition accuracies obtained are 98.07% for the WDCVS model and 97.58% for the WDP model.
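The abstract does not state how recognition accuracy is computed. A common definition for connected-word tasks is word accuracy, (N - S - D - I) / N, where N is the number of reference words and S, D, and I are the substitutions, deletions, and insertions in the best alignment of the recognizer output to the reference. The sketch below assumes that definition; the function name and the sample word sequences are hypothetical.

```python
from typing import List


def word_accuracy(reference: List[str], hypothesis: List[str]) -> float:
    """Return word accuracy in percent, using word-level edit distance.

    Assumes accuracy = (N - S - D - I) / N, where S + D + I is the minimum
    number of edits that turn the reference into the hypothesis.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")

    # dist[i][j] = minimum edits between the first i reference words
    # and the first j hypothesis words.
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # i deletions
    for j in range(m + 1):
        dist[0][j] = j  # j insertions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )

    return 100.0 * (n - dist[n][m]) / n


# Hypothetical example: the recognizer drops one of four spoken words.
spoken = ["and", "hulet", "sost", "arat"]
recognized = ["and", "hulet", "arat"]
print(word_accuracy(spoken, recognized))
```

Because insertions count against the score, this measure can fall below zero when the recognizer outputs many spurious words, which is one reason it is often reported alongside the simpler percent-correct figure.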
Thus, the results indicate that deploying Amharic ASR systems to command, control, and perform other activities on mobile phones in the Amharic language is highly promising.
Keywords
Amharic speech recognition on mobile phones, Word-dependent phone model, Word-dependent CV syllable model