Signal-based Ethiopian Languages Identification using Gaussian Mixture Model
Date
2017-02
Publisher
Addis Ababa University
Abstract
Language identification (LID) is the task of identifying the language spoken in a test utterance. The core problem in LID is to reduce the complexity of human language enough that an automatic algorithm can identify the language from a relatively brief audio sample. A review of existing LID approaches shows that very few attempts have been made to build language identification systems for African languages. The importance of LID for African languages is increasing rapidly with the development of telecommunication infrastructure and the resulting growth in the volume of data and speech traffic on public networks. By automatically processing the raw speech data, calls from people in distress can be referred to a person knowledgeable in the caller's language, which speeds up the vital assistance they receive.
An LID system for four Ethiopian languages, namely Amharic, Guragegna, Oromiffa and Tigregna, is developed using Gaussian mixture models (GMMs). The system is intended to identify which of these four languages is being spoken from audio utterances of some phrases of some duration. A dataset consisting of recordings of 7 different speakers for each language was prepared. After the recordings were preprocessed into mono-channel audio, features were extracted as Mel-frequency cepstral coefficients (MFCCs) and classification was done using GMMs.
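The decision rule that a GMM-based classifier of this kind typically uses can be sketched as follows: one GMM per language, with a test utterance assigned to the language whose model gives the highest total log-likelihood over its feature frames. This is a minimal sketch, not the thesis implementation. It assumes diagonal covariances, and the function names, the two-dimensional toy models and the toy frames are all hypothetical stand-ins for trained models and real MFCC vectors.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM (log-sum-exp for stability)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify_language(frames, models):
    """Return the best-scoring language and the per-language total scores."""
    scores = {
        lang: sum(gmm_frame_loglik(x, gmm) for x in frames)
        for lang, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models, NOT trained on real speech.
# Each GMM is a list of (weight, mean, variance) components.
toy_models = {
    "Amharic": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "Oromiffa": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
}

# Hypothetical "feature frames" standing in for MFCC vectors of one utterance.
toy_frames = [[0.2, -0.1], [0.9, 1.1], [0.5, 0.4]]

best, scores = identify_language(toy_frames, toy_models)
print(best, scores)
```

In a real system the component weights, means and variances would be estimated from training MFCCs, usually with the expectation-maximization algorithm, and the frames would have the dimensionality of the chosen MFCC configuration.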
To test the performance of the LID system, experimental scenarios were designed and carried out taking two, three and four languages at a time. The system was tested in both utterance-dependent and utterance-independent settings: in the utterance-dependent setting the same speech is used for both training and testing, while in the utterance-independent setting the test speech differs from the training utterances. Achieving good LID performance is more challenging in the utterance-independent setting with such a small recorded database. In addition, the system was tested in a speaker-independent setting. For the four-language task, utterance-dependent LID was about 93% accurate and utterance-independent LID was about 70% accurate on average. Speaker-independent LID performance for the four-language task was about 91%.
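A speaker-independent evaluation of the kind described above can be sketched as follows. The abstract does not state how speakers were partitioned or how many utterances each speaker recorded, so the hold-out of one speaker per language, the two utterances per speaker, and all identifiers below are assumptions made only for illustration. Accuracy is taken as the fraction of test utterances whose predicted language matches the true one.

```python
def split_by_speaker(utterances, test_speakers):
    """Split so that no test speaker contributes any training utterance."""
    train = [u for u in utterances if u["speaker"] not in test_speakers]
    test = [u for u in utterances if u["speaker"] in test_speakers]
    return train, test


def accuracy(true_labels, predicted_labels):
    """Fraction of utterances whose predicted language is correct."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


languages = ["Amharic", "Guragegna", "Oromiffa", "Tigregna"]

# 7 speakers per language, as in the abstract; 2 utterances per speaker
# is a hypothetical number chosen only to keep the example small.
utterances = [
    {"language": lang, "speaker": f"{lang}_s{s}", "utt": u}
    for lang in languages
    for s in range(1, 8)
    for u in range(2)
]

# Assumption: hold out the last speaker of each language for testing.
held_out = {f"{lang}_s7" for lang in languages}
train, test = split_by_speaker(utterances, held_out)

# Hypothetical predictions for four test utterances, one of them wrong.
example_accuracy = accuracy(
    ["Amharic", "Guragegna", "Oromiffa", "Tigregna"],
    ["Amharic", "Guragegna", "Oromiffa", "Amharic"],
)
print(len(train), len(test), example_accuracy)
```

An utterance-independent split would follow the same pattern, partitioning on the spoken phrase rather than on the speaker.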
Keywords: Language identification, Languages, MFCC, GMM, Accuracy, Utterance