Modeling Improved Amharic Syllabification Algorithm

Date

2011-06

Publisher

Addis Ababa University

Abstract

In this paper, a rule-based automatic syllabification algorithm for the Amharic language is designed using linguistic notions such as the Maximal Onset and Sonority Hierarchy principles. Amharic uses a syllabic writing system in which every grapheme represents a consonant-vowel (CV) combination. However, when Amharic text is read aloud, not all CV syllables are uttered as written, so the syllables actually spoken do not match the CV sequence seen in the grapheme sequence. Epenthesis and gemination are also major challenges in Amharic grapheme-to-phoneme conversion, because Amharic orthography does not mark epenthetic vowels or geminated consonants. This limits the performance of many speech systems (Amharic text-to-speech and speech recognition) and other natural language applications. After a thorough study of the syllable structure, identification of linguistic syllabification rules, and a survey of the relevant literature, a set of rules was identified and used to design the algorithms. The work builds on the prior success of rule-based methods applied to other languages, for instance Spanish, Dutch, Italian, Catalan, and Sinhala. The epenthesis algorithm is designed first, followed by the syllabification algorithm. Moreover, the benefit of syllables for assigning stress is pointed out. The system was implemented and tested on 1000 carefully selected Amharic words. It achieved a word accuracy rate of 98.1%, which shows that the rule-based syllabification approach performs very well and that a syllabifier for the language can be rule-driven. Although no comparison with a data-driven syllabification approach was performed for this language, the rule-based approach showed a high accuracy rate on the test set.

Key Words: syllabification, rule-based techniques, grapheme-to-phoneme, Maximal Onset Principle, Sonority Hierarchy principles, CV, text-to-speech, speech recognition.
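As a rough illustration of how the Maximal Onset and Sonority Hierarchy principles named in the abstract can drive a rule-based syllabifier, the following minimal sketch splits a romanized phoneme string at syllable boundaries. It is not the algorithm from the thesis: the vowel set, the `SONORITY` scale, the `max_onset` limit, the one-character-per-phoneme input, and the made-up test strings are all assumptions chosen for illustration, and epenthesis and gemination are not handled.

```python
# Toy rule-based syllabifier: Maximal Onset constrained by a sonority check.
# All names and values below are illustrative assumptions, not the thesis rules.

VOWELS = set("aeiou")  # assumed vowel inventory for a romanized transcription

# Hypothetical sonority scale (higher = more sonorous).
SONORITY = {
    **{c: 1 for c in "ptkbdgq"},  # stops
    **{c: 2 for c in "fszh"},     # fricatives
    **{c: 3 for c in "mn"},       # nasals
    **{c: 4 for c in "lr"},       # liquids
    **{c: 5 for c in "wy"},       # glides
}


def is_valid_onset(cluster, max_onset):
    """An onset is legal if it is short enough and sonority rises toward the nucleus."""
    if len(cluster) > max_onset:
        return False
    levels = [SONORITY.get(c, 1) for c in cluster]
    return all(a < b for a, b in zip(levels, levels[1:]))


def syllabify(phonemes, max_onset=1):
    """Split a phoneme string into syllables, one character per phoneme."""
    nuclei = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nuclei:
        return [phonemes]  # no vowel: return the input as a single unit

    boundaries = []
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = phonemes[prev + 1:nxt]  # consonants between two nuclei
        # Maximal Onset: give the following syllable the longest legal onset.
        for onset_len in range(len(cluster), -1, -1):
            onset = cluster[len(cluster) - onset_len:]
            if is_valid_onset(onset, max_onset):
                boundaries.append(nxt - onset_len)
                break

    starts = [0] + boundaries
    ends = boundaries + [len(phonemes)]
    return [phonemes[s:e] for s, e in zip(starts, ends)]


if __name__ == "__main__":
    for word in ("sabara", "kanfar", "katra"):
        print(word, "->", ".".join(syllabify(word)))
```

With the default `max_onset=1`, a medial cluster is split so that only its last consonant starts the next syllable; raising `max_onset` lets a rising-sonority cluster such as `tr` move into the onset as a whole. A word accuracy figure like the one reported above would then be obtained by comparing such output against hand-syllabified reference words.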

Keywords

Syllabification; Rule-Based Techniques; Grapheme-To-Phoneme; Maximal Onset Principle; Sonority Hierarchy Principles; CV; Text-To-Speech; Speech Recognition
