Text-to-Speech system for Afaan Oromoo

Date

2001-06

Publisher

Addis Ababa University

Abstract

Speech is a natural means of communication between humans. Human-machine communication, by contrast, has been limited to keying in instructions and receiving answers as text. This limitation is now being addressed by the development of dialogue systems, which enable human-computer communication via speech and so offer a more natural way of communicating. One component of a dialogue system is a text-to-speech (TTS) synthesizer, which reads text aloud. This component gives people with disabilities access to electronic documents. It can also be used for proofreading documents, language education, and so on. This study attempts to address the need to have textual information available in spoken form for the Afaan Oromoo language. To that end, a diphone-based text-to-speech system for sample Afaan Oromoo words is presented. The prototype consists of two main parts: • an automatic phonetic transcription of the input word, and • a speech synthesis module that synthesizes an utterance by concatenating the sound equivalents of the phonetic transcription. To test the algorithm, a sample of words was selected, based on the result of a previous study that identified the most frequent words in some Afaan Oromoo texts. Diphones, speech units that cover two sounds and the transition between them, form the basis of the synthesis module. In transcribing the orthography (the writing system) into phonetic units, the Afaan Oromoo writing system was found to be well governed by rules, which enabled the transcription to be accurate. By contrast, success in recognizing the utterance of the transcribed phonetic units was limited to 43.33% for naive listeners and to 83.33% for listeners who had heard the utterance at least three times on different days. Error rates of 37% can be found in some English text-to-speech systems.
The result of this work is therefore encouraging, and higher rates of intelligibility appear reachable with some improvements. Incorporating spectral smoothing techniques that smooth the transition points between diphones could bring such improvements. Moreover, it is felt that a sound laboratory for recording the corpus data is essential for this kind of work.
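
The diphone concatenation described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names, the "_" silence marker, the "a-b" naming of diphones, and the dictionary-based inventory are all assumptions made for illustration, and the phonetic transcription step is taken as already done.

```python
def to_diphones(phones, boundary="_"):
    """List the diphone names covering a phone sequence.

    The sequence is padded with a silence marker on both sides
    (an assumed convention), so n phones yield n + 1 diphones.
    """
    padded = [boundary, *phones, boundary]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def synthesize(phones, inventory):
    """Concatenate the recorded samples of each diphone, in order.

    `inventory` is a hypothetical mapping from diphone name to a
    list of audio samples; a missing diphone raises KeyError.
    """
    samples = []
    for name in to_diphones(phones):
        samples.extend(inventory[name])
    return samples
```

For a hypothetical transcription such as ["b", "u", "n", "a"], to_diphones returns the five units "_-b", "b-u", "u-n", "n-a" and "a-_". Joining the units end to end, as done here, applies no smoothing at the transition points, which is the kind of improvement the abstract suggests.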

Keywords

Information Science
