Text-to-Speech system for Afaan Oromoo
Date
2001-06
Authors
Publisher
Addis Ababa University
Abstract
Speech is a natural way for humans to communicate. In contrast, human-machine
communication has been limited to keying in instructions and receiving answers
in text form. This limitation is now being addressed by
the development of dialogue systems, which enable human-computer communication
via speech and thereby offer a more natural way of communicating.
One component of a dialogue system is a text-to-speech (TTS) synthesizer, which reads text
aloud. This component gives people with disabilities access to electronic
documents. It can also be used for proofreading documents, language education, etc.
In this study, an attempt is made to address the issue of making textual information available
in speech form for the Afaan Oromoo language. To this end, a diphone-based text-to-speech system
for sample Afaan Oromoo words is presented. The present prototype system consists of two
main parts:
• an automatic phonetic transcription of the input word, and
• a speech synthesis module, which synthesizes an utterance by concatenating
the sound equivalents of the phonetic transcription.
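The two parts listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the rule set or code used in the study: the digraph list, the treatment of a doubled vowel as one long vowel, the `_` silence marker, and the names `transcribe`, `to_diphones` and `synthesize` are all hypothetical.

```python
# Hypothetical digraph inventory, for illustration only (an assumption,
# not the transcription rules from the study).
DIGRAPHS = ("ch", "dh", "ny", "ph", "sh")
VOWELS = "aeiou"


def transcribe(word):
    """Split a word into phoneme symbols using greedy left-to-right rules."""
    word = word.lower()
    phonemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            # Two letters that stand for a single consonant.
            phonemes.append(pair)
            i += 2
        elif len(pair) == 2 and pair[0] == pair[1] and pair[0] in VOWELS:
            # Assumption: a doubled vowel is written out as one long vowel.
            phonemes.append(pair[0] + ":")
            i += 2
        else:
            phonemes.append(word[i])
            i += 1
    return phonemes


def to_diphones(phonemes, silence="_"):
    """Pair each phoneme with its successor, padding with silence at both ends."""
    padded = [silence] + list(phonemes) + [silence]
    return [a + "-" + b for a, b in zip(padded, padded[1:])]


def synthesize(diphones, inventory):
    """Concatenate the stored sample lists of the requested diphones."""
    samples = []
    for name in diphones:
        samples.extend(inventory[name])
    return samples
```

With these assumptions, `to_diphones(transcribe("afaan"))` yields `['_-a', 'a-f', 'f-a:', 'a:-n', 'n-_']`, and `synthesize` looks each name up in a dictionary that maps diphone names to recorded samples.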
To test the algorithm, a sample of words was selected. The selection was based on the result
of a previous study that identified the most frequent words in some Afaan Oromoo texts.
Diphones, speech units that cover two sounds and the transition between them, form the basis
of the synthesis module. In transcribing the orthography (the writing system) into phonetic
units, the Afaan Oromoo writing system was found to be well governed by rules. This enabled
the transcription to be accurate. In contrast, success in recognizing the utterance of the
transcribed phonetic units was limited to 43.33% for naive listeners and to 83.33% for
listeners who heard the utterance at least three times on different days.
Error rates of 37% can be found in some English text-to-speech systems. The result of this
work is therefore encouraging, and higher rates of intelligibility appear reachable with some
improvements. Incorporating spectral smoothing techniques that smooth the transition points
between diphones is one such improvement. Moreover, it is felt that a sound laboratory for
recording the corpus data is mandatory for this kind of work.
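To make the idea of smoothing diphone transition points concrete, the sketch below uses a plain linear crossfade in the time domain. This is a deliberately simpler stand-in, not the spectral smoothing the abstract refers to and not something taken from the study; the function name `crossfade_concat` and the `overlap` parameter are hypothetical.

```python
def crossfade_concat(units, overlap):
    """Concatenate sample lists, blending up to `overlap` samples at each join.

    Time-domain linear crossfade, used here as a simple stand-in for
    spectral smoothing (an assumption made for illustration).
    """
    out = list(units[0]) if units else []
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        for k in range(n):
            # Weight of the incoming unit rises across the overlap region.
            w = (k + 1) / (n + 1)
            idx = len(out) - n + k
            out[idx] = (1 - w) * out[idx] + w * unit[k]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out
```

With `overlap` set to 0 the function reduces to plain concatenation, so the effect of the blend at the joins can be compared directly against unsmoothed output.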
Keywords
Information Science