Concatenative Text-To-Speech System for Afaan Oromo Language

Taddese Samson

dc.contributor.advisor	G/ Egziabher Mullugeta (PhD)
dc.contributor.author	Taddese Samson
dc.date.accessioned	2018-11-29T14:19:03Z
dc.date.accessioned	2023-11-29T04:57:00Z
dc.date.available	2018-11-29T14:19:03Z
dc.date.available	2023-11-29T04:57:00Z
dc.date.issued	2011-05
dc.description.abstract	This paper explores the possibility of developing a concatenative TTS system for the Afaan Oromo language, with diphones and triphones as the speech units of focus. Most modern TTS systems use the concatenative method to produce synthesized speech, but selecting an appropriate unit for building the speech database is a challenging task. In the proposed approach, databases are built from speech units of different sizes, namely diphones and triphones, and are used to produce speech utterances. Because diphones and triphones are small speech units, they make an unlimited vocabulary possible. A diphone database of 800 entries and a triphone database of 1,982 entries were constructed. The synthesizer was then evaluated for performance, naturalness, and intelligibility by six individuals from the language domain. The experimental results show that 75% and 54% of the words in the data set are correctly pronounced with the diphone and triphone speech units, respectively. The mean opinion score (MOS) for intelligibility was 3.03 for diphones and 2.2 for triphones, while the MOS for naturalness was 2.65 and 2.02, respectively. The main reason for the lower triphone results is the removal of many triphone units, both those that would increase the time complexity of the system and those that do not represent the language. Indeed, when some of the removed entries were added back to the database, the triphone scores rose from 2.2 to 2.23 and then to 2.27 for intelligibility, and from 2.02 to 2.05 and then to 2.08 for naturalness.
The results are promising, and future research directions are proposed accordingly to improve the performance of the system. Key words: Speech Synthesis, Concatenative methods, Festival, Afaan Oromo	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/14708
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Speech Synthesis	en_US
dc.subject	Concatenative methods	en_US
dc.subject	Festival	en_US
dc.subject	Afaan Oromo	en_US
dc.title	Concatenative Text-To-Speech System for Afaan Oromo Language	en_US
dc.type	Thesis	en_US
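
The abstract describes synthesis from diphone and triphone units and attributes the weaker triphone results to entries missing from the database. As a rough illustration only (not the thesis's implementation), the sketch below splits a phone sequence into overlapping two-phone and three-phone units and measures how many of them a given inventory covers. The silence label `pau`, the function names, the sample phone sequence, and the idea of using plain adjacent-phone windows are all assumptions made for this example.

```python
from typing import List, Set, Tuple

SILENCE = "pau"  # assumed label for utterance-boundary silence

Unit = Tuple[str, ...]


def to_units(phones: List[str], n: int) -> List[Unit]:
    """Return overlapping n-phone windows over a silence-padded sequence."""
    padded = [SILENCE] + list(phones) + [SILENCE]
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


def diphones(phones: List[str]) -> List[Unit]:
    """Adjacent phone pairs, e.g. ('n', 'a')."""
    return to_units(phones, 2)


def triphones(phones: List[str]) -> List[Unit]:
    """Adjacent phone triples, e.g. ('n', 'a', 'm')."""
    return to_units(phones, 3)


def coverage(units: List[Unit], inventory: Set[Unit]) -> float:
    """Fraction of the required units that exist in the unit inventory."""
    if not units:
        return 1.0
    return sum(u in inventory for u in units) / len(units)


# Hypothetical phone sequence used purely for illustration.
sample = ["n", "a", "m", "a"]
sample_diphones = diphones(sample)
sample_triphones = triphones(sample)

# A toy inventory that lacks one of the required triphones.
toy_inventory = set(sample_triphones) - {("a", "m", "a")}
triphone_coverage = coverage(sample_triphones, toy_inventory)
```

Under these assumptions, an utterance can only be synthesized cleanly when its coverage is 1.0, which is one way to read the abstract's observation that restoring removed triphone entries raised the scores.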

Files

Original bundle

Name: Samson Taddese.pdf
Size: 809.51 KB
Format: Adobe Portable Document Format

Collections

Health Informatics