Prosody Based Authomatic Speech Segmentation for Amharic

Mekonen, Rahel

dc.contributor.advisor	Teferra, Solomon (PhD)
dc.contributor.author	Mekonen, Rahel
dc.date.accessioned	2019-05-08T13:10:45Z
dc.date.accessioned	2023-11-18T12:47:26Z
dc.date.available	2019-05-08T13:10:45Z
dc.date.available	2023-11-18T12:47:26Z
dc.date.issued	2019-02-05
dc.description.abstract	Many speech processing systems require segmentation of the speech waveform into principal acoustic units. Speech segmentation is the process of identifying the boundaries between paragraphs, sentences, words, syllables, and phonemes in spoken natural language. It is a primary step in speech technology. Automatic speech segmentation is the process of segmenting any of the discrete units that occur in a continuous speech signal, using algorithms developed for this purpose. Speech segmentation is a challenging task because the cues available for segmenting text are absent in continuous speech. The main goal of this work is to develop a sentence-level automatic speech segmentation system for Amharic. Sentence segmentation is the process of identifying the end of a sentence. In this study, the sentence segmentation system is implemented using two approaches. In the first approach, we used an automatic tool for segmenting and labeling Amharic speech data. An acoustic model is created by taking speech recordings and their text transcripts and compiling them into a statistical representation of the sounds that make up words. This is done through HMM modeling. Segmentation in the first approach is performed by forced alignment, and we used rule-based and AdaBoost methods to discriminate true boundaries from false ones. In the second approach, we extracted prosodic features directly from the speech waveform and again used a statistical method, AdaBoost. The evaluation of the experiments shows that the monosyllable acoustic model gives more accurate forced alignment than the monophone and tied-state tri-syllable models. The AdaBoost classifier also showed consistently good results, especially with the decision tree classifier. In all experiments, read-aloud speech achieved higher accuracy than spontaneous speech. This indicates that spontaneous speech is more difficult to segment than read-aloud speech because it contains more noise and disfluencies.
The evaluation in phase two indicates that the pause feature is a basic discriminator for Amharic sentence boundaries, and that performance increases when prosodic features are introduced. The scope of this research is limited to sentence-level segmentation. Further research is needed on automatic speech segmentation of other discrete units.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/12345678/18214
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Sentence Segmentation	en_US
dc.subject	Acoustic Model	en_US
dc.subject	Prosody	en_US
dc.title	Prosody Based Authomatic Speech Segmentation for Amharic	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Rahel Mekonen 2019.pdf
Size: 1.77 MB
Format: Adobe Portable Document Format

Collections

Information Sciences