Speech to Text Conversion Using Amharic Characters

Tsegaye, Nebiyou

Files

Nebiyou tsegaye.pdf (764.99 KB)

Date

2005-12

Authors

Tsegaye, Nebiyou

Publisher

Addis Ababa University

Abstract

Spoken language is the primary means of human-to-human communication, and technologies such as telephony and radio have extended its reach. These advances reflect the fact that speech is the mode of communication people prefer, and it is also a preferred mode of human-machine interaction. A complete spoken language system needs both speech recognition and speech synthesis capabilities; this thesis addresses only speech recognition (speech to text, STT), specifically for the Amharic language. Amharic has more than 200 characters, but the standard keyboard is designed for the English alphabet, so typing a single Amharic letter takes two to four keystrokes. The practical goal of this thesis is to develop functional software with speech-to-text capabilities for Amharic. The software does not cover all Ethiopic characters: the algorithms and models developed are tested on a small subset of the Ethiopic characters, with the aim of keeping the error rate as low as possible. There are several approaches to speech recognition. The statistical approach appears to be the industry's current favorite because it delivers better performance, and it is also easier to implement, so it is the approach used to develop the software. It requires both acoustic models and language models to be built. The acoustic model represents knowledge about acoustics, phonetics, and so on, whereas the language model captures the system's knowledge of what constitutes a possible word, which words are likely to co-occur, and in what sequence. To build STT conversion for Amharic with this approach, an inventory of speech files is created by recording, and the appropriate models are built from these data.
The purpose is to test performance based on the models built and to demonstrate that statistical models are well suited to modeling speech signals.
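
As a rough illustration of how an acoustic model and a language model are combined in the statistical approach, the sketch below picks the candidate word W that maximizes log P(A | W) + log P(W) for an observed acoustic signal A. The candidate labels, the probability values, and the function name `decode` are hypothetical and chosen only for illustration; they are not taken from the thesis, which may combine the two models differently.

```python
import math

# Hypothetical toy scores for two candidate words (not from the thesis).
# Acoustic model: log P(A | W), how well each word explains the audio.
ACOUSTIC_LOGP = {"candidate_a": math.log(0.45), "candidate_b": math.log(0.55)}
# Language model: log P(W), how plausible each word is a priori.
LANGUAGE_LOGP = {"candidate_a": math.log(0.90), "candidate_b": math.log(0.10)}


def decode(acoustic_logp, language_logp):
    """Return the word W maximizing log P(A | W) + log P(W)."""
    return max(acoustic_logp, key=lambda w: acoustic_logp[w] + language_logp[w])


best = decode(ACOUSTIC_LOGP, LANGUAGE_LOGP)
print(best)
```

In this toy setup the acoustic scores alone slightly favor one candidate, while the language model prior shifts the final decision to the other, which is the kind of interaction the two models are meant to provide.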

Keywords

communication, primary method

URI

http://etd.aau.edu.et/handle/123456789/3565

Collections

Computer Engineering
