Speech to Text Conversion Using Amharic Characters
Date
2005-12
Publisher
Addis Ababa University
Abstract
Spoken language is the primary method of human-to-human communication.
Technologies such as telephony and radio have extended the reach of this
communication. These technological advances reflect a human preference
for spoken communication.
Spoken language is also a preferred method of human-machine interaction. A
spoken language system needs both speech recognition and speech
synthesis capabilities. This thesis, however, addresses only speech
recognition (speech to text, STT), specifically for the Amharic language.
Amharic has more than 200 characters, but the standard keyboard is
designed for the English alphabet. Because of this limited number of keys,
typing a single Amharic letter takes 2 to 4 keystrokes.
The practical goal of this thesis is to develop functional software with
speech-to-text capabilities for the Amharic language. This software by no means
covers all Ethiopic characters. The algorithms and models developed are
tested on a small subset of the Ethiopic characters, with the aim of keeping the
error rate as low as possible.
There are different approaches to speech recognition. The statistical
approach, however, seems to be the industry's current favorite, as it
delivers better performance. It is also easier to implement, so it is
used in the development of the software. This approach requires
acoustic models and language models to be built. The acoustic model refers to the
representation of knowledge about acoustics, phonetics, etc., whereas the language
model refers to the system's knowledge of what constitutes a possible word, which
words are likely to co-occur, and in what sequence.
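
To illustrate how a statistical recognizer can combine the two kinds of model, here is a minimal Python sketch. It is not the thesis's implementation. It assumes a bigram language model with add-one smoothing and picks, among candidate word sequences, the one with the highest sum of acoustic log-likelihood and language-model log-probability. The tokens (w1, w2, w3), the toy corpus, the acoustic scores, and all function names are hypothetical placeholders.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def lm_log_prob(words, unigrams, bigrams):
    """Log-probability of a word sequence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + words + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def decode(candidates, unigrams, bigrams):
    """Return the word sequence maximizing acoustic score plus language-model score."""
    best = max(candidates, key=lambda c: c[1] + lm_log_prob(c[0], unigrams, bigrams))
    return best[0]


# Hypothetical toy corpus of placeholder tokens.
corpus = [["w1", "w2"], ["w1", "w2"], ["w3", "w2"]]
unigrams, bigrams = train_bigram(corpus)

# Hypothetical acoustic log-likelihoods: both hypotheses sound equally likely,
# so the language model alone decides which word order is preferred.
candidates = [(["w1", "w2"], -10.0), (["w2", "w1"], -10.0)]
best = decode(candidates, unigrams, bigrams)
print(best)
```

In this toy setup the acoustic scores are tied, so the decision rests on the language model, which favors the word order seen in the corpus. In a real system the acoustic scores would come from trained acoustic models rather than fixed numbers.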
This thesis is an attempt to build STT conversion for the Amharic language using
the statistical approach. An inventory of speech files is therefore built by
recording, and the appropriate models are built from these data. The purpose is
to test performance based on the models built and to show that statistical
models are suited to modeling speech signals.
Keywords
communication, primary method