Geez To Amharic Automatic Machine Translation: A Statistical Approach
Date
2015-05-05
Publisher
Addis Ababa University
Abstract
Machine Translation (MT) is the task of automatically translating text from one natural language to another. MT is essential for many applications, including multilingual information retrieval and speech-to-speech translation. The theme of this thesis is Geez-to-Amharic MT based on a statistical approach, which addresses the problem of automatically translating Geez text into Amharic text. Geez is a classical South Semitic language attested since the early 4th century in many inscriptions and texts, including historical, medical, and religious ones. Today Geez remains in use only as a spoken language and as the liturgical language of the Ethiopian Orthodox Tewahedo Church. Amharic, in contrast, is among the most widely spoken languages in Ethiopia and is the official working language of the Federal Government of Ethiopia, where it has about 30 million native and non-native speakers. Machine translation of Geez documents into Amharic will be of paramount importance in enabling Amharic users to easily access the invaluable indigenous knowledge encoded in the Geez language.
Therefore, this thesis investigates the application of a corpus-based machine translation approach to translating Geez documents into Amharic. The method employed in the experiments is Statistical Machine Translation (SMT), which requires a large volume of parallel documents in Geez and Amharic. The experiments were conducted using Moses (a statistical machine translation toolkit), the GIZA++ word alignment toolkit, and the IRSTLM language modeling toolkit on 12,840 parallel bilingual sentences. An average translation accuracy of 8.26 BLEU was achieved under 10-fold cross-validation. With a sufficiently large parallel Geez-Amharic corpus and language synthesizing tools, it is possible to develop a better translation system for this language pair.
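The 10-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's own code: the function names, the shuffling seed, and the `evaluate` callback (a hypothetical stand-in for training an SMT system on the training pairs and returning a BLEU score on the held-out pairs) are all assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_splits(n_items, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs over range(n_items)."""
    indices = list(range(n_items))
    # Shuffle once with a fixed seed (an assumption) so the folds are reproducible.
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def cross_validate(n_items, evaluate, k=10, seed=0):
    """Average a per-fold score.

    `evaluate(train, test)` is a hypothetical callback: in an SMT setting it
    would train on the sentence pairs indexed by `train` and return a score
    (for example BLEU) measured on the pairs indexed by `test`.
    """
    scores = [evaluate(train, test) for train, test in k_fold_splits(n_items, k, seed)]
    return mean(scores)


# Fold sizes for a corpus of 12,840 sentence pairs.
splits = k_fold_splits(12840, k=10)
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # prints: 10 11556 1284
```

Under this scheme each sentence pair is used for testing exactly once, and the reported figure is the mean of the ten per-fold scores.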
Keywords
Geez To Amharic; A Statistical Approach