Geez To Amharic Automatic Machine Translation: A Statistical Approach

dc.contributor.advisor: Yifiru (PhD), Martha
dc.contributor.author: Mulugeta, Dawit
dc.date.accessioned: 2018-11-10T15:59:05Z
dc.date.accessioned: 2023-11-18T12:43:52Z
dc.date.available: 2018-11-10T15:59:05Z
dc.date.available: 2023-11-18T12:43:52Z
dc.date.issued: 2015-05-05
dc.description.abstract: Machine Translation (MT) is the task of automatically translating text from one natural language to another. MT is essential for many applications, including multilingual information retrieval, speech-to-speech translation, and others. The theme of this thesis is Geez-to-Amharic MT based on a statistical approach, which addresses the problem of automatically translating Geez text into Amharic text. Geez is a classical South Semitic language that has been attested since the early 4th century in many inscriptions, including historical, medical, religious, and other texts. Today Geez survives only as the spoken and liturgical language of the Ethiopian Orthodox Tewahedo Church. Amharic, by contrast, is among the most widely spoken languages in Ethiopia and is the official working language of the Federal Government of Ethiopia, with about 30 million native and non-native speakers. Machine translation of Geez documents into Amharic is of paramount importance because it enables Amharic users to easily access the invaluable indigenous knowledge encoded in the Geez language. The thesis therefore investigates the application of a corpus-based machine translation approach to translating Geez documents into Amharic. The method employed in the experiments is Statistical Machine Translation (SMT), an approach that requires a large volume of parallel documents prepared in Geez and Amharic. The experiments were conducted using Moses (a statistical machine translation tool), the GIZA++ word alignment toolkit, and the IRSTLM language modeling tools on 12,840 parallel bilingual sentences, and an average translation accuracy of 8.26 BLEU was achieved under 10-fold cross-validation. With a sufficiently large parallel Geez-Amharic corpus and a language synthesizing tool, it is possible to develop a better translation system for this language pair. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/12345678/14133
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Geez To Amharic; A Statistical Approach [en_US]
dc.title: Geez To Amharic Automatic Machine Translation: A Statistical Approach [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: Dawit Mulugeta.pdf
Size: 888.39 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text