AAU Institutional Repository

Geez to Amharic Machine Translation

dc.contributor.advisor Libise, Mulugeta (PhD)
dc.contributor.author Abel, Biruk
dc.date.accessioned 2019-08-19T10:20:38Z
dc.date.available 2019-08-19T10:20:38Z
dc.date.issued 2018-05-05
dc.identifier.uri http://localhost:80/xmlui/handle/123456789/18800
dc.description.abstract Natural Language Processing (NLP) is the set of methods by which computers analyze, understand, and derive meaning from human language. Machine Translation (MT), one application of NLP, is the use of computers to translate from one natural language, such as Geez, to another, such as Amharic. Natural languages may follow different word orders in sentence formation: Geez follows Subject + Verb + Object (SVO) and Verb + Subject + Object (VSO), while Amharic follows only Subject + Object + Verb (SOV), so aligning each Geez word with the correct Amharic word is of paramount importance for translation quality. The purpose of this study is to develop a hybrid Geez to Amharic Machine Translation system by serially coupling rule-based Geez word reordering with a standard Statistical Machine Translation (SMT) system. The proposed system has two main components: a Rule Based Geez Corpus Preprocessor and a Baseline SMT. The Rule Based Preprocessor takes the manually Part of Speech (POS) tagged Geez corpus and produces another corpus of reordered Geez sentences whose structure is similar to that of Amharic sentences. It reads all sentences from the input file and processes them one by one, determining each sentence's POS pattern and then applying the corresponding reordering rule. After every sentence has been processed, the output corpus is supplied, along with the Amharic corpus, as input to the Baseline SMT. The Decoder of the Baseline SMT then translates Geez sentences into Amharic sentences using the Amharic Language Model and the Translation Model.
The translation quality of the proposed system is evaluated using the BLEU metric and compared with that of the Baseline SMT. Two experiments were conducted: one to test the Baseline SMT and the other to test the proposed system. The Baseline SMT was tested with Geez and Amharic corpora without POS tags, while the proposed system was tested with a POS-tagged Geez corpus and an Amharic corpus without POS tags. In the test results, the Baseline SMT scored a BLEU of 72% and the proposed system scored 76%, an improvement of 4 percentage points attributed to the reordering rules applied to the Geez corpus. en_US
dc.language.iso en en_US
dc.publisher Addis Ababa University en_US
dc.subject Geez to Amharic Machine Translation en_US
dc.subject Hybrid Machine Translation en_US
dc.subject Rule Based Word Reordering en_US
dc.subject Statistical Machine Translation en_US
dc.subject Part of Speech Tagging en_US
dc.title Geez to Amharic Machine Translation en_US
dc.type Thesis en_US
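
The preprocessing step described in the abstract (determine each sentence's POS pattern, then apply the matching reordering rule) might be sketched roughly as follows. This is only an illustrative sketch: the tag names `N` and `V`, the two rules in `REORDER_RULES`, and the placeholder tokens are assumptions made for the example, not the thesis's actual tag set, rule set, or data.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# Hypothetical rule table (not taken from the thesis): maps a POS pattern
# to the order in which source positions should appear in the output.
REORDER_RULES: Dict[Tuple[str, ...], Tuple[int, ...]] = {
    ("N", "V", "N"): (0, 2, 1),  # SVO -> SOV
    ("V", "N", "N"): (1, 2, 0),  # VSO -> SOV
}


def reorder_sentence(tokens: List[Token]) -> List[str]:
    """Return the words of one tagged sentence in the reordered sequence."""
    pattern = tuple(tag for _, tag in tokens)
    words = [word for word, _ in tokens]
    order = REORDER_RULES.get(pattern)
    if order is None:
        # No rule matches this POS pattern: leave the sentence unchanged.
        return words
    return [words[i] for i in order]


def reorder_corpus(corpus: List[List[Token]]) -> List[str]:
    """Process every sentence in turn and emit one output line per sentence."""
    return [" ".join(reorder_sentence(sentence)) for sentence in corpus]


# Placeholder tokens stand in for real Geez words.
corpus = [
    [("subj", "N"), ("verb", "V"), ("obj", "N")],   # SVO input
    [("verb", "V"), ("subj", "N"), ("obj", "N")],   # VSO input
    [("verb", "V")],                                # no matching rule
]
reordered = reorder_corpus(corpus)
print(reordered)
```

In this sketch both the SVO and the VSO inputs come out in SOV order, and a sentence with no matching rule passes through unchanged. The reordered lines, paired with the Amharic corpus, would then form the parallel input to the SMT training step.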

