Amharic-to-Tigrigna Machine Translation Using Hybrid Approach
Date
2017-10-07
Publisher
Addis Ababa University
Abstract
Machine translation is an application of natural language processing that studies the use of computer software to translate text or speech from one natural language into another. Human translation tends to be slower than machine translation. It can be hard to obtain a precise translation that conveys what a text is about without translating everything word by word, and it is often important to get the result without delay, which is difficult with a human translator. Human translation also incurs additional time and cost. This research therefore develops an Amharic-to-Tigrigna machine translation system using a hybrid approach, i.e. a combination of rule-based and statistical approaches, to address these problems. Although Amharic and Tigrigna belong to the same language family and use similar sentence structure, they differ in how various types of phrases are constructed. The study therefore proposes a syntactic reordering approach that rearranges the words of the source sentence so that their order is more similar to that of the target sentence. Reordering rules are developed for both simple and complex Amharic sentences whose word order differs from that of the target language. To the best of the researcher's knowledge, there is no prior work on machine translation between Amharic and Tigrigna, a gap this study aims to fill. To achieve the objective of the study, a corpus is collected from different domains, prepared in a format suitable for the development process, and divided into a training set and a test set. Reordering rules are applied to both the training and test sets in a pre-processing step. A single language model is developed, since the system is unidirectional, i.e. Amharic-to-Tigrigna. A translation model, which assigns a probability that a given source-language sentence generates a target-language sentence, is built, and a decoder, which searches for the translation with the highest probability, is used. Two major experiments are conducted using two different approaches, and their results are recorded. The first experiment uses a statistical approach and obtains a BLEU score of 7.02%. The second experiment uses the hybrid approach and obtains a BLEU score of 17.47%. From these results, it can be concluded that the hybrid approach is better than the statistical approach for Amharic-to-Tigrigna machine translation.
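As an illustration of the pre-processing step described in the abstract, the following is a minimal sketch of rule-based syntactic reordering over part-of-speech-tagged tokens. The abstract does not list the actual reordering rules, so the rule table, the tag names, the placeholder words, and the function name `reorder` are all hypothetical and make no claim about Amharic or Tigrigna grammar.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)
Rule = Tuple[Tuple[str, ...], Tuple[int, ...]]  # (tag pattern, new order of positions)

# Hypothetical rule table: when the tag pattern on the left is found,
# the matched tokens are emitted in the order given on the right.
# These are placeholders, not the rules developed in the thesis.
RULES: List[Rule] = [
    (("ADJ", "NOUN"), (1, 0)),
]


def reorder(tokens: Sequence[Token], rules: Sequence[Rule] = RULES) -> List[Token]:
    """Scan left to right and permute the first matching tag pattern at each position."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        for pattern, order in rules:
            window = tokens[i:i + len(pattern)]
            if tuple(tag for _, tag in window) == pattern:
                out.extend(window[j] for j in order)
                i += len(pattern)
                break
        else:
            # No rule matched at this position: keep the token where it is.
            out.append(tokens[i])
            i += 1
    return out


# Placeholder source sentence (w1..w3 stand in for real words).
sentence = [("w1", "ADJ"), ("w2", "NOUN"), ("w3", "VERB")]
reordered = reorder(sentence)
```

In a pipeline of the kind the abstract outlines, a function like this would be applied to the source side of both the training and test sets before they are passed to the statistical system, so that the translation model is trained and evaluated on the same reordered word order.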
Keywords
Machine Translation, Statistical Machine Translation, Hybrid Machine Translation, Reordering Rule