Context Based Machine Translation With Recurrent Neural Network for English - Amharic Translation

Date

2020-02

Publisher

Addis Ababa University

Abstract

The quote from Rev. Jesse Jackson, "A text without a context is a pretext", summarizes the reasoning behind this thesis. Capturing context when translating between two human languages by machine is challenging. It is more challenging when the languages differ greatly in grammar and have only a small parallel corpus, as is the case for the English-Amharic pair. Current approaches to English-Amharic machine translation, such as statistical machine translation (SMT) and example based machine translation (EBMT), usually require a large parallel corpus in order to achieve fluency. The context awareness of the phrase based machine translation (PBMT) approaches used for the pair so far is also questionable. This research develops a system that translates English text to Amharic text using a combination of context based machine translation (CBMT) and recurrent neural network machine translation (RNNMT). We built a bilingual dictionary for the CBMT system to use along with a target corpus. The RNNMT model is then trained on the output of the CBMT system together with a parallel corpus. The proposed approach is evaluated using the New Testament of the Bible as a corpus. The results show that the combined approach yields an average improvement of 2.805 BLEU points over basic neural machine translation (NMT) on the English-Amharic language pair.

Keywords

Machine Translation, context based machine translation, English to Amharic translation, recurrent neural network machine translation, context based machine translation with neural network machine translation
