Attention-based Amharic-to-Wolaita Neural Machine Translation
Date
2020-10-05
Authors
Workineh Wogaso Gaga
Publisher
Addis Ababa University
Abstract
Natural language (NL) is one of the fundamental aspects of human behavior and the primary means by which people communicate around the world. Natural Language Processing (NLP) is a branch of artificial intelligence that deals with the interaction between computers and humans through NL. In NLP, every NL should be well understood by the machine. Machine Translation (MT) is the process by which computer software automatically translates a text from one NL (the source language) into another (the target language). Neural Machine Translation (NMT) is a recently proposed approach to MT that has achieved state-of-the-art (SOTA) translation quality over the last few years. Unlike traditional MT approaches, NMT builds a single neural network that can be jointly tuned to maximize translation performance. In this thesis we propose attention-based Amharic-to-Wolaita NMT. We built our system on the Encoder-Decoder architecture, also called the Sequence-to-Sequence (Seq2Seq) model, using a Recurrent Neural Network (RNN) with Gated Recurrent Units (GRUs). For comparison, we also developed a non-attention-based Amharic-to-Wolaita NMT system. The encoder in the basic (non-attention-based) Encoder-Decoder architecture encodes the complete information of the source (Amharic) sequence into a single real-valued vector, called the context vector, which is passed to the decoder to produce the output (Wolaita) sequence. Because the context vector summarizes the entire input sequence in a single vector, dependencies between words are only loosely captured as sentence length increases, which is a major drawback. The second problem of the basic Encoder-Decoder model is handling large vocabularies: each word in a sentence must be assigned a new identity number, so as the corpus grows, the number of word identifiers and the dimensionality of the required word vectors increase. These two issues are addressed using an attention mechanism. However, neither attention-based nor non-attention-based NMT had previously been developed for the Amharic-Wolaita language pair. Thus, we developed attention-based NMT for Amharic-to-Wolaita translation and compared it against a non-attention-based NMT system. Using the BLEU score for evaluation, we obtained 0.5960 BLEU for the non-attention-based system and 0.6258 BLEU for the attention-based system, an improvement of +0.0298 BLEU for the attention-based system over the non-attention-based one.
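As a concrete illustration of the Seq2Seq model described in the abstract, the following is a minimal sketch, written in PyTorch purely for exposition (the thesis does not state its toolkit), of a GRU encoder-decoder without attention: the encoder compresses the whole Amharic sentence into one context vector, and the decoder is initialised from that vector to emit the Wolaita sentence. All vocabulary sizes, dimensions, and tensors here are illustrative assumptions, not the thesis's actual configuration.

# Minimal non-attention Seq2Seq sketch (assumed dimensions, not the thesis's settings).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(src_vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src_ids):
        embedded = self.embedding(src_ids)            # (batch, src_len, emb_dim)
        outputs, hidden = self.gru(embedded)          # hidden: (1, batch, hid_dim)
        return outputs, hidden                        # hidden is the single context vector

class Decoder(nn.Module):
    def __init__(self, tgt_vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(tgt_vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab_size)

    def forward(self, tgt_ids, hidden):
        embedded = self.embedding(tgt_ids)            # (batch, tgt_len, emb_dim)
        outputs, hidden = self.gru(embedded, hidden)  # initialised with the context vector
        return self.out(outputs), hidden              # logits over the target vocabulary

# Toy forward pass with made-up vocabulary sizes and a dummy batch.
encoder = Encoder(src_vocab_size=8000)
decoder = Decoder(tgt_vocab_size=8000)
src = torch.randint(0, 8000, (2, 10))                 # 2 source sentences, 10 token ids each
tgt = torch.randint(0, 8000, (2, 12))                 # teacher-forced target inputs
_, context = encoder(src)
logits, _ = decoder(tgt, context)                      # (2, 12, 8000)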
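The attention mechanism relaxes the single-context-vector bottleneck by letting the decoder recompute a weighted combination of all encoder states at every output step. Below is a minimal sketch of an additive (Bahdanau-style) scoring function under the same assumed dimensions; the thesis does not specify its exact attention variant, so this particular formulation is an assumption made for illustration.

# Additive attention sketch: one re-weighted context vector per decoder step.
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, hid_dim=512):
        super().__init__()
        self.W_enc = nn.Linear(hid_dim, hid_dim)      # projects encoder states
        self.W_dec = nn.Linear(hid_dim, hid_dim)      # projects the current decoder state
        self.v = nn.Linear(hid_dim, 1)

    def forward(self, dec_hidden, enc_outputs):
        # dec_hidden: (batch, hid_dim); enc_outputs: (batch, src_len, hid_dim)
        scores = self.v(torch.tanh(
            self.W_enc(enc_outputs) + self.W_dec(dec_hidden).unsqueeze(1)
        ))                                            # (batch, src_len, 1)
        weights = torch.softmax(scores, dim=1)        # attention weights over source positions
        context = (weights * enc_outputs).sum(dim=1)  # (batch, hid_dim), per-step context
        return context, weights.squeeze(-1)

In a full attention-based decoder, this per-step context vector would typically be combined (for example, concatenated) with the embedded previous target token before the GRU update, so the decoder can focus on different Amharic words while generating each Wolaita word.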
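BLEU, the evaluation metric used to compare the two systems, measures n-gram overlap between system output and reference translations. The snippet below shows, purely as an illustration with placeholder tokens rather than sentences from the thesis corpus, how a corpus-level BLEU score on the 0-1 scale used in the abstract can be computed with NLTK.

# Corpus-level BLEU with placeholder sentences (not thesis data).
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

references = [[["she", "is", "going", "to", "school"]]]   # one reference list per hypothesis
hypotheses = [["she", "goes", "to", "school"]]

smooth = SmoothingFunction().method1                      # avoids zero higher-order n-gram counts
score = corpus_bleu(references, hypotheses, smoothing_function=smooth)
print(f"BLEU: {score:.4f}")                               # value on the 0-1 scale, as in the abstract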
Keywords
Natural Language Processing, Machine Translation, Neural Machine Translation, Recurrent Neural Network, Language Modelling, Attention-Based Encoder-Decoder Model