Semantic Relation Extraction for Amharic Text Using Deep Learning Approach
Date
2020-10-22
Authors
Publisher
Addis Ababa University
Abstract
Relation extraction is an important semantic processing task in the field of natural language
processing. The task can be defined as follows: given a sentence S with a pair of annotated
entities e1 and e2, identify the semantic relation between e1 and e2 from a set of predefined
relation types. Semantic relation extraction can support many applications such as text
mining, question answering, and information extraction. Some state-of-the-art systems for
other languages still rely on lexical resources such as WordNet and on natural language
processing tools such as dependency parsers and named entity recognizers to obtain
high-level features. Another challenge is that important information can appear at any
position in the sentence. To tackle these problems, we propose an Amharic semantic relation
extraction system that uses a deep learning approach. Among existing deep learning approaches,
we use a bidirectional long short-term memory (BiLSTM) network with an attention mechanism. It
enables multi-level automatic feature representation learning from data and captures the most
important semantic information in a sentence.
The proposed model contains several components. The first is a word embedding layer that maps
each word into a low-dimensional vector; it is a feature learning technique for obtaining new
features across domains for relation extraction in Amharic text. The second is a BiLSTM layer
that extracts high-level features from the embedding layer by exploiting information from both
the past and the future directions, since a single direction may not capture all the
information in context. The third is an attention mechanism that produces a weight vector and
merges the word-level features from each time step into a sentence-level feature vector by
weighting them with this vector.
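The attention step described above can be sketched in a few lines. This is a minimal, framework-free illustration (the function name, the fixed weight vector `w`, and the toy dimensions are assumptions for clarity; in the actual model `w` is a learned parameter and the hidden states come from the BiLSTM):

```python
import math

def attention_pool(hidden_states, w):
    """Merge per-time-step features into one sentence-level vector.

    hidden_states: list of T vectors (each a list of d floats),
                   standing in for BiLSTM outputs at each time step.
    w: attention parameter vector of length d (learned in practice,
       fixed here only for illustration).
    """
    # Score each time step: s_t = w . h_t
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden_states]
    # Softmax over time steps gives the attention weights alpha_t
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    alphas = [e / z for e in exps]
    # Sentence vector: attention-weighted sum of the hidden states
    d = len(hidden_states[0])
    sentence = [sum(alphas[t] * hidden_states[t][i]
                    for t in range(len(hidden_states)))
                for i in range(d)]
    return alphas, sentence
```

The weights `alphas` always sum to one, so time steps the model scores highly dominate the resulting sentence-level feature vector, which is then passed to the classifier.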
To evaluate our model, we conduct experiments on Amharic-RE-Dataset, which was prepared
from Amharic text for this thesis. The commonly used evaluation metrics precision, recall,
and F1-score are used to measure the effectiveness of the proposed system. The proposed
attention-based bidirectional long short-term memory model yields an F1-score of 87.06%. It
achieves this result with only word embeddings as input features, without using lexical
resources or NLP tools.
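The F1-score reported above is the harmonic mean of precision and recall. A minimal sketch of how the three metrics are computed (the set-of-tuples representation of gold and predicted relations is an assumption for illustration, not the thesis's exact evaluation code):

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 for relation extraction.

    gold, predicted: sets of (instance_id, relation_label) tuples,
    so a prediction counts as correct only if both the instance and
    the relation type match.
    """
    tp = len(gold & predicted)                       # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For example, with four gold relations of which two are predicted correctly plus one spurious prediction, precision is 2/3, recall is 1/2, and F1 is 4/7.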
Keywords
Amharic Text Semantic Relation Extraction, Deep Learning, Word Embedding, Attention-Based Bidirectional Long Short-Term Memory