Semantic Role Labeling for Amharic Text Using Deep Learning
Date
2021-08-17
Authors
Publisher
Addis Ababa University
Abstract
Semantic Role Labeling (SRL), the task of automatically identifying the semantic role of each argument of each predicate in a sentence, is one of the essential problems in the research field of Natural Language Processing (NLP). SRL is a shallow semantic analysis task and an important intermediate step for many NLP applications, such as Question Answering, Machine Translation, Information Extraction, and Text Summarization. Feature-based approaches to SRL rely on parsing output, often use lexical resources, and require heavy feature engineering; errors in the parsing output can also propagate to the SRL output. Neural SRL systems, in contrast, can learn intermediate representations from raw text, bypassing manual feature extraction. Recent SRL studies using deep learning have shown improved performance over feature-based systems for English, Chinese, and other languages. Amharic exhibits typical Semitic behaviors that pose challenges to the SRL task, such as rich morphology and multiple possible subject-verb-object word orders. In this work, we approach the problem of SRL for the language using deep learning. The input is a raw sentence in which each word is represented by a concatenation of word-level, character-level, and fastText neural embeddings that capture the morphological, syntactic, and semantic information of the words in a sentence, so no intermediate feature extraction is required. We use a bi-directional Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) cells to capture bi-directional context (for argument identification) and long-range dependencies (for argument boundary identification), and a Conditional Random Field (CRF) with Viterbi decoding to implement the SRL system for the language. The system was trained on 8,000 instances and tested on 2,000 instances, achieving an accuracy of 94.96% and an F-score of 81.2%. We manually annotated the sentences with their corresponding semantic roles; future work can consider improving the quality of the data and experimenting with contextual embeddings as feature representations for improved performance.
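The architecture described in the abstract (concatenated word, character, and fastText embeddings feeding a bi-directional LSTM whose outputs are decoded by a CRF) can be sketched as below. This is a minimal illustration under assumed choices, not the thesis implementation: the framework (PyTorch), the layer sizes, and the class and parameter names are assumptions, and the CRF layer with Viterbi decoding is only indicated by a comment on the emission scores it would consume.

```python
# Minimal sketch of a BiLSTM encoder over concatenated word, character, and
# fastText representations, assuming PyTorch; sizes and names are illustrative.
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    def __init__(self, word_vocab, char_vocab, num_roles,
                 word_dim=100, char_dim=30, char_hidden=25,
                 fasttext_dim=300, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Character-level BiLSTM: one summary vector per word.
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 bidirectional=True, batch_first=True)
        token_dim = word_dim + 2 * char_hidden + fasttext_dim
        # Sentence-level BiLSTM over the concatenated token representations.
        self.lstm = nn.LSTM(token_dim, hidden,
                            bidirectional=True, batch_first=True)
        self.to_roles = nn.Linear(2 * hidden, num_roles)

    def forward(self, word_ids, char_ids, fasttext_vecs):
        # word_ids: (batch, seq); char_ids: (batch, seq, max_chars);
        # fasttext_vecs: (batch, seq, fasttext_dim), looked up from pretrained vectors.
        b, s, c = char_ids.shape
        chars = self.char_emb(char_ids.view(b * s, c))
        _, (h, _) = self.char_lstm(chars)                      # final states, both directions
        char_repr = torch.cat([h[0], h[1]], dim=-1).view(b, s, -1)
        tokens = torch.cat([self.word_emb(word_ids), char_repr, fasttext_vecs], dim=-1)
        out, _ = self.lstm(tokens)
        # Emission scores per token and role; a CRF layer with Viterbi decoding
        # would pick the highest-scoring role sequence from these.
        return self.to_roles(out)
```

Concatenating the three views gives each token a single vector combining orthographic (character), distributional (word), and subword (fastText) information, and the CRF over the BiLSTM emissions lets Viterbi decoding enforce consistent transitions between role labels.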
Description
Keywords
Semantic Role Labeler, Deep Learning, Neural Word Embedding, RNN, LSTM