Amharic Question Classification System Using Deep Learning Approach

Habtamu, Saron

dc.contributor.advisor	Assabie, Yaregal (PhD)
dc.contributor.author	Habtamu, Saron
dc.date.accessioned	2021-08-03T11:50:52Z
dc.date.accessioned	2023-11-04T12:23:13Z
dc.date.available	2021-08-03T11:50:52Z
dc.date.available	2023-11-04T12:23:13Z
dc.date.issued	4/14/2021
dc.description.abstract	Questions are used in applications such as Question Answering (QA), Dialog Systems (DS), and Information Retrieval (IR). However, some questions are too complex to be analyzed and processed easily, so systems need good feature extraction and analysis mechanisms to understand them linguistically. Retrieval of wrong answers, inaccurate IR, and a search space crowded with irrelevant candidate answers are some of the challenges caused by the inability to process and analyze questions appropriately. Question Classification (QC) aims to solve this issue by extracting the relevant features from questions and assigning each question to the correct class. Although QC has been studied for various languages, it has hardly been studied for Amharic. This research studies Amharic QC, focusing on designing a hierarchical question taxonomy, preparing an Amharic question (AQ) dataset by labeling sample questions with their respective classes, and implementing an Amharic QC (AQC) model using a Convolutional Neural Network (CNN), a Deep Learning (DL) approach. The AQC uses a multilabel question taxonomy that integrates coarse-grained and fine-grained categories. This multilabel scheme helps retrieve answers more accurately than a flat taxonomy. We constructed the taxonomy by analyzing our AQ dataset and by adopting standard taxonomies from previous studies. We prepared the AQs in three forms: surface, stemmed, and lemmatized. We trained and tested on these datasets using a word vectorizer trained on surface words, having noticed that most interrogative words remain similar even when they are stemmed and lemmatized. We achieved 97% training accuracy and 90% validation accuracy for the surface AQs, and scored 40% for the stemmed AQs. However, the word2vec model could not represent the lemmatized AQs appropriately, so no results were obtained during training for that form. We also tried extracting features from the AQs by using different filters separately, which gave an accuracy of 86% while requiring an increasing number of training epochs.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/27559
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Amharic Question Classification	en_US
dc.subject	Deep Learning	en_US
dc.subject	CNN, Fine Grain	en_US
dc.subject	Coarse Grain Hierarchical Taxonomies	en_US
dc.subject	Word2vec	en_US
dc.title	Amharic Question Classification System Using Deep Learning Approach	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Saron Habtamu 2021.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format

Collections

Computer Science