Tigrigna Question Answering System for Factoid Questions

Date

6/17/2016

Publisher

Addis Ababa University

Abstract

Accessing relevant information is a major problem for Tigrigna language users in every domain of knowledge, especially when dealing with the huge amount of information on the Internet. Users typically want a specific, precise answer to a specific question, yet obtaining a relevant and concise answer to a particular user question remains a challenge. A Tigrigna question answering (QA) system is a good solution for this situation. The proposed QA system comprises question analysis, document analysis, and answer extraction modules. The question analysis module takes a Tigrigna question as input, generates a query, expands the query, and determines its question particle and question type. A statistical language model approach is used to classify Tigrigna questions by category or type. The document analysis module pre-processes parallel corpora, which are documents that contain question sentences in one document and answer sentences in another, and also ranks and extracts answer contents. The answer extraction module performs a detailed analysis of the retrieved answer contents based on the question type, question particle, and query, using a language modeling technique called the Answer Model. This statistical language model extracts the exact and precise Tigrigna answer probabilistically from the set of candidate answers. The system was developed after a review of the literature and related work and the selection of appropriate tools and data sources: Moses, GIZA++, and IRSTLM as tools, and various websites and Tigrigna newspapers and magazines as data sources. The data sets are divided into training and testing sets: we collected around 1000 data sets for training and 200 for testing.
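To illustrate the idea of language-model-based question classification, the following is a minimal sketch only, not the system described in this work. It assumes one add-one-smoothed unigram model per question type and a uniform prior over types; the function names, the type labels, and the English toy questions are all hypothetical stand-ins (the actual system works on Tigrigna text and uses tools such as IRSTLM).

```python
import math
from collections import Counter

def train(examples):
    """Build one unigram count table per question type.

    examples: list of (question_text, question_type) pairs (toy data here).
    """
    counts, vocab = {}, set()
    for text, qtype in examples:
        tokens = text.lower().split()
        counts.setdefault(qtype, Counter()).update(tokens)
        vocab.update(tokens)
    return counts, vocab

def log_prob(tokens, counter, vocab_size):
    """Log-likelihood of tokens under an add-one-smoothed unigram model.

    One extra vocabulary slot is reserved for unseen words.
    """
    total = sum(counter.values())
    return sum(
        math.log((counter[t] + 1) / (total + vocab_size + 1)) for t in tokens
    )

def classify(question, counts, vocab):
    """Return the question type whose model scores the question highest."""
    tokens = question.lower().split()
    return max(counts, key=lambda q: log_prob(tokens, counts[q], len(vocab)))

# Hypothetical English toy data; the labels are illustrative only.
TOY = [
    ("who wrote the book", "PERSON"),
    ("who is the president", "PERSON"),
    ("where is the city", "PLACE"),
    ("where was he born", "PLACE"),
    ("when did the war end", "TIME"),
    ("when was it built", "TIME"),
]

counts, vocab = train(TOY)
print(classify("who built the bridge", counts, vocab))
print(classify("where is the river", counts, vocab))
```

In this sketch the question word dominates the score, which loosely mirrors the role the abstract gives to the question particle in determining the question type.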
Performance was evaluated manually by comparing the system's answers with the answers in the test document prepared for this purpose. The evaluation results of the Tigrigna factoid QA system are expressed in terms of the average performance of the question type classifier, which is 87%, and the average precision, recall, and F-measure of answer extraction, which are 88.5%, 85.9%, and 87.2%, respectively. Keywords: Tigrigna question answering, Tigrigna factoid questions, language model based question classification, question analysis, document analysis, answer extraction.
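As a quick consistency check on the reported scores, the short sketch below assumes the F-measure is the balanced F1 score, that is, the harmonic mean of precision and recall. The abstract does not state which variant was used, so this is an assumption, and the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
f1 = f_measure(0.885, 0.859)
print(f"{f1:.1%}")  # prints 87.2%
```

Under that assumption, the reported precision of 88.5% and recall of 85.9% give an F-measure of about 87.2%, which matches the reported value.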

Keywords

Tigrigna Question Answering; Tigrigna Factoid Questions; Language Model Based Question Classification; Question Analysis; Document Analysis; Answer Extraction
