Designing Amharic Definitive Question Answering






Addis Ababa University


The amount of available information is growing rapidly, especially with the proliferation of the Web. The problem users face is not a lack of documents or information but a lack of time to find a short, precise answer among the many documents available. Search engines return many links to web pages, ranked by a relevance measure against the user's query, but they cannot provide an exact answer. A new need has therefore emerged: the ability to obtain a brief and concise answer, which is the main goal of Question Answering (QA) systems. Although there are studies on factoid question answering for Amharic, no research has been conducted on a definition question answering system for Amharic, so we could not compare our results with any other work on the topic. Definition QA systems in other languages have been researched extensively and have shown reasonable outcomes. In this study, an attempt is made to design Amharic question answering for definitive questions. The proposed approach handles Amharic definition questions with the surface text pattern method, which has two main steps. First, it discovers a set of definition-related text patterns in an Amharic legal corpus. Then, using these patterns, it extracts a collection of concept-description pairs from a target document file and applies definition extraction to return an answer to a given question. The system achieved a nugget precision of 85.6%, a nugget recall of 73% and an F-measure of 78.8%. These results indicate that surface patterns are effective for answering Amharic definition questions. The major challenges in this study were extracting the definiendum from the user's question and extracting concept-description pairs from the corpus. As future work, sequence mining algorithms could be tested for extracting concept-description relationships from the corpus.
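The two-step surface pattern method and the reported scores can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not list the Amharic patterns, so the English patterns ("X means Y", "X is defined as Y"), the "What is X?" question template, the sample sentences and all function names below are hypothetical stand-ins. The last function only checks that the reported precision and recall are consistent with a balanced (beta = 1) F-measure.

```python
import re

# ASSUMPTION: English stand-ins for definition patterns. The study's actual
# Amharic surface patterns are not given in the abstract.
DEFINITION_PATTERNS = [
    re.compile(r'^"?(?P<concept>[^"]+?)"?\s+means\s+(?P<description>.+?)\.?$'),
    re.compile(r'^"?(?P<concept>[^"]+?)"?\s+is defined as\s+(?P<description>.+?)\.?$'),
]

# ASSUMPTION: a single hypothetical question template for definiendum extraction.
QUESTION_PATTERN = re.compile(r'^what is (?:an? |the )?(?P<term>.+?)\??$', re.IGNORECASE)


def extract_pairs(sentences):
    """Collect concept-description pairs from sentences that match a pattern."""
    pairs = {}
    for sentence in sentences:
        for pattern in DEFINITION_PATTERNS:
            match = pattern.match(sentence.strip())
            if match:
                pairs[match.group("concept").lower()] = match.group("description")
                break  # the first matching pattern wins
    return pairs


def answer(question, pairs):
    """Extract the definiendum from the question and look up its description."""
    match = QUESTION_PATTERN.match(question.strip())
    if not match:
        return None
    return pairs.get(match.group("term").lower())


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    corpus = [
        '"Person" means any natural or juridical person.',
        "Contract is defined as an agreement between two or more parties.",
        "The court shall sit in Addis Ababa.",  # matches no definition pattern
    ]
    pairs = extract_pairs(corpus)
    print(answer("What is a contract?", pairs))
    print(round(f_measure(0.856, 0.73), 3))
```

With precision 0.856 and recall 0.73, the balanced F-measure rounds to 0.788, which agrees with the 78.8% reported above.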


