Natural Language Based Semantic Question Answering Over Linked Data for Amharic Language



Addis Ababa University


As an enormous amount of structured data in the Amharic language has been produced on the Web and made available through online data portals, intuitive ways of accessing this data have become increasingly important. Researchers have proposed question answering approaches for other languages. However, because these approaches are language-specific, they cannot capture the grammatical constructions and statement formation of Amharic. On the other hand, various studies have proposed retrieving information from large repositories of Amharic text documents using keyword-based and semantic-based search. These systems do not deliver direct answers to the user; instead, they retrieve documents containing the needed information, which the user must then scan to find it. In this research, an effort has been made to design a new approach that allows users to formulate a question in natural-language Amharic, using their own terminology, and receive a direct answer. The core components of the approach are Word embedding, Data indexing, Query template generation, Resource matching and disambiguation, and Query ranking and execution. The Word embedding component constructs vector representations of words based on the statistical distribution of word co-occurrences in an Amharic text corpus. Data indexing builds indices to speed up resource matching. Query template generation interprets the user query using a neural semantic parser and generates the corresponding domain-independent query template. Resource matching and disambiguation grounds the domain-independent query template in a given linked dataset by matching resources and disambiguating datasets, producing domain-specific queries. This component produces several candidate query templates, which are ranked; the Query ranking and execution component then selects the top-ranked query and executes it to retrieve answers. The approach is evaluated using a test benchmark on an Amharic linked dataset. The benchmark comprises 50 Amharic questions annotated with their corresponding query templates and answers. The approach achieved an average recall of 0.58, an average precision of 0.43, and an average F-measure of 0.50.
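
The averaged scores reported above can be illustrated with a minimal sketch of per-question evaluation. It assumes, hypothetically, that each benchmark question has a set of gold answers and a set of returned answers, that precision, recall, and F-measure are computed per question, and that each is then averaged over all questions. The function names and the toy data are illustrative only and are not taken from the thesis.

```python
from typing import Iterable, Set, Tuple

Scores = Tuple[float, float, float]  # (precision, recall, F-measure)


def question_scores(gold: Set[str], returned: Set[str]) -> Scores:
    """Precision, recall and F-measure for a single question."""
    correct = len(gold & returned)
    precision = correct / len(returned) if returned else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


def macro_average(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> Scores:
    """Average each metric over all (gold, returned) answer-set pairs."""
    scores = [question_scores(gold, returned) for gold, returned in pairs]
    n = len(scores)
    if n == 0:
        return 0.0, 0.0, 0.0
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


# Hypothetical toy benchmark: three questions with made-up answer identifiers.
toy_benchmark = [
    ({"a", "b"}, {"a"}),       # one of two gold answers found
    ({"c"}, {"c", "d"}),       # gold answer found plus one wrong answer
    ({"e"}, set()),            # no answer returned
]
avg_precision, avg_recall, avg_f = macro_average(toy_benchmark)
print(f"P={avg_precision:.2f} R={avg_recall:.2f} F={avg_f:.2f}")
```

Under this averaging scheme, the mean F-measure is generally not the harmonic mean of the mean precision and mean recall, so the three averaged figures need not satisfy the F-measure formula exactly.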



Semantic Querying, Question Answering, Neural Word Embedding, Neural Semantic Parser, Semantic Matching, Resource Disambiguation, Template Generator