Discovering Words Association to Enhance the Effectiveness of Amharic Information Retrieval System

Date

2017-02-01

Publisher

Addis Ababa University

Abstract

Given the tremendous growth of the World Wide Web, it is increasingly necessary to develop tools that assist users looking for information. The techniques used by search engines are based on statistical and syntactic analysis and do not consider the semantics contained in the user's request and in the available documents. However, users trying to solve a problem are often interested not in information similar to the query, but in information that bears an interesting relation to it. This study attempted to understand the user's query by expanding it with semantically associated terms. Using the design science research methodology, we developed a prototype. Amharic local news articles available in electronic form were used as the data source. The system extracts keywords using the term frequency-inverse document frequency (TF-IDF) weighting scheme and then determines the associations among terms using the Apriori algorithm. We evaluated the system by comparing its results with the results obtained before query expansion, using precision, recall, and F-measure. The experimental results show that recall improved by 17% and F-measure by 7.5%, while precision declined by 2.8%. Search engines can perform remarkably well when they retrieve what users need directly. The analysis found that the semantic association-based model tends to retrieve more relevant documents, enhancing the recall of the Amharic information retrieval system. The expansion terms may not be relevant to the user's query even though they are strongly associated with it; this was the main challenge for the word association-based retrieval system. We recommend integrating mechanisms for controlling the similarity of terms into the association-based retrieval model to enhance precision.
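
The pipeline described in the abstract (TF-IDF keyword extraction, Apriori-style association mining, then query expansion) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the thresholds (top_k, min_support, min_conf), and the restriction of Apriori to itemsets of size one and two are all assumptions made for illustration, and English placeholder tokens stand in for preprocessed Amharic terms.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_keywords(docs, top_k=2):
    """Return, per tokenized document, the set of its top_k TF-IDF terms."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    keywords = []
    for doc in docs:
        counts = Counter(doc)
        weights = {
            t: (c / len(doc)) * math.log(n_docs / doc_freq[t])
            for t, c in counts.items()
        }
        # Sort by descending weight; break ties alphabetically.
        ranked = sorted(weights, key=lambda t: (-weights[t], t))
        keywords.append(set(ranked[:top_k]))
    return keywords


def frequent_pairs(transactions, min_support):
    """Apriori restricted to 1- and 2-itemsets (an illustrative simplification).

    transactions is a list of sets of terms. Returns the support of each
    frequent pair and the raw count of each single term.
    """
    n = len(transactions)
    item_counts = Counter(t for tx in transactions for t in tx)
    frequent_items = {t for t, c in item_counts.items() if c / n >= min_support}
    pair_counts = Counter()
    for tx in transactions:
        # Apriori pruning: candidate pairs use only frequent single terms.
        for pair in combinations(sorted(tx & frequent_items), 2):
            pair_counts[pair] += 1
    pairs = {p: c / n for p, c in pair_counts.items() if c / n >= min_support}
    return pairs, item_counts


def expand_query(query_terms, transactions, min_support=0.5, min_conf=0.6):
    """Append terms associated with a query term by a rule src -> dst."""
    pairs, item_counts = frequent_pairs(transactions, min_support)
    n = len(transactions)
    expanded = list(query_terms)
    for (a, b), support in sorted(pairs.items()):
        for src, dst in ((a, b), (b, a)):
            # confidence(src -> dst) = count(src, dst) / count(src)
            confidence = support * n / item_counts[src]
            if src in query_terms and dst not in expanded and confidence >= min_conf:
                expanded.append(dst)
    return expanded


if __name__ == "__main__":
    # Hypothetical tokenized news articles (placeholder tokens).
    docs = [
        ["election", "vote", "election", "party"],
        ["election", "vote", "ballot"],
        ["football", "goal", "football", "league"],
    ]
    transactions = tfidf_keywords(docs, top_k=3)
    print(expand_query(["election"], transactions))
```

Expanding a query this way adds terms that co-occur strongly with the query terms rather than terms that are similar in meaning. That is consistent with the reported pattern of higher recall and slightly lower precision.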

Keywords

Discovering Words
