Query Expansion for Tigrigna Information Retrieval

Addis Ababa University


This research aims to improve the precision and recall of a Tigrigna information retrieval (IR) system by integrating a query expansion mechanism. Query expansion is an effective way to control the effect of polysemous and synonymous query terms. The main reason for integrating it is to increase the retrieval of documents relevant to the user's query, based on the correct sense of the query terms. The study discriminates among the meanings of a polysemous term using word sense disambiguation (WSD) and finds synonymous terms for reformulating the user's query. The proposed algorithm determines the senses of synonymous and polysemous words in the user's query using a Tigrigna WordNet. In this study, we experiment with a root-form Tigrigna WordNet and Tigrigna morphological analysis in IR for the first time. Following the idea of the N-gram model, WSD is performed by checking whether the words that occur with an ambiguous query term match its synsets and related words in the Tigrigna WordNet. The aim of WSD is to identify the correct sense of each ambiguous term in the user's query and to select the synonyms of that sense. The selected synonyms are then added to the original query, and the modified query is used to retrieve the final results. The experimental results were obtained in two ways: first, the prior IR system was tested with morphological analysis in place of a stemmer; second, the IR system was tested with the query expansion model integrated. The experiments show encouraging results: using morphological analysis before query expansion registered a performance of 9% precision and 1.6% recall, and expanding the query with synsets registered an improvement of 12% precision and 4% recall in overall performance. The number of words related to each polysemous term is limited by the lack of resources; the query expansion terms are therefore limited to the information available in the WordNet.
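The disambiguation and expansion steps described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the toy WordNet, its English placeholder entries, and the function names `context_window`, `pick_sense`, and `expand_query` are all assumptions made for illustration, and query tokens are assumed to have already been reduced to root forms by morphological analysis. Each sense of an ambiguous term is scored by how many of its related words appear among the neighbouring query terms (an N-gram style window), and the synonyms of the best-scoring sense are appended to the query.

```python
from typing import Dict, List, Optional

Sense = Dict[str, List[str]]

# Hypothetical toy WordNet: root form -> list of senses.
# English placeholders stand in for Tigrigna root forms.
TOY_WORDNET: Dict[str, List[Sense]] = {
    "bank": [
        {"synonyms": ["lender"], "related": ["money", "loan", "account"]},
        {"synonyms": ["shore"], "related": ["river", "water", "boat"]},
    ],
}


def context_window(tokens: List[str], index: int, n: int = 2) -> List[str]:
    """Return up to n neighbours on each side of tokens[index]."""
    start = max(0, index - n)
    return tokens[start:index] + tokens[index + 1:index + 1 + n]


def pick_sense(term: str, context: List[str],
               wordnet: Dict[str, List[Sense]]) -> Optional[Sense]:
    """Pick the sense whose related words overlap the context most.

    Returns None when the term is unknown or no sense overlaps.
    """
    best, best_score = None, 0
    for sense in wordnet.get(term, []):
        score = len(set(context) & set(sense["related"]))
        if score > best_score:
            best, best_score = sense, score
    return best


def expand_query(tokens: List[str],
                 wordnet: Dict[str, List[Sense]] = TOY_WORDNET,
                 n: int = 2) -> List[str]:
    """Append synonyms of the selected sense of each ambiguous term."""
    expanded = list(tokens)
    for i, term in enumerate(tokens):
        sense = pick_sense(term, context_window(tokens, i, n), wordnet)
        if sense is not None:
            for synonym in sense["synonyms"]:
                if synonym not in expanded:
                    expanded.append(synonym)
    return expanded


print(expand_query(["river", "bank", "boat"]))
```

In this sketch a term with no matching context is left unexpanded, which mirrors the limitation noted above: expansion can only draw on the related words recorded in the WordNet.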



Word Sense Disambiguation, WordNet, N-gram, Information Retrieval