Addis Ababa University


The Web is a repository of a huge amount of information and is one of many information sources that people use in their day-to-day activities. This information may be presented in different languages. Retrieving information from the Web requires search engines. General-purpose search engines such as Google, Yahoo, and MSN are designed mainly for the English language. Their shortcomings become apparent when they are applied to non-English languages such as Afaan Oromo, because they do not account for the specific characteristics of such languages. This research work presents the design and a prototype of a search engine for Afaan Oromo texts. The search engine consists of three main components, each optimized for Afaan Oromo: a crawler, an indexer, and a query engine. The crawler downloads documents, and its categorizer subcomponent filters them to identify those written in Afaan Oromo. Documents identified as Afaan Oromo are then preprocessed and stored in an index for later retrieval. Finally, queries submitted through an interface to the query engine are preprocessed and matched against the index, and the matching documents are displayed through the interface in ranked order. The performance of the search engine is evaluated using a selected set of documents and queries. According to the precision-recall measures employed, a precision of 76% on the top 10 results and an average precision of 93% are obtained. Experiments on some specific features of the language are also conducted against the design requirements.

Key words: Information Retrieval, Search Engine, Categorizer, Afaan Oromo
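The two evaluation figures reported above (precision on the top 10 results and average precision) can be illustrated with a minimal sketch. It assumes the standard information-retrieval definitions of precision at k and average precision; the abstract does not state the exact formulas used, so this may differ from how the reported values were computed. The function names, document identifiers, and relevance judgments below are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values taken at each rank where a relevant
    document appears, divided by the total number of relevant documents
    (standard definition; assumed here, not taken from the source)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical ranked result list and relevance judgments for one query.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d5"}

p_at_5 = precision_at_k(ranked, relevant, k=5)
ap = average_precision(ranked, relevant)
print(f"P@5 = {p_at_5:.2f}, AP = {ap:.2f}")
```

In an evaluation over several queries, these per-query values would typically be averaged across the query set.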



