AAU Institutional Repository

Afaan Oromo-English Cross-Lingual Information Retrieval (CLIR): a Corpus Based Approach

dc.contributor.advisor Teferi, Dereje (PhD)
dc.contributor.author Bekele, Daneal
dc.date.accessioned 2018-11-15T09:07:09Z
dc.date.available 2018-11-15T09:07:09Z
dc.date.issued 2011-06
dc.identifier.uri http://etd.aau.edu.et/handle/123456789/14245
dc.description.abstract The goal of Cross-Language Information Retrieval (CLIR) is to give users access to information written in a language different from that of their queries: a user issues a query in one language and retrieves documents in another. This is achieved by designing a system in which a query in one language can be compared with documents in another. Afaan Oromo is one of the major languages widely spoken and used in Ethiopia. Although Afaan Oromo has a large number of speakers, little research has aimed at making English documents available to them. This study is therefore an attempt to develop an Afaan Oromo-English CLIR system that enables native Afaan Oromo speakers to access and retrieve the vast online information sources available in English by writing queries in their own language. The study describes the development of a corpus-based CLIR system that uses word-based query translation for the Afaan Oromo-English language pair, and its evaluation on a corpus of test documents and queries prepared for this purpose. The approach requires parallel documents, which were collected from Bible chapters, legal documents, and some available religious documents. The system is evaluated with both monolingual and bilingual retrieval runs. In the monolingual run, Afaan Oromo queries are given to the system and Afaan Oromo documents are retrieved; in the bilingual run, the Afaan Oromo queries are translated into English and then used to retrieve English documents. For the bilingual run, the queries are translated into their English equivalents using a bilingual dictionary constructed from the collected parallel corpora. System performance was measured by recall and precision. 
In the first phase of the experimentation, maximum average precision values of 0.421 and 0.304 were obtained for the Afaan Oromo and English documents, respectively. The second phase performed slightly better than the first, with maximum average precision values of 0.468 and 0.316 for the Afaan Oromo and English documents, respectively. Therefore, with large and cleaned parallel Afaan Oromo-English document collections, it is possible to develop CLIR for the language pair. en_US
dc.language.iso en en_US
dc.publisher Addis Ababa University en_US
dc.subject Information Retrieval en_US
dc.title Afaan Oromo-English Cross-Lingual Information Retrieval (CLIR): a Corpus Based Approach en_US
dc.type Thesis en_US
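
The abstract describes word-based query translation through a bilingual dictionary, with evaluation by precision, recall, and average precision. The following is a minimal Python sketch of those two ideas, not the thesis implementation. The function names, the toy dictionary entries, the document identifiers, and the choice to keep out-of-dictionary terms unchanged are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Translate a query word by word using a bilingual dictionary.

    Each source term is replaced by all of its listed translations.
    Terms missing from the dictionary are kept unchanged (an assumption;
    the record does not say how the thesis handles such terms).
    """
    translated: List[str] = []
    for term in query.lower().split():
        translated.extend(dictionary.get(term, [term]))
    return translated


def precision_recall(retrieved: List[str], relevant: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall for one query."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision over a ranked result list."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document's rank
    return total / len(relevant) if relevant else 0.0


# Toy dictionary entries for illustration only; not taken from the thesis corpus.
toy_dictionary = {"bishaan": ["water"], "mana": ["house", "home"]}

english_terms = translate_query("Bishaan mana", toy_dictionary)

# Hypothetical ranked result list and relevance judgments.
ranked_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d5"}
p, r = precision_recall(ranked_docs, relevant_docs)
ap = average_precision(ranked_docs, relevant_docs)
```

In a real corpus-based system the dictionary would be derived from the aligned parallel documents rather than written by hand, and the translated terms would be passed to an English retrieval engine; both steps are outside this sketch.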

