Afaan Oromo-English Cross-Lingual Information Retrieval (CLIR): A Corpus Based Approach

Bekele, Daniel

dc.contributor.advisor	Teferi, Dereje (PhD)
dc.contributor.author	Bekele, Daniel
dc.date.accessioned	2018-11-26T12:48:41Z
dc.date.accessioned	2023-11-29T04:56:47Z
dc.date.available	2018-11-26T12:48:41Z
dc.date.available	2023-11-29T04:56:47Z
dc.date.issued	2011-06
dc.description.abstract	The goal of Cross-Language Information Retrieval (CLIR) is to provide users with access to information that is in a different language from their queries. It gives users the ability to issue a query in one language and retrieve documents in another. This is achieved by designing a system in which a query in one language can be compared with documents in another. Afaan Oromo is one of the major languages widely spoken and used in Ethiopia. Although Afaan Oromo has a large number of speakers, little research has aimed at making English documents available to Afaan Oromo speakers. This study is, therefore, an attempt to develop an Afaan Oromo-English CLIR system that enables native Afaan Oromo speakers to access and retrieve the vast online information sources available in English by writing queries in their own language. The study describes the development of a corpus-based CLIR system that uses word-based query translation for the Afaan Oromo-English language pair, and its evaluation on a corpus of test documents and queries prepared for this purpose. This approach requires parallel documents; these were collected from Bible chapters, legal documents, and some available religious documents. The system is evaluated in both monolingual and bilingual retrieval runs. In the monolingual run, Afaan Oromo queries are given to the system and Afaan Oromo documents are retrieved; in the bilingual run, the Afaan Oromo queries are translated into English before being given to the system to retrieve English documents. For the bilingual run, Afaan Oromo queries are translated into their English equivalents using a bilingual dictionary constructed from the collected parallel corpora. The performance of the system was measured by recall and precision. 
In the first phase of the experimentation, maximum average precision values of 0.421 and 0.304 are obtained for the Afaan Oromo and English documents, respectively. The second phase of experimentation performs slightly better than the first: maximum average precision values of 0.468 and 0.316 are obtained for the Afaan Oromo and English documents, respectively. Therefore, with the use of large and cleaned parallel Afaan Oromo-English document collections, it is possible to develop a CLIR system for the language pair.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/14517
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Cross-Lingual Information Retrieval (Clir)	en_US
dc.title	Afaan Oromo-English Cross-Lingual Information Retrieval (Clir): A Corpus Based Approach	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Daniel Bekele.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format

Collections

Health Informatics