Application of Multilingual Thesauri for Cross Language Information Retrieval (CLIR): Amharic-English Cross Language Information Retrieval for Legal Environment

Date

2005-06

Publisher

Addis Ababa University

Abstract

Cross-language retrieval tools must be able to translate queries and/or documents in order to make information accessible. Where efficient translation systems are lacking for a language pair, as is the case for Amharic and English, intermediate tools such as thesauri can be used to translate queries. This research developed parallel thesauri for business law and used them to test retrieval based on thesaurus-driven query translation across the two languages. The thesauri were built using the Commercial Code of Ethiopia as a representative corpus, with domain experts in law carrying out the conceptual analysis and facet determination; this procedure was supplemented by machine-assisted indexing. An in-house retrieval system was developed to test the multilingual thesauri. The tests use concept-based translation of queries, in contrast to the word-based translation typical of machine translation. Queries were collected from legal experts, and a document collection in the area of law was assembled from research abstracts. To test retrieval performance, queries were translated into their equivalent concepts using the thesauri, and those concepts were used to query the collection in the target language. Retrieval output, measured in terms of precision and recall, shows promising results: relevant documents were retrieved across languages. However, performance could be improved if other resources were made available, and the study therefore recommends additional tools and approaches for achieving better retrieval performance.
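The concept-based query translation and the precision/recall evaluation described in the abstract can be illustrated with a minimal sketch. It is not the in-house system the thesis developed. The thesaurus entries, concept identifiers, document texts, and all function names below are hypothetical and chosen only for illustration, and the whitespace tokenization is an assumption.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical parallel thesaurus: concept id -> terms per language.
# These entries are illustrative and are NOT taken from the thesis's thesauri.
THESAURUS: Dict[str, Dict[str, List[str]]] = {
    "C001": {"am": ["ውል"], "en": ["contract"]},
    "C002": {"am": ["ንግድ"], "en": ["trade", "commerce"]},
}


def build_term_index(lang: str) -> Dict[str, str]:
    """Map every term of one language to its concept id."""
    return {
        term: concept_id
        for concept_id, terms in THESAURUS.items()
        for term in terms[lang]
    }


def translate_query(query: str, source: str, target: str) -> List[str]:
    """Translate a query via concepts: source term -> concept -> target terms."""
    index = build_term_index(source)
    target_terms: List[str] = []
    for token in query.lower().split():  # assumed whitespace tokenization
        concept_id = index.get(token)
        if concept_id is not None:
            target_terms.extend(THESAURUS[concept_id][target])
    return target_terms


def retrieve(terms: List[str], docs: Dict[str, str]) -> Set[str]:
    """Return ids of documents sharing at least one token with the query terms."""
    wanted = set(terms)
    return {
        doc_id
        for doc_id, text in docs.items()
        if wanted & set(text.lower().split())
    }


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Compute precision and recall, returning 0.0 when a denominator is empty."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical English document collection and relevance judgment.
DOCS = {
    "d1": "formation of a contract under the commercial code",
    "d2": "criminal procedure",
    "d3": "regulation of trade licences",
}

english_terms = translate_query("ንግድ ውል", source="am", target="en")
retrieved = retrieve(english_terms, DOCS)
precision, recall = precision_recall(retrieved, relevant={"d1"})
```

Routing the query through concept identifiers rather than word-to-word pairs lets one source term expand to every target-language term for the same concept, which is the contrast with word-based translation that the abstract draws.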

Keywords

Information Science
