Automatic Thesaurus Construction for Amharic Text Retrieval

dc.contributor.advisorMeshesha, Million (PhD)
dc.contributor.authorMekonnen, Andargachew
dc.date.accessioned2020-05-26T11:00:01Z
dc.date.accessioned2023-11-18T12:44:46Z
dc.date.available2020-05-26T11:00:01Z
dc.date.available2023-11-18T12:44:46Z
dc.date.issued2009-07
dc.description.abstractThesauri have been used for literary composition since their inception in 1852, but nowadays their primary use is in information retrieval. They are among the crucial components of retrieval systems, typically used to enhance indexing and to expand queries during searching. Even though Amharic has been a written language for a couple of centuries and huge volumes of Amharic electronic documents have accumulated, not much has been done towards developing effective and efficient Amharic retrieval systems. This research generates a thesaurus automatically for text retrieval in order to support the development of an effective and efficient Amharic retrieval system. The automatic thesaurus generation system is based on the WORDSPACE model, which is derived from the inverted file index by applying the Random Projection algorithm for dimensionality reduction. A Nearest Neighbor clustering algorithm is employed to generate the thesaurus automatically from the constructed WORDSPACE model. An encouraging result is obtained in experiments on Amharic Bible documents. During experimentation, the accuracy of the automatically generated thesaurus is evaluated; the result on a random sample of ten terms shows that the system has an accuracy of 58%. To further investigate its applicability for Amharic information retrieval, the thesaurus is integrated into an IR system for query expansion. The retrieval system is tested with and without the thesaurus in order to show the improvement made in retrieval effectiveness. Performance analysis shows that recall is higher when the thesaurus is used than when it is not. The average recall values are 73.34% and 37.29% after and before using the thesaurus for query expansion, respectively. Keywords: Amharic Thesaurus, WORDSPACE, Information Retrieval (IR)en_US
dc.identifier.urihttp://etd.aau.edu.et/handle/12345678/21316
dc.language.isoenen_US
dc.publisherAddis Ababa Universityen_US
dc.subjectAmharic Thesaurusen_US
dc.subjectWORDSPACEen_US
dc.subjectInformation Retrieval (IR)en_US
dc.titleAutomatic Thesaurus Construction for Amharic Text Retrievalen_US
dc.typeThesisen_US
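
The pipeline described in the abstract (inverted file index, random projection, nearest-neighbor grouping, query expansion) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the Gaussian projection matrix, the use of cosine similarity, and the whitespace tokenizer are all assumptions made for illustration, and the clustering step is simplified to a top-n nearest-neighbor lookup. Real Amharic text would also need language-specific preprocessing that is not shown here.

```python
import math
import random
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency}. Whitespace tokenization is an assumption."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term in text.split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def random_projection(index, n_docs, k, seed=0):
    """Project each term's document vector (length n_docs) down to k dimensions."""
    rng = random.Random(seed)
    # One Gaussian random row per document (hypothetical choice of projection matrix).
    proj = [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(k)] for _ in range(n_docs)]
    vectors = {}
    for term, postings in index.items():
        vec = [0.0] * k
        for doc_id, tf in postings.items():
            for j in range(k):
                vec[j] += tf * proj[doc_id][j]
        vectors[term] = vec
    return vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbours(vectors, term, n=3):
    """Return the n terms whose reduced vectors are most similar to `term`."""
    scored = [(cosine(vectors[term], vec), other)
              for other, vec in vectors.items() if other != term]
    scored.sort(reverse=True)
    return [other for _, other in scored[:n]]


def expand_query(query_terms, vectors, n=2):
    """Append each query term's nearest neighbours, keeping the original order."""
    expanded = list(query_terms)
    for term in query_terms:
        if term in vectors:
            for neighbour in nearest_neighbours(vectors, term, n):
                if neighbour not in expanded:
                    expanded.append(neighbour)
    return expanded
```

In this sketch, terms that occur in exactly the same documents receive identical reduced vectors and so rank as each other's closest neighbors; an expanded query then retrieves documents that contain a related term even when the original query term is absent, which is the mechanism by which query expansion can raise recall.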

Files

Original bundle
Name:
Andargachew Mekonnen.pdf
Size:
26.92 MB
Format:
Adobe Portable Document Format