Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/15412
Title: A Generic Approach towards All Words Amharic Word Sense Disambiguation
Contributors: Dr. Million Meshesha
Siraj, Dureti
Issue Date: Feb-2017
Publisher: Addis Ababa University
Abstract: Word sense disambiguation (WSD) is an “intermediate task” that supports other NLP tasks such as machine translation, information retrieval, hypertext navigation, content and thematic analysis, grammatical analysis, speech processing and text processing. This study explores a more general approach to developing a WSD system for the Amharic language. To this end, a WSD system is developed that identifies the sense of an ambiguous Amharic word using information from tagged example sentences and a Word-Net. The system identifies the sense by measuring the similarity between the input sentence and the tagged example sentences. Two similarity measures are explored: cosine similarity and the Jaccard coefficient. We collected 100 example sentences for each sense of the selected ambiguous Amharic words. The Word-Net is composed of words with their synonyms and gloss definitions. The performance of the system is tested using 9 nouns, 3 verbs, 3 adjectives and 2 adverbs, a total of 17 randomly selected words. The experiments disambiguate one target word in a given text. They are designed so that the performance of cosine similarity and the Jaccard coefficient is first checked individually for WSD; the Lesk algorithm is then tested in a third experiment; and finally, further experiments check the performance of the two similarity measures when each is combined with the Lesk algorithm. The results show that the Jaccard coefficient combined with the Lesk algorithm achieves the highest accuracy, 89.83%. The major challenge in the disambiguation process is that the system yields its lowest accuracy for words that frequently collocate with similar words across their different senses.
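The similarity-based step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the use of raw term counts, the choice to score each sense by its best-matching example sentence, and the English placeholder sentences are all assumptions made for illustration, and the Lesk combination is not shown.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_coefficient(a, b):
    """Jaccard coefficient between the token sets of two token lists."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def disambiguate(context, tagged_examples, similarity):
    """Pick the sense whose most similar tagged example best matches the context.

    Assumption: a sense is scored by the maximum similarity over its example
    sentences; the abstract does not state how scores are aggregated.
    """
    best_sense, best_score = None, -1.0
    for sense, sentences in tagged_examples.items():
        score = max((similarity(context, s) for s in sentences), default=0.0)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical English placeholder data standing in for tagged Amharic examples.
examples = {
    "river_bank": [
        ["the", "river", "bank", "was", "muddy"],
        ["fish", "near", "the", "bank", "of", "the", "river"],
    ],
    "financial_bank": [
        ["she", "deposited", "money", "in", "the", "bank"],
        ["the", "bank", "approved", "the", "loan"],
    ],
}
context = ["he", "withdrew", "money", "from", "the", "bank"]

print(disambiguate(context, examples, cosine_similarity))
print(disambiguate(context, examples, jaccard_coefficient))
```

Both measures only see surface word overlap, which is consistent with the challenge noted in the abstract: senses whose example sentences share the same frequent collocates receive nearly identical scores.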
Description: A thesis submitted to Addis Ababa University in partial fulfillment of the requirements for the degree of Master of Science in Information Science
URI: http://hdl.handle.net/123456789/15412
Appears in Collections: Thesis - Information Science

Files in This Item:
File: Dureti Siraj Bekeli.pdf
Size: 2.04 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.