Design of Amharic Anaphora Resolution Model

dc.contributor.advisor: Assabie, Yaregal (PhD)
dc.contributor.author: Dawit, Temesgen
dc.date.accessioned: 2018-06-25T12:34:43Z
dc.date.accessioned: 2023-11-04T12:22:31Z
dc.date.available: 2018-06-25T12:34:43Z
dc.date.available: 2023-11-04T12:22:31Z
dc.date.issued: 2014-04
dc.description.abstract: Anaphora resolution is the process of identifying the word or phrase to which a later expression in a text refers back, where the earlier word or phrase is typically more descriptive than the expression referring to it. The expression that refers back is called the anaphor, and the word or phrase it refers to is called the antecedent. Anaphora resolution is used as a component in NLP applications such as machine translation, information extraction, and question answering to increase their effectiveness. Building a complete anaphora resolution system that incorporates all linguistic information is complex and has not yet been achieved, because languages differ in nature and complexity. Amharic is even more challenging because of its rich morphology: in addition to independent anaphors, and unlike languages such as English, Amharic has anaphors embedded inside words (hidden anaphors). In this work, we propose an Amharic anaphora resolution model based on the knowledge-poor approach, which relies on low-level linguistic knowledge such as morphology and avoids the need for complex knowledge such as semantic analysis and world knowledge. The proposed model takes Amharic text as input and preprocesses it, tagging the text with word classes and chunks. Anaphors, both independent and hidden, and antecedents are then identified from the preprocessed data. The model handles both intrasentential and intersentential anaphors. Finally, the resolution process applies constraint and preference rules to identify the antecedent referred to by each anaphor. To evaluate the model, Amharic texts were collected from Walta Information Center (WIC) and the Amharic Holy Bible and used as datasets. The collected dataset was divided into training and testing sets using 10-fold cross-validation. On this dataset, the model achieved a success rate of 81.79% for the resolution of hidden anaphors and an accuracy of 70.91% for the resolution of independent anaphors.
Keywords: Amharic anaphora resolution, knowledge-poor anaphora resolution approach, hidden anaphors. (en_US)
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/3274
dc.language.iso: en (en_US)
dc.publisher: Addis Ababa University (en_US)
dc.subject: Amharic Anaphora Resolution (en_US)
dc.subject: Knowledge Poor Anaphora Resolution Approach (en_US)
dc.subject: Hidden Anaphors (en_US)
dc.title: Design of Amharic Anaphora Resolution Model (en_US)
dc.type: Thesis (en_US)
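
The abstract describes a resolution step that applies constraint and preference rules to choose an antecedent. The following is a minimal Python sketch of that general knowledge-poor idea, not the thesis's actual rule set: the `Mention` fields, the gender/number agreement constraint, and the recency and subject preferences are all illustrative assumptions, and the example uses English placeholder words with features assumed to be already extracted (for hidden anaphors, this would be done by a morphological analyzer).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A candidate antecedent or an anaphor (hypothetical feature set)."""
    text: str
    sentence: int            # index of the sentence containing the mention
    position: int            # token offset inside that sentence
    gender: str              # "m", "f" or "n"
    number: str              # "sg" or "pl"
    is_subject: bool = False


def precedes(candidate: Mention, anaphor: Mention) -> bool:
    """An antecedent must appear before the anaphor in the text."""
    return (candidate.sentence, candidate.position) < (anaphor.sentence, anaphor.position)


def agrees(anaphor: Mention, candidate: Mention) -> bool:
    """Constraint rule (assumed): gender and number must match."""
    return anaphor.gender == candidate.gender and anaphor.number == candidate.number


def preference_score(anaphor: Mention, candidate: Mention) -> int:
    """Preference rules (assumed): favor recent sentences and subjects."""
    distance = anaphor.sentence - candidate.sentence
    score = max(0, 2 - distance)      # same sentence: 2, previous sentence: 1
    if candidate.is_subject:
        score += 1
    return score


def resolve(anaphor: Mention, candidates: List[Mention]) -> Optional[Mention]:
    """Filter candidates with constraints, then rank survivors by preference."""
    viable = [c for c in candidates if precedes(c, anaphor) and agrees(anaphor, c)]
    if not viable:
        return None
    # Ties on score are broken in favor of the most recent mention.
    return max(viable, key=lambda c: (preference_score(anaphor, c), c.sentence, c.position))


candidates = [
    Mention("Abebe", 0, 0, "m", "sg", is_subject=True),
    Mention("Sara", 0, 2, "f", "sg"),
    Mention("books", 0, 4, "n", "pl"),
]
he = Mention("he", 1, 0, "m", "sg")
she = Mention("she", 1, 3, "f", "sg")
```

With these assumed features, `resolve(he, candidates)` selects the mention `Abebe` and `resolve(she, candidates)` selects `Sara`; an anaphor whose features match no preceding candidate yields `None`. Constraints remove impossible candidates outright, while preferences only rank the remaining ones, which is why no semantic or world knowledge is needed.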

Files

Original bundle
Name: Temesgen Dawit.pdf
Size: 1.18 MB
Format: Adobe Portable Document Format