Afaan Oromo Named Entity Recognition Using Hybrid Approach

dc.contributor.advisor: Midekso, Dida (PhD)
dc.contributor.author: Sani, Abdi
dc.date.accessioned: 2018-06-12T12:13:18Z
dc.date.accessioned: 2023-11-29T04:07:02Z
dc.date.available: 2018-06-12T12:13:18Z
dc.date.available: 2023-11-29T04:07:02Z
dc.date.issued: 2015-03
dc.description.abstract: Named Entity Recognition and Classification (NERC) is an essential and challenging task in Natural Language Processing (NLP), particularly for resource-scarce languages like Afaan Oromo (AO). It seeks to classify words that represent names in text into predefined categories such as person name, location, organization, date, and time. This work describes some attempts in this direction. Most researchers have applied Machine Learning (ML) to Afaan Oromo Named Entity Recognition (AONER), while none have used handcrafted rules or a hybrid approach for the Named Entity Recognition (NER) task. This thesis presents an AONER system using a hybrid approach that combines ML and rule-based components. The rule-based component consists of parsing, filtering, grammar rules, whitelist gazetteers, blacklist gazetteers, and exact matching components. The ML component consists of an ML model and a classifier. We used the General Architecture for Text Engineering (GATE) Developer tool for the rule-based component and Weka for the ML part. Using the algorithms and rules we developed, we identified Named Entities (NEs) in Afaan Oromo texts, such as names of persons, organizations, locations, and miscellaneous entities. Feature selection and rules are important factors in the recognition of Afaan Oromo Named Entities (AONE). Various rules were developed, including a prefix rule, suffix rule, clue word rule, context rule, and first name and last name rule. We used the AONER corpus of size 27588 developed by Mandefro [1]. From this corpus we used 23000 for training and 4588 for testing. We obtained an average result of 84.12% Precision, 81.21% Recall, and 82.52% F-Score. Keywords: Named Entity Recognition, Named Entities, GATE Developer, Weka, Afaan Oromo [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/547
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Named Entity Recognition; Named Entities; GATE Developer; Weka; Afaan Oromo [en_US]
dc.title: Afaan Oromo Named Entity Recognition Using Hybrid Approach [en_US]
dc.type: Thesis [en_US]
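
The abstract describes a rule-based component with whitelist gazetteers, blacklist gazetteers, and clue word rules, combined with an ML classifier. The following is a minimal Python sketch of how such a pipeline might be wired together. It is not the thesis implementation, which used GATE Developer and Weka. The gazetteer entries, the clue words, the rule ordering, and the names `tag_tokens` and `fallback` are all assumptions made for illustration.

```python
# Hypothetical gazetteers: illustrative entries only, not taken from the thesis.
WHITELIST = {
    "Finfinnee": "LOCATION",
    "Adaamaa": "LOCATION",
}
# Hypothetical blacklist: capitalized tokens that should never be tagged.
BLACKLIST = {"Kana", "Akka"}
# Hypothetical clue words: a title before a capitalized token suggests a person.
PERSON_CLUES = {"Obbo", "Aadde"}


def tag_tokens(tokens, fallback=lambda tok: "O"):
    """Label each token; "O" marks a token outside any named entity.

    Assumed rule order: blacklist, whitelist exact match, clue word rule,
    then `fallback`, which stands in for a trained ML classifier.
    """
    tagged = []
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else ""
        if tok in BLACKLIST:
            label = "O"
        elif tok in WHITELIST:
            label = WHITELIST[tok]
        elif prev in PERSON_CLUES and tok[:1].isupper():
            label = "PERSON"
        else:
            label = fallback(tok)
        tagged.append((tok, label))
    return tagged


if __name__ == "__main__":
    # Illustrative token sequence; the sentence is made up for this sketch.
    for token, label in tag_tokens(["Obbo", "Tolasaa", "gara", "Finfinnee", "deeme"]):
        print(token, label)
```

Checking the blacklist first means a blacklisted token is never tagged, even if the fallback classifier would label it. Whether the thesis resolves conflicts between the rule-based and ML components in this order is not stated in the abstract.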

Files

Original bundle
Name: Abdi Sani.pdf
Size: 1.39 MB
Format: Adobe Portable Document Format