Named Entity Recognition for Amharic Language

Ahmed Moges

dc.contributor.advisor	H/Mariam Sebsibe (PhD)
dc.contributor.author	Ahmed Moges
dc.date.accessioned	2018-06-21T12:50:20Z
dc.date.accessioned	2023-11-29T04:05:45Z
dc.date.available	2018-06-21T12:50:20Z
dc.date.available	2023-11-29T04:05:45Z
dc.date.issued	2010-11
dc.description.abstract	Named Entity Recognition (NER) is the process of identifying all named entities in a document and categorizing them into predefined classes such as person, organization, location, time, and numeral expressions. This identification and classification of proper names in text has recently come to be regarded as highly important in natural language processing (NLP), as it plays a significant role in many NLP applications, especially information extraction, information retrieval, machine translation, and question answering. This paper reports on the development of a NER system for Amharic using Conditional Random Fields (CRFs). Although this state-of-the-art machine learning method has been widely applied to NER in several well-studied languages, this is the first attempt to apply it to Amharic. The system uses features such as word and tag context features, part-of-speech (POS) tags of tokens, prefixes, and suffixes. Since feature selection plays a crucial role in the CRF framework, experiments were carried out to find the most suitable features for the Amharic NE tagging task. Four scenarios were considered, each based on a different combination of features. The first scenario used all the features; the second used all the features except the POS tags of tokens; the third and fourth used all the features except the prefix and the suffix, respectively. The experimental results differ across feature combinations. Scenario one achieved precision, recall, and F-measure of 72%, 75%, and 73.47% respectively, and was taken as the baseline for the remaining experiments. Scenarios two, three, and four achieved F-measures of 69.70%, 74.61%, and 70.65% respectively. 
From these results, it can be concluded that word context features, POS tags of tokens, and suffixes are important features for NE recognition and classification in Amharic text. Keywords: Named Entity Recognition, Conditional Random Fields, Named Entities, Amharic Named Entity Recognition.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/2741
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Named Entity Recognition; Conditional Random Fields; Named Entities; Amharic Named Entity Recognition	en_US
dc.title	Named Entity Recognition for Amharic Language	en_US
dc.type	Thesis	en_US
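
The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the reported F-measure is the balanced harmonic mean of precision and recall (the abstract does not state the formula, but the scenario-one numbers agree with it). The function name and dictionary labels are illustrative, not taken from the thesis.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Scenario one (all features): precision and recall from the abstract, in percent.
baseline_f = f_measure(72.0, 75.0)

# F-measures reported in the abstract for scenarios two, three, and four.
reported_f = {
    "without POS tags": 69.70,
    "without prefix": 74.61,
    "without suffix": 70.65,
}

# Change in F-measure relative to the all-features baseline.
deltas = {name: round(f - baseline_f, 2) for name, f in reported_f.items()}

print(f"baseline F-measure: {baseline_f:.2f}")
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

Under this assumption, removing POS tags or the suffix lowers the F-measure relative to the baseline, while removing the prefix raises it slightly. That pattern matches the abstract's conclusion, which names POS tags and suffixes, but not prefixes, among the important features.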

Collections

Environmental Science