Morphological Analyzer for Afaan Oromoo Using Machine Learning
Date
2020-03-03
Publisher
Addis Ababa University
Abstract
This thesis describes the development of a morphological analysis system for Afaan Oromoo. Morphological analysis segments words into their component morphemes and assigns grammatical information to those morphemes. It has been studied extensively for many languages, but this work is among the few that address it for Afaan Oromoo. A robust morphological analyzer would benefit many natural language processing applications. A new Afaan Oromoo morphological analyzer based on memory-based learning is proposed. Unlike eager learners, memory-based learning techniques keep all training data available for classification and extrapolation without making any abstraction. Because of this lazy property, memory-based learning has achieved higher accuracy than eager methods on many language processing tasks. The proposed morphological analyzer has two main components: a training phase and an analysis phase. The training phase comprises the components needed to train the learning component of memory-based learning. The analysis phase maps the input into output. A morphological database containing grammatical information for Afaan Oromoo nouns, verbs and adjectives has been developed. It contains 2270 annotated words. We evaluated the system in four experimental scenarios using two memory-based algorithms, IB1 and IGTREE. The maximum generalization accuracy was 98.86% for IB1 with interleaving and 94.36% for IGTREE with feature selection and interleaving. The results show that selecting the combination of features with the highest accuracy plays a vital role under both default and optimal parameter settings. An examination of the influence of individual features confirmed that the best combination of features was used.
Overall, the algorithms and techniques used in this research performed well.
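The lazy, memory-based idea described in the abstract can be illustrated with a minimal sketch of an IB1-style classifier: every training instance is stored unchanged, and a new instance receives the label of its nearest stored neighbour under an overlap distance. The windowing scheme, the label set ("0" for no boundary, "B" for a morpheme boundary), the unweighted distance, and the toy words below are all assumptions made for illustration; they are invented strings, not Afaan Oromoo data, and this is not the thesis's feature set or implementation.

```python
from collections import Counter


def windows(word, left=2, right=2, pad="_"):
    """Turn a word into one fixed-width character window per letter."""
    padded = pad * left + word + pad * right
    width = left + 1 + right
    return [tuple(padded[i:i + width]) for i in range(len(word))]


def overlap_distance(a, b):
    """Count the feature positions where two instances disagree."""
    return sum(x != y for x, y in zip(a, b))


class IB1:
    """Lazy learner: training only stores instances, nothing is abstracted."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, instances, labels):
        self.memory = list(zip(instances, labels))
        return self

    def predict(self, instance):
        # Rank all stored instances by distance and let the k nearest vote.
        ranked = sorted(self.memory,
                        key=lambda m: overlap_distance(instance, m[0]))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


# Hypothetical toy data: made-up stems with a made-up suffix "s".
TRAIN = {
    "taba": "0000",
    "tabas": "0000B",
    "miko": "0000",
    "mikos": "0000B",
}

instances, labels = [], []
for word, tags in TRAIN.items():
    instances.extend(windows(word))
    labels.extend(tags)

model = IB1(k=1).fit(instances, labels)


def analyze(word):
    """Label each letter of an unseen word with its nearest neighbour's tag."""
    return [model.predict(w) for w in windows(word)]


print(analyze("lunas"))  # ['0', '0', '0', '0', 'B']
```

In this sketch the unseen word "lunas" gets a boundary before its final letter because its last window differs from the stored window for the "s" of "tabas" in only one position. An IGTREE-style learner would instead compress the stored instances into a decision tree ordered by feature informativeness, trading some accuracy for speed.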
Keywords
Morphological Analysis, Memory Based Learning, Afaan Oromoo