Morphological Analyzer for Afaan Oromoo Using Machine Learning

Date

2020-03-03

Publisher

Addis Ababa University

Abstract

This thesis describes the development of a morphological analysis system for Afaan Oromoo. Morphological analysis segments words into their component morphemes and assigns grammatical information to the resulting morphemes. Morphological analysis has been studied extensively for many languages, but this work is among the few that address it for Afaan Oromoo. A robust morphological analyzer would benefit many natural language processing applications. We propose a new Afaan Oromoo morphological analyzer based on memory-based learning. Unlike eager learners, memory-based learning techniques keep all training data available for classification and extrapolation without abstracting from it. Because of this lazy property, memory-based learning has achieved higher accuracy than eager methods on many language processing tasks. The proposed analyzer has two main phases: a training phase and an analysis phase. The training phase comprises the components needed to train the memory-based learner. The analysis phase maps input words to their output analyses. We developed a morphological database containing grammatical information for Afaan Oromoo nouns, verbs, and adjectives; it contains 2270 annotated words. We evaluated the system in four experimental scenarios using two memory-based algorithms, IB1 and IGTREE. The maximum generalization accuracy was 98.86% for IB1 with interleaving and 94.36% for IGTREE with feature selection and interleaving. The results show that selecting the combination of features with the highest accuracy plays a vital role under both default and optimal parameter settings. An examination of the influence of individual features confirmed that we used the best combination of features. Overall, the algorithms and techniques used in this research performed well.
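
To make the idea of lazy, memory-based classification concrete, the following is a minimal sketch of an IB1-style nearest-neighbor classifier applied to character windows. It is not the thesis implementation. The window size, the `O`/`B` boundary labels, the helper names (`make_windows`, `overlap_distance`, `IB1Sketch`, `analyze`), and the English toy words are all assumptions introduced for illustration and are not drawn from the thesis data. The sketch also simplifies IB1 by voting over the k nearest stored instances and omits feature weighting; IGTREE is not shown.

```python
from collections import Counter


def make_windows(word, labels=None, left=2, right=2, pad="_"):
    """Build one fixed-width character window per letter of `word`.

    Each window is centered on one letter; `labels[i]` (if given) is the
    class of the letter at position i. Window sizes are illustrative.
    """
    padded = pad * left + word + pad * right
    width = left + 1 + right
    windows = [tuple(padded[i:i + width]) for i in range(len(word))]
    if labels is None:
        return windows
    return list(zip(windows, labels))


def overlap_distance(a, b):
    """Overlap metric: number of feature positions whose values differ."""
    return sum(x != y for x, y in zip(a, b))


class IB1Sketch:
    """Lazy learner: training only stores instances, with no abstraction."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (features, label) pairs

    def fit(self, instances):
        self.memory.extend(instances)

    def predict(self, features):
        # Rank every stored instance by distance to the query, then vote.
        ranked = sorted(self.memory,
                        key=lambda inst: overlap_distance(features, inst[0]))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


def analyze(model, word):
    """Label each letter of `word`; 'B' marks an assumed suffix start."""
    return "".join(model.predict(w) for w in make_windows(word))


# Hypothetical toy training data (English, not Afaan Oromoo).
training = [("walked", "OOOOBO"), ("jumped", "OOOOBO"), ("walks", "OOOOB")]
model = IB1Sketch(k=1)
for word, labels in training:
    model.fit(make_windows(word, labels))

print(analyze(model, "talked"))
```

Because all training instances stay in memory, an unseen word is analyzed by comparing each of its windows with the stored windows, so a single stored example of a rare pattern can still determine a prediction.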

Keywords

Morphological Analysis, Memory Based Learning, Afaan Oromoo
