Effect of morphological information in Afaan Oromo word sequence prediction

Date

2017-06

Publisher

A.A.U

Abstract

This study designs a word sequence prediction model for Afaan Oromo in order to explore the effect of morphological information on word sequence prediction. Word prediction is a natural language processing problem that attempts to predict the most appropriate word in a given context; it uses a language model to guess the next word from the context in which words have previously appeared in text. Although Afaan Oromo is spoken by a large population, no noteworthy work has been done on word sequence prediction for the language. In this study, therefore, a word sequence prediction model for Afaan Oromo was developed using statistical methods and morphological features. The model predicts the most likely word based on statistical and morphological information about the preceding words. The N-gram method was used to construct bigram and trigram language models from sequences of stem forms. In addition, morphological properties of Afaan Oromo verbs and nouns were extracted with the HornMorph morphological analyzer to build a language model from stem forms together with morphological features such as tense, case, number, gender, and person. The model suggests the next word to be typed by a user in three phases. First, the most probable stem forms are predicted using the language model. Second, morphological features are predicted for the proposed stem forms. Finally, a morphological synthesizer uses the proposed root or stem and the morphological features to generate the appropriate surface words. To evaluate the performance of the model and to demonstrate how morphological features affect the accuracy of word prediction, the developed model was compared with a model built without morphological features.
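The three phases can be illustrated with a minimal Python sketch. This is not the thesis implementation: the corpus, the stem labels ("A", "B", ...), the feature tags, and the function names are all hypothetical placeholders standing in for analyzer output. Two simplifications are assumed: the feature prediction step just picks the most frequent tag seen with a stem, and the synthesizer is a stub that joins stem and tag instead of generating a real surface form.

```python
from collections import Counter, defaultdict

# Toy corpus of pre-analyzed sentences: each token is (stem, feature_tag).
# All values are placeholders, not real Afaan Oromo data.
corpus = [
    [("A", "sg"), ("B", "past"), ("C", "sg")],
    [("A", "pl"), ("B", "past"), ("C", "pl")],
    [("A", "sg"), ("B", "past"), ("D", "sg")],
    [("A", "sg"), ("B", "past"), ("C", "sg")],
]


def train(sentences):
    """Count stem trigrams, stem bigrams, and per-stem feature tags."""
    trigrams = defaultdict(Counter)  # (stem1, stem2) -> counts of next stem
    bigrams = defaultdict(Counter)   # stem2 -> counts of next stem
    features = defaultdict(Counter)  # stem -> counts of feature tags
    for sent in sentences:
        stems = ["<s>", "<s>"] + [stem for stem, _ in sent]
        for i in range(2, len(stems)):
            trigrams[(stems[i - 2], stems[i - 1])][stems[i]] += 1
            bigrams[stems[i - 1]][stems[i]] += 1
        for stem, feat in sent:
            features[stem][feat] += 1
    return trigrams, bigrams, features


def predict_stems(trigrams, bigrams, history, k=3):
    """Phase 1: top-k next stems, backing off from trigram to bigram."""
    padded = ["<s>", "<s>"] + list(history)
    context = tuple(padded[-2:])
    counts = trigrams.get(context) or bigrams.get(context[-1]) or Counter()
    return [stem for stem, _ in counts.most_common(k)]


def predict_features(features, stem):
    """Phase 2 (simplified): most frequent feature tag seen with the stem."""
    counts = features.get(stem)
    return counts.most_common(1)[0][0] if counts else None


def synthesize(stem, feat):
    """Phase 3 (stub): a real synthesizer would generate a surface word."""
    return f"{stem}+{feat}"


def suggest(model, history, k=3):
    """Run the three phases and return candidate next words."""
    trigrams, bigrams, features = model
    stems = predict_stems(trigrams, bigrams, history, k)
    return [synthesize(s, predict_features(features, s)) for s in stems]


model = train(corpus)
print(suggest(model, ["A", "B"]))
```

Predicting over stems first keeps the n-gram tables small for a morphologically rich language, since many surface forms share one stem; the feature and synthesis steps then restore the inflection.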
An experiment was conducted using keystroke savings (KSS) as the evaluation metric, and the results indicated that better KSS is achieved by the model constructed from N-grams and morphological information. Based on the results of this study, directions for further research are recommended.
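Keystroke savings is commonly defined as the percentage of keystrokes a user avoids by accepting predictions instead of typing every character. The sketch below assumes that common definition; the abstract does not state the exact formula used in the thesis, and the function name and arguments are hypothetical.

```python
def keystroke_savings(total_chars, keystrokes_typed):
    """Percentage of keystrokes saved relative to typing every character.

    total_chars: characters in the test text (keystrokes without prediction).
    keystrokes_typed: keystrokes actually needed with prediction enabled,
        including any keystrokes spent selecting a suggestion.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    return 100.0 * (total_chars - keystrokes_typed) / total_chars


# Example: a 100-character text entered with 70 keystrokes.
print(keystroke_savings(100, 70))
```

Under this definition a higher value means the predictor saved more typing, so two models can be compared by running both on the same test text.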

Description

A Thesis Submitted to the School of Graduate Studies of Addis Ababa University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Information Science

Keywords

Afaan Oromo, Morphology
