Word Sequence Prediction for Afaan Oromo

dc.contributor.advisor: Assabie, Yaregal (PhD)
dc.contributor.author: Bekele, Ashenafi
dc.date.accessioned: 2019-11-26T07:34:46Z
dc.date.accessioned: 2023-11-04T12:22:46Z
dc.date.available: 2019-11-26T07:34:46Z
dc.date.available: 2023-11-04T12:22:46Z
dc.date.issued: 3/3/2018
dc.description.abstract: Data entry is a core aspect of human-computer interaction. Text prediction is one of the data entry techniques for computers and other handheld electronic devices. It is the process of guessing the words that are likely to follow a given text segment and displaying a list of the most probable words that could appear in that position. Word sequence prediction assists physically disabled individuals who have typing difficulties, increases typing speed by reducing keystrokes, helps with spelling and error detection, and also supports speech recognition and handwriting recognition. Although Afaan Oromo is one of the major languages widely spoken and written in Ethiopia, no research has been conducted on word sequence prediction for it. As a result, its users do not enjoy the core benefits of word sequence prediction. In this study, a word sequence prediction model is designed and developed. We used the bi- and tri-word statistics and the bi- and tri-POS-tag statistics of the language. Initially, the training corpus and user inputs are tokenized and then morphologically analyzed. Subsequently, using the training corpus, a word statistics model is built for root or stem words, and a POS tag statistics model is built for roots or stems tagged as noun, verb, adjective, pronoun, adverb, conjunction, and so on. After that, the most probable root or stem words are suggested. Finally, lexical words are synthesized from the proposed root or stem words. The designed model is evaluated using the developed prototype. Keystroke Saving (KSS) is used to evaluate the system's performance. According to the evaluation, the primary word-based statistical system achieved 20.5% KSS, and the second system, which combined syntactic categories with word statistics, achieved 22.5% KSS. Therefore, statistical and linguistic rules have good potential for word sequence prediction for Afaan Oromo. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/20253
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Word Prediction [en_US]
dc.subject: Statistical Language Modeling [en_US]
dc.subject: POS Tagging [en_US]
dc.subject: Keystroke Saving [en_US]
dc.title: Word Sequence Prediction for Afaan Oromo [en_US]
dc.type: Thesis [en_US]
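
The abstract describes suggesting the most probable next word from bi- and tri-word statistics and scoring the result with Keystroke Saving. The following is a minimal sketch of that general idea only, not the thesis implementation: it skips the tokenization, morphological analysis, POS-tag model, and word synthesis steps, and it assumes the common definition of KSS as the percentage of keystrokes avoided. The function names (train, suggest, keystroke_saving) and the placeholder tokens are hypothetical.

```python
from collections import Counter, defaultdict


def train(tokens):
    """Count which token follows each single token and each token pair."""
    bi = defaultdict(Counter)   # previous token -> Counter of next tokens
    tri = defaultdict(Counter)  # (two previous tokens) -> Counter of next tokens
    for i in range(len(tokens) - 1):
        bi[tokens[i]][tokens[i + 1]] += 1
    for i in range(len(tokens) - 2):
        tri[(tokens[i], tokens[i + 1])][tokens[i + 2]] += 1
    return bi, tri


def suggest(bi, tri, history, k=3):
    """Return up to k candidates: trigram matches first, then bigram matches."""
    ranked = []
    if len(history) >= 2:
        counts = tri.get(tuple(history[-2:]), Counter())
        ranked += [w for w, _ in counts.most_common()]
    if history:
        counts = bi.get(history[-1], Counter())
        ranked += [w for w, _ in counts.most_common() if w not in ranked]
    return ranked[:k]


def keystroke_saving(keys_without, keys_with):
    """Assumed KSS definition: percentage of keystrokes saved by prediction."""
    return 100 * (keys_without - keys_with) / keys_without


# Placeholder tokens stand in for root or stem words from a training corpus.
corpus = "a b c a b d a b c".split()
bi, tri = train(corpus)
top = suggest(bi, tri, ["a", "b"])
```

In this sketch, trigram evidence is simply ranked ahead of bigram evidence; a real system would more likely combine the two (and the POS-tag statistics) with smoothing or interpolation.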

Files

Original bundle
Name: Ashenafi Bekele 2018.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format