Word Sequence Prediction for Amharic Language

Tensou, Tigist

dc.contributor.advisor	Assabie, Yaregal (PhD)
dc.contributor.author	Tensou, Tigist
dc.date.accessioned	2018-06-26T05:28:06Z
dc.date.accessioned	2023-11-04T12:23:55Z
dc.date.available	2018-06-26T05:28:06Z
dc.date.available	2023-11-04T12:23:55Z
dc.date.issued	2014-10
dc.description.abstract	The significance of computers and handheld devices is undeniable in the modern world. Text is entered into these devices using word processing programs as well as other techniques. Text prediction is one of the techniques that facilitates data entry into computers and other devices. Predicting the words a user intends to type based on context information is the task of word sequence prediction, and it is the main focus of this study. Word prediction can serve as a stepping stone for further research as well as support various linguistic applications such as handwriting recognition, mobile phone or PDA texting, and assisting people with disabilities. Even though Amharic is spoken by a large population, no significant work has been done on word sequence prediction. In this study, an Amharic word sequence prediction model is developed using statistical methods and linguistic rules. Statistical models are constructed from the training corpus for roots or stems and for morphological properties of words such as aspect, voice, tense, and affixes. In addition, morphological features such as gender, number, and person are captured from the user's input to ensure grammatical agreement among words. Initially, root or stem words are suggested using the root or stem statistical models. Then, morphological features for the suggested root or stem words are predicted using statistical information on voice, tense, aspect, and affixes together with the grammatical agreement rules of the language. Predicting morphological features is essential in Amharic because of its high morphological complexity; this step is not required in less inflected languages, where all word forms can be stored in a dictionary. Finally, surface words are generated from the proposed root or stem words and morphological features. The model is evaluated using a developed prototype, with keystroke savings (KSS) as the metric.
According to our experiment, prediction using a hybrid of the bi-gram and tri-gram models achieves a higher KSS than either the bi-gram or the tri-gram model alone. Therefore, combining statistical methods and linguistic rules shows good potential for word sequence prediction in Amharic. Keywords: Hornmorph, Keystroke Saving, Natural Language Processing, Parts-of-Speech, Word Prediction	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/3382
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Hornmorph	en_US
dc.subject	Keystroke Saving	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Parts-Of-Speech	en_US
dc.subject	Word Prediction	en_US
dc.title	Word Sequence Prediction for Amharic Language	en_US
dc.type	Thesis	en_US
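
The abstract names keystroke savings (KSS) as the evaluation metric but does not give its formula. The sketch below uses the definition common in the word prediction literature, KSS = (total characters - keystrokes typed) / total characters x 100, which may differ in detail from the one used in the thesis. The names `count_keystrokes`, `keystroke_savings`, and `toy_predict`, the one-keystroke cost of selecting a suggestion, and the counting of one space per word are all assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical predictor signature: (previous words, typed prefix) -> suggestions.
Predictor = Callable[[List[str], str], List[str]]


def count_keystrokes(words: List[str], predict: Predictor) -> int:
    """Simulate a user typing `words` with help from `predict`.

    Assumption: the user types letters until the target word appears among
    the suggestions, then spends one keystroke to select it (selection also
    inserts the trailing space). Without a hit, the user types the whole
    word plus a space.
    """
    typed = 0
    for i, word in enumerate(words):
        context = words[:i]
        for n in range(len(word) + 1):
            if word in predict(context, word[:n]):
                typed += n + 1  # n letters plus one selection keystroke
                break
        else:
            typed += len(word) + 1  # every letter plus a space
    return typed


def keystroke_savings(words: List[str], predict: Predictor) -> float:
    """Return KSS as a percentage of keystrokes saved."""
    total = sum(len(w) + 1 for w in words)  # each word plus a trailing space
    if total == 0:
        raise ValueError("cannot compute KSS for empty text")
    return (total - count_keystrokes(words, predict)) / total * 100.0


# Toy predictor for illustration only: prefix match against a tiny vocabulary.
VOCAB = ["hello", "world"]


def toy_predict(context: List[str], prefix: str) -> List[str]:
    return [w for w in VOCAB if w.startswith(prefix)][:3]
```

A predictor that never suggests the target word yields a KSS of 0, and a higher KSS means fewer keystrokes were needed. The same functions work unchanged on Amharic text, since Python strings are Unicode.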

Files

Original bundle

Name: Tigist Tensou.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format

Collections

Computer Science