Word Sequence Prediction for Amharic

dc.contributor.advisor: Abebe Ermias (Ato)
dc.contributor.author: Kifle Nuniyat
dc.date.accessioned: 2019-05-10T06:34:33Z
dc.date.accessioned: 2023-11-18T12:44:21Z
dc.date.available: 2019-05-10T06:34:33Z
dc.date.available: 2023-11-18T12:44:21Z
dc.date.issued: 2011-02-04
dc.description.abstract: Word prediction is a popular machine learning task that consists of predicting the next word in a sequence of words. The literature shows that word sequence prediction can play a significant role in real-life applications, including electronic data entry. Word prediction deals with guessing which word comes next based on the information currently available, and it is the main focus of this study. Even though Amharic is spoken by a large population, little work has been done on word sequence prediction for it. Previous work on word prediction shows that statistical methods alone are not sufficient for highly inflected languages, which also require syntactic information. In this study, we developed an Amharic word sequence prediction model following the design science research methodology, using statistical methods based on a Hidden Markov Model. We used around 138,000 phrases to train the model, incorporating detailed parts of speech. The experiments were run with bigram and trigram models on window sizes of two, five, and seven. We explained the efficacy of part-of-speech tags in Amharic word sequence prediction. The developed model was evaluated using keystroke savings (KSS) as the metric. According to our experiments, the bigram model with detailed part-of-speech tags achieved a higher KSS and performed slightly better than the models without part-of-speech tags. Therefore, a statistical approach with detailed POS tags and a window size of five has good potential for word sequence prediction in Amharic. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/12345678/18226
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Word Sequence Prediction [en_US]
dc.subject: Parts of Speech [en_US]
dc.subject: N-Gram [en_US]
dc.title: Word Sequence Prediction for Amharic [en_US]
dc.type: Thesis [en_US]
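
The abstract refers to bigram prediction, a prediction window, and keystroke savings (KSS) without defining them. The following is a minimal sketch of how those pieces could fit together, not the thesis's implementation: it uses plain bigram counts rather than a Hidden Markov Model, ignores part-of-speech tags, and assumes that selecting a suggestion costs one keystroke. The function names (train_bigram, predict, keystroke_savings) and the toy English corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            follow[prev][nxt] += 1
    return follow


def predict(follow, prev, prefix="", window=5):
    """Return up to `window` followers of `prev` that start with `prefix`,
    most frequent first (ties broken alphabetically)."""
    counts = follow.get(prev, Counter())
    candidates = [(w, c) for w, c in counts.items() if w.startswith(prefix)]
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in candidates[:window]]


def keystroke_savings(follow, sentence, window=5):
    """KSS in percent: (characters in text - keystrokes needed) / characters in text.

    Assumption: picking a word from the suggestion window costs one keystroke,
    and the first word of the sentence is always typed in full.
    """
    tokens = sentence.split()
    total_chars = sum(len(t) for t in tokens)
    typed = 0
    prev = None
    for word in tokens:
        keys = len(word)  # fallback: type every letter
        if prev is not None:
            for n in range(len(word)):
                # After typing n letters, is the word among the suggestions?
                if word in predict(follow, prev, word[:n], window):
                    keys = n + 1  # n letters plus one selection keystroke
                    break
        typed += keys
        prev = word
    return 100.0 * (total_chars - typed) / total_chars


if __name__ == "__main__":
    corpus = ["the cat sat", "the cat ran", "the dog sat"]  # toy placeholder data
    model = train_bigram(corpus)
    print(predict(model, "the"))
    print(keystroke_savings(model, "the cat sat", window=5))
```

In this sketch a larger window lets the intended word appear among the suggestions after fewer typed letters, which is why the window size affects KSS.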

Files

Original bundle
Name: Nuniyat Kifle 2011.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format