Morphological Analysis of Ge’ez Verbs Using Memory Based Learning

dc.contributor.advisor: Assabie, Yaregal (PhD)
dc.contributor.author: Abate, Yitayal
dc.date.accessioned: 2018-06-26T07:44:42Z
dc.date.accessioned: 2023-11-29T04:05:53Z
dc.date.available: 2018-06-26T07:44:42Z
dc.date.available: 2023-11-29T04:05:53Z
dc.date.issued: 2014-11-07
dc.description.abstract: Ge’ez is the classical language of Ethiopia and is still used as the liturgical language of the Ethiopian Orthodox Tewahedo Church (EOTC). Much ancient literature, including religious texts and secular writings, was written in Ge’ez, and the ancient philosophy, tradition, history and knowledge of Ethiopia were recorded in it. Automatic analysis of these documents requires Ge’ez morphological analysis. A morphological analyzer is one of the most important basic tools in the automatic processing of any human language: it analyzes the naturally occurring word forms in a sentence and identifies the root word and its features. In this study, we used memory-based learning (MBL) to automatically analyze the morphology of Ge’ez verbs.

The system has two components: training and analysis. In the training phase, we defined an annotation scheme for our dataset using a character-based representation of features. The annotated data are then converted into fixed-length instance vectors using a windowing method. Next, the instances are passed to the memory-based learning tool (TiMBL), and finally the learning model is built. The analysis phase builds instances by extracting features from the given text in the same structure as the training instances, so that the two can be compared. The extracted features are then passed to the morpheme identification process, where they are compared with the individual instances in memory and stems are extracted together with their morpheme functions. Finally, the roots are extracted from the stems.

The system was developed in Python, using TiMBL’s IB2 and TRIBL2 algorithms. Its performance was evaluated using 10-fold cross-validation, with both the default and optimized parameter settings. The overall accuracy with optimized parameters was 93.24% using IB2 and 92.31% using TRIBL2. The overall precision, recall and F-score with optimized parameters using IB2 were 55.6%, 56.3% and 59.95%, respectively; using TRIBL2 they were 58.8%, 60.3% and 59.54%, respectively. A learning curve was also drawn, which showed that accuracy on unseen data improves as the amount of training data increases. Overall, the IB2 algorithm shows better results than the TRIBL2 algorithm for Ge’ez verb morphology.

Keywords: Ge’ez Verbs, Ge’ez Morphology, Morphological Analyzer, Memory Based Learning, Character Based Analysis, Cross Validation, Feature Extraction [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/3557
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Ge’ez Verbs [en_US]
dc.subject: Ge’ez Morphology [en_US]
dc.subject: Morphological Analyzer [en_US]
dc.subject: Memory Based Learning [en_US]
dc.subject: Character Based Analysis [en_US]
dc.subject: Cross Validation [en_US]
dc.subject: Feature Extraction [en_US]
dc.title: Morphological Analysis of Ge’ez Verbs Using Memory Based Learning [en_US]
dc.type: Thesis [en_US]
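
The windowing and memory-lookup steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation and does not call TiMBL: it stands in a plain 1-nearest-neighbour lookup with an overlap (mismatch-count) distance for TiMBL's IB2 and TRIBL2. The window size, the padding symbol, the two-label scheme (S = stem character, X = affix character) and the transliterated toy words are all assumptions made for illustration only.

```python
from collections import Counter

PAD = "_"  # assumed padding symbol for positions outside the word


def make_instances(word, labels, left=2, right=2):
    """Turn one annotated word into fixed-length windowed instances.

    Each instance holds the focus character with `left` and `right`
    context characters; its class is the label of the focus character.
    """
    padded = PAD * left + word + PAD * right
    width = left + 1 + right
    return [(tuple(padded[i:i + width]), label)
            for i, label in enumerate(labels)]


def overlap_distance(a, b):
    """Count the feature positions where two instances differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(memory, features, k=1):
    """Majority label among the k stored instances nearest to `features`."""
    nearest = sorted(memory,
                     key=lambda inst: overlap_distance(inst[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def analyze(memory, word, left=2, right=2):
    """Predict one label per character of an unannotated word."""
    # Dummy labels are used only to reuse the same windowing code.
    windows = make_instances(word, "?" * len(word), left, right)
    return "".join(classify(memory, feats) for feats, _ in windows)


def extract_stem(word, labels):
    """Keep the characters whose predicted label is S (stem)."""
    return "".join(ch for ch, lab in zip(word, labels) if lab == "S")


# Toy "training" data: hypothetical annotations, not taken from the thesis.
memory = (make_instances("qatalku", "SSSSSXX")
          + make_instances("qatalka", "SSSSSXX"))

if __name__ == "__main__":
    predicted = analyze(memory, "nagarku")
    print(predicted, extract_stem("nagarku", predicted))
```

Because every training instance is simply stored and compared at analysis time, adding more annotated words only extends `memory`; no retraining step is involved in this sketch.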

Files

Original bundle
Name: Yitayal Abate.pdf
Size: 2.22 MB
Format: Adobe Portable Document Format