Designing A Stemmer For Ge’ez Text Using Rule Based Approach
Date
2010-07
Authors
Publisher
Addis Ababa University
Abstract
In this study, a stemmer for Ge’ez text was developed. The design process covered the
background of the thesis, the literature on conflation and stemming algorithms, the
morphological nature of the Ge’ez language, stemming techniques and other related topics,
in order to model and develop an automatic procedure for conflation.
A review of the inflectional and derivational morphology of the language shows that
affixation (prefixing, infixing and suffixing) is the main word formation process in Ge’ez.
The language is morphologically complex because many different word forms can be produced
by extensive concatenation of affixes.
For the experiment, two techniques were used: affix removal and morphological analysis.
The stemmer was evaluated by counting errors manually.
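As an illustration of the affix removal technique named above, the following is a minimal sketch of longest-match prefix and suffix stripping. It is not the thesis's rule set: the affix lists are English placeholders, and the minimum stem length and the function name `strip_affixes` are assumptions introduced only for this example. Infixes and the morphological analysis step are not covered.

```python
# Hypothetical placeholder affixes (NOT the Ge'ez affixes used in the thesis).
PREFIXES = ["un", "re"]
SUFFIXES = ["ing", "ed", "s"]
MIN_STEM_LEN = 3  # assumed guard so that very short words are left untouched


def strip_affixes(word, prefixes=PREFIXES, suffixes=SUFFIXES, min_len=MIN_STEM_LEN):
    """Remove at most one prefix and one suffix, trying longer affixes first."""
    stem = word
    # Prefix pass: strip the first (longest) matching prefix, if enough remains.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if stem.startswith(prefix) and len(stem) - len(prefix) >= min_len:
            stem = stem[len(prefix):]
            break
    # Suffix pass: same idea, applied to the end of the word.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= min_len:
            stem = stem[:-len(suffix)]
            break
    return stem


print(strip_affixes("replayed"))  # play
print(strip_affixes("red"))       # red (too short to strip)
```

The minimum-length guard is one simple way to limit over-stemming; a real rule set for Ge’ez would need language-specific conditions on each rule.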
From the experiment, three types of errors were observed: over-stemming (6%), under-stemming
(4.27%) and structural problems (7.31%). When run on the sample texts, the stemmer
achieved an accuracy of 82.42%.
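The reported figures are consistent with accuracy being what remains after the three error rates are subtracted from 100%, assuming the error categories do not overlap. The short check below works through that arithmetic; the variable names are illustrative only.

```python
from decimal import Decimal

# Error rates as reported in the abstract, in percent.
error_rates = {
    "over_stemmed": Decimal("6"),
    "under_stemmed": Decimal("4.27"),
    "structural": Decimal("7.31"),
}

total_error = sum(error_rates.values())   # combined error rate
accuracy = Decimal("100") - total_error   # remainder, assuming disjoint categories

print(total_error)  # 17.58
print(accuracy)     # 82.42
```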
The stemmer reduced the dictionary size by 29.9% when words were conflated to stems and by
62.8% when they were conflated to root words.
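The abstract does not state how dictionary reduction was computed. A common definition, assumed here, is the percentage drop in the number of distinct entries after conflation. The sketch below uses a made-up English vocabulary, not data from the thesis, and `dictionary_reduction` is a hypothetical helper name.

```python
def dictionary_reduction(words, conflate):
    """Percentage reduction in distinct entries after applying `conflate` to each word."""
    distinct_words = set(words)
    if not distinct_words:
        raise ValueError("need at least one word")
    distinct_classes = {conflate(word) for word in distinct_words}
    removed = len(distinct_words) - len(distinct_classes)
    return 100.0 * removed / len(distinct_words)


# Toy word-to-stem mapping (illustrative placeholders only).
toy_stems = {
    "walk": "walk", "walks": "walk", "walked": "walk",
    "talk": "talk", "talks": "talk",
}

reduction = dictionary_reduction(toy_stems.keys(), toy_stems.get)
print(reduction)  # 60.0 (5 distinct words collapse to 2 stems)
```

Under this definition, conflating to roots gives a larger reduction than conflating to stems whenever several stems share one root, which matches the direction of the reported figures.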
Lastly, recommendations for future work and for improving the stemmer are given.
Keywords
Ge’ez Text Using Rule Based