Information System
Browsing Information System by Subject "Amharic Morphology"
Morphological Segmentation for Amharic Verb Class Using Recurrent Neural Network (RNN)
(Addis Ababa University, 2019-09) Wondimagegnhue Tsegaye; Wondwossen Mulugeta (PhD)
Because higher-level NLP tasks depend on morphological analysis, the lack of an appropriate morphological analysis tool is a major bottleneck for research on high-level NLP applications such as machine translation, speech processing, and text summarization. Currently, most research on morphological analysis for morphologically rich languages (MRLs) like Amharic is based either on techniques that require heavy supervision or on rule-based techniques that require the rules of the language to be enumerated in detail and crafted manually. Both approaches require high-quality data that captures the rules of the language, and they also require a significant quantity of training data to generalize well. These requirements are hard to meet because a significant number of MRLs, including Amharic, are under-resourced. The lack of training data in both quality and quantity is a major obstacle for research on low-resource, morphologically rich languages. The low-resource state and morphological complexity of the language demand techniques that can learn well from a relatively small number of examples and still capture the complexity of the language. In this paper, we propose an RNN-based sequence-to-sequence model, built on the state-of-the-art encoder-decoder architecture, that shows encouraging performance in learning complex segmentation from a small number of examples and with no linguistic annotation. We approach morphological segmentation as a transformation task: the surface word is the input, and segmentation is the transformation process that produces a list of segmented morphemes. We prepared the training data by selecting different classes of verbs.
We explored different encoder units in terms of directionality, window size, encoder type and size, and different data representation paradigms. The experiments showed that our model can learn complex segmentation with no linguistic annotation and a limited number of examples. The model achieved 74.2% segmentation accuracy and 98.7% morpheme boundary accuracy. We have shown that it is possible to learn morphological segmentation without relying on linguistic annotation. These results contribute towards a general solution that can work on languages other than Amharic. Our work can be extended to include other common POS classes such as nouns and adjectives. Extending the work to include morphological analysis would make it more relevant for higher-level NLP applications such as machine translation, speech recognition, and spell checking. Key words: Recurrent Neural Network, Amharic Morphology, Encoder-Decoder Architecture.
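The abstract reports two figures, segmentation accuracy and morpheme boundary accuracy, without defining them. The sketch below shows one plausible way such metrics could be computed, and why the boundary figure can sit well above the whole-word figure. The definitions, the `-` separator, the function names, and the English toy words are all assumptions made for illustration; they are not taken from the thesis and may differ from its actual evaluation.

```python
from typing import List, Set

SEP = "-"  # hypothetical morpheme boundary marker (an assumption, not from the thesis)


def boundary_positions(segmented: str) -> Set[int]:
    """Offsets in the unsegmented word after which a boundary marker falls."""
    positions: Set[int] = set()
    offset = 0
    for ch in segmented:
        if ch == SEP:
            positions.add(offset)
        else:
            offset += 1
    return positions


def segmentation_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of words whose whole predicted segmentation matches the gold one."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def boundary_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of between-character slots labelled correctly (boundary or not)."""
    correct = 0
    total = 0
    for g, p in zip(gold, pred):
        n_chars = len(g.replace(SEP, ""))
        g_bounds = boundary_positions(g)
        p_bounds = boundary_positions(p)
        # Slots 1..n_chars-1 are the gaps between adjacent characters.
        for slot in range(1, n_chars):
            total += 1
            correct += (slot in g_bounds) == (slot in p_bounds)
    return correct / total


# Toy English data, for illustration only.
gold = ["un-happi-ness", "re-play"]
pred = ["un-happiness", "re-play"]

print(segmentation_accuracy(gold, pred))  # 1 of 2 words fully correct
print(boundary_accuracy(gold, pred))      # 14 of 15 slots correct
```

Under these assumed definitions, a single missed boundary makes the whole word count as wrong while leaving nearly all slot decisions right, which is consistent with the boundary figure being much higher than the segmentation figure.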