Amharic Sentence Generation from Interlingua Representation

dc.contributor.advisor: Assabie, Yaregal (PhD)
dc.contributor.author: Yitbarek, Kibrewossen
dc.date.accessioned: 2022-03-23T08:51:36Z
dc.date.accessioned: 2023-11-29T04:06:40Z
dc.date.available: 2022-03-23T08:51:36Z
dc.date.available: 2023-11-29T04:06:40Z
dc.date.issued: 2016-12-27
dc.description.abstract: Sentence generation is a part of Natural Language Generation (NLG), the process of deliberately constructing natural language text in order to meet specified communicative goals. The major requirement of sentence generation in a natural language is to produce a complete, clear, meaningful, and grammatically correct sentence. A sentence can be generated from different possible sources, including an Interlingua, a representation that does not depend on any human language. Generating a sentence from an Interlingua representation has numerous advantages. Since an Interlingua representation is unambiguous, universal, and independent of both the source language and the target language, the generation should be target language-specific, and likewise the analysis. Among the different Interlinguas, Universal Networking Language (UNL) is commonly chosen because of its various advantages over the others. Various works have generated sentences from UNL expressions for different languages of the world, but to the best of our knowledge no such work has been done for the Amharic language. In this thesis, we present an Amharic sentence generator that automatically generates an Amharic sentence from a given input UNL expression. The generator accepts a UNL expression as input and parses it to build a node-net. The parsed UNL expressions are stored in a data structure that can be easily modified in the successive processes. A UNL-to-Amharic word dictionary is also prepared; it contains the root forms of Amharic words. The Amharic equivalent root word and the attributes of each node in a parsed UNL expression are fetched from the dictionary to update the head word and attributes of the corresponding node. Then, the translated Amharic root words are locally reordered and marked based on Amharic grammar rules.
When the nodes are ready for morphology generation, the proposed system uses Amharic morphology data sets to handle the generation of noun, adjective, pronoun, and verb morphology. Finally, function words are inserted among the morphed words so that the output matches a natural language sentence. The proposed system was evaluated on a dataset of 142 UNL expressions. Subjective tests such as adequacy and fluency tests were performed on the proposed system. Moreover, a quantitative test, or error analysis, was performed by calculating the Word Error Rate (WER). From this analysis, it was observed that the proposed system generates 71.4% sentences that are intelligible and 67.8% sentences that are faithful to the original UNL expression. Correspondingly, the system achieved a fluency score of 3.0 (on a 4-point scale) and an adequacy score of 2.9 (on a 4-point scale). Furthermore, the proposed system has a word error rate of 28.94%. These scores can be improved further by improving the rule base and lexicon. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/30779
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Natural Language Generation [en_US]
dc.subject: Interlingua [en_US]
dc.subject: Universal Network Language [en_US]
dc.subject: Universal Word [en_US]
dc.subject: Head Word [en_US]
dc.subject: Attribute [en_US]
dc.subject: Local Reordering [en_US]
dc.subject: Morphology Generation [en_US]
dc.subject: Fluency [en_US]
dc.subject: Adequacy [en_US]
dc.title: Amharic Sentence Generation from Interlingua Representation [en_US]
dc.type: Thesis [en_US]
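
The abstract reports a Word Error Rate (WER) of 28.94% but this record does not say how it was computed. As an illustration only, the sketch below assumes the common definition: the word-level edit distance (substitutions, deletions, insertions) between a generated sentence and a reference sentence, divided by the number of reference words. The function name word_error_rate, whitespace tokenization, and the English placeholder sentences are assumptions for this example, not details taken from the thesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization; a real Amharic evaluation may
    need a different tokenizer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder sentences: one reference word is missing from the output.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A corpus-level figure such as the one in the abstract is usually obtained by summing the edit distances over all test sentences and dividing by the total number of reference words, although averaging per-sentence rates is also possible; the record does not state which was used.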

Files

Original bundle
Name: Kibrewossen Yitbarek 2016.pdf
Size: 1.86 MB
Format: Adobe Portable Document Format