Automatic sentence Parsing for Amharic Text an experiment using probabilistic Context free grammar

Alemu, Atelach

dc.contributor.advisor	Biru, Tesfaye (PhD)
dc.contributor.author	Alemu, Atelach
dc.date.accessioned	2020-05-27T09:54:47Z
dc.date.accessioned	2023-11-18T12:44:50Z
dc.date.available	2020-05-27T09:54:47Z
dc.date.available	2023-11-18T12:44:50Z
dc.date.issued	2002-07
dc.description.abstract	Natural Language Processing, as a field of scientific inquiry, plays an important role in increasing computers' capability to understand natural languages, the languages in which most human knowledge is recorded. Work in the area of Natural Language Processing tries to design and implement computer programs that can understand natural language and act appropriately on the information contained in the text or utterance. Enabling computers to understand natural language involves extracting meaning from natural language sentences, and one of the steps in this process is sentence parsing. Sentence parsing, also called syntactic parsing, is the process of identifying how words can be put together to form correct sentences, determining what structural role each word plays in the sentence, and determining which phrases are subparts of which other phrases. A sentence parser outputs a parse structure that can be used as a component in many applications, including semantic analysis, machine translation, and information storage and retrieval of textual data. Today, parsers of different kinds (e.g. probabilistic, rule-based) have been developed for languages that have relatively wide use nationally and/or internationally (e.g. English, German, Chinese). The same is not true for Amharic, the working language of the Federal Government of Ethiopia and one of the major languages of Ethiopia (Bender et al., 1976), since, to the best of my knowledge, there are no sentence parsers of any sort that process this language. This study thus attempted to develop a simple automatic parser for Amharic sentences, to address the need for systems that automatically process the Amharic language. The study used the Inside-Outside algorithm with a bottom-up chart parsing strategy. A probabilistic context-free grammar was used as the grammatical formalism to represent the phrase structure rules of the language. A small sample corpus was selected from sentences in the language and served as both training and test set. The sample was hand parsed and automatically tagged, and was then used as a corpus from which to extract the grammar rules and assign probabilities. The thesis, in short, describes processes of automatic sentence parsing using a combination of probabilistic and rule-based reasoning. It describes the whole process from manually parsing simple sentences to developing a prototype and conducting an experiment with it. The results obtained using the small manually parsed corpus seem to encourage further research, especially with the aim of developing a full-fledged Amharic sentence parser.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/12345678/21333
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Information Science	en_US
dc.title	Automatic sentence Parsing for Amharic Text an experiment using probabilistic Context free grammar	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Atelach Alemu 2.pdf
Size: 34.04 MB
Format: Adobe Portable Document Format

Collections

Information Sciences