Information Extraction from Amharic Language Text: Knowledge-Poor Approach

Worku, Bekele

dc.contributor.advisor	Assabie, Yaregal (PHD)
dc.contributor.author	Worku, Bekele
dc.date.accessioned	2018-06-14T12:20:14Z
dc.date.accessioned	2023-11-04T12:23:21Z
dc.date.available	2018-06-14T12:20:14Z
dc.date.available	2023-11-04T12:23:21Z
dc.date.issued	2015-06
dc.description.abstract	Over the last two decades, the rapid development of the Internet has led to a great amount of data being accumulated and stored on the Web. We are drowning in data at the office and at home, in both printed and electronic form, so finding the relevant information in this mass of data is critical. To this end, information extraction is a technology that creates a structured representation of unstructured text by extracting relevant entities from it, thereby making data analysis feasible. This work focuses on developing an information extraction system for Amharic language text. The proposed system was developed in the GATE (General Architecture for Text Engineering) text processing environment using a knowledge-poor approach on the infrastructure domain. By knowledge-poor approach, we mean that we use simple rules and gazetteer lists for entity identification. Our proposed Amharic text information extractor consists of three phases, namely preprocessing, extraction, and post-processing. The preprocessing phase handles language-specific issues and prepares the environment for the extraction process. The extraction phase is the main unit of our model: it performs named entity recognition, coreference resolution, and relation extraction, and extracts the relevant text. The post-processing phase annotates the selected data and presents the extracted information in a structured form. The performance of the proposed model was evaluated using the standard precision, recall, and F-measure metrics. We used 24760 instances for training and testing our model. The evaluation was conducted on the named entity recognition component separately and on the overall system as an information extraction component. The system achieves an F-measure of 89.1% on named entity recognition and an overall F-measure of 89.8%.
Key words: Information Extraction, Amharic Text Information Extraction, Coreference Resolution, Relation Extraction, GATE	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/944
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Information Extraction; Amharic Text Information Extraction; Coreference Resolution; Relation Extraction; GATE	en_US
dc.title	Information Extraction from Amharic Language Text: Knowledge-Poor Approach	en_US
dc.type	Thesis	en_US
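
The abstract describes entity identification with gazetteer lists and evaluation by precision, recall, and F-measure. The following is a minimal Python sketch of those two ideas only, not the thesis's GATE implementation. The gazetteer entries, the sample sentence, and the function names `gazetteer_lookup` and `precision_recall_f1` are hypothetical, and English transliterations stand in for Amharic text.

```python
from typing import Dict, Set, Tuple

Entity = Tuple[str, str]  # (surface text, entity type)

# Hypothetical gazetteer: entity type -> known surface forms (illustrative only).
GAZETTEER: Dict[str, Set[str]] = {
    "LOCATION": {"Addis Ababa", "Adama"},
    "ORGANIZATION": {"Ethiopian Roads Authority"},
}


def gazetteer_lookup(text: str, gazetteer: Dict[str, Set[str]]) -> Set[Entity]:
    """Return every gazetteer entry that occurs verbatim in the text."""
    found: Set[Entity] = set()
    for entity_type, names in gazetteer.items():
        for name in names:
            if name in text:
                found.add((name, entity_type))
    return found


def precision_recall_f1(
    predicted: Set[Entity], gold: Set[Entity]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and balanced F-measure over entity sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sentence and gold annotations, used only to exercise the sketch.
SAMPLE_TEXT = (
    "The Ethiopian Roads Authority opened a road from Addis Ababa to Hawassa."
)
GOLD: Set[Entity] = {
    ("Ethiopian Roads Authority", "ORGANIZATION"),
    ("Addis Ababa", "LOCATION"),
    ("Hawassa", "LOCATION"),
}

PREDICTED = gazetteer_lookup(SAMPLE_TEXT, GAZETTEER)
PRECISION, RECALL, F1 = precision_recall_f1(PREDICTED, GOLD)
```

In this sketch "Hawassa" is missing from the gazetteer, so the lookup misses it and recall drops while precision stays high. This is the typical trade-off of a list-based approach: coverage depends on how complete the lists are.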

Files

Original bundle

Name: Bekele Worku.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format

Collections

Computer Science