Amharic DBpedia Extraction

Getahun, Biniyam

Files

Biniyam Getahun.pdf (21.31 MB)

Date

2015-03

Authors

Getahun, Biniyam

Publisher

Addis Ababa University

Abstract

A knowledge base is a technology for storing complex structured and unstructured data used by computers. Today, most knowledge bases cover only a particular domain and are created by a small group of knowledge engineers, because building a general-domain knowledge base that covers all domains is costly and time-consuming. Wikipedia has developed into one of the central knowledge sources for everyone and is maintained by a large number of contributors, but its structure makes it difficult to use directly as a knowledge source. The DBpedia project aims to extract the semi-structured information present in Wikipedia articles, interlink it with other knowledge bases, and publish it openly on the Web as RDF triples. So far, the DBpedia project has succeeded in creating one of the largest knowledge bases on the Web of Data, which is used in many applications and research prototypes. DBpedia extraction extracts structured data (RDF) from Wikipedia. This study describes the effort to extract an Amharic DBpedia. The extraction design is presented with the Amharic language taken into consideration. The tool used to extract the Amharic DBpedia is the I18n extraction framework. The results show that more than a quarter of a million Amharic RDF triples were extracted. In addition to this achievement, improving Amharic Wikipedia infoboxes could increase the quality of the extracted RDF triples. The results also show that extracting an Amharic DBpedia is feasible and that the language can become part of the internationalized DBpedia chapters. Although the study shows encouraging results, some work remains to be done to obtain a full Amharic DBpedia chapter. Abstract and homepage extraction must be included in order to have a full version of the Amharic DBpedia chapter. Live DBpedia extraction can be considered in future work, because it can capture dynamic knowledge from Wikipedia and deliver RDF triples instantly.
Building Amharic knowledge bases, including an Amharic DBpedia RDF store, helps facilitate access to and querying of structured data. Furthermore, the Amharic triple store can be a knowledge source for NLP tasks and web applications.

Keywords

DBpedia, I18n Extraction Framework, RDF, Semantic Web, Wikipedia

URI

http://etd.aau.edu.et/handle/123456789/25310

Collections

Linguistics
