Concept-Based Automatic Amharic Document Categorization

Sahlemariam, Meron

Files

Meron Sahlemariam.pdf (37.46 MB)

Date

2009-01

Authors

Sahlemariam, Meron

Publisher

Addis Ababa University

Abstract

As the volume of available information continues to grow, so does interest in better ways of finding, filtering, and organizing it. Automatic text categorization can play an important role in a wide variety of flexible, dynamic, and personalized information management tasks. The process involves calculating similarities between documents and categories using information extracted from the documents. In recent years, ontology-based document categorization has been introduced to address shortcomings of existing document classifiers. Previous work on keyword-based document categorization does not consider the semantic relationships between words. To address this problem, this study proposes a framework that automatically categorizes Amharic documents into predefined categories using knowledge represented in a News ontology. At the heart of the classification system is a knowledge base that enables the representation of different domain concepts. During classification, every document passes through pre-processing stages. Index terms are then extracted from the document and mapped onto their corresponding concepts in the ontology. Finally, the document is classified into a predefined category based on the weighted concepts. With the help of the News domain ontology, this study categorizes a given Amharic document into a specific predefined category. The study shows that using concepts for Amharic document categorization results in 92.9% accuracy, which is a promising outcome.
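The pipeline described in the abstract (pre-processing, index term extraction, mapping terms to ontology concepts, and classification by weighted concept) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy ontology, the English placeholder terms, the stop-word list, and the function names `preprocess`, `concept_weights`, and `classify` are all assumptions made for illustration, and simple term counts stand in for whatever concept weighting scheme the study actually uses.

```python
from collections import Counter

# Hypothetical toy ontology (assumption): index term -> concept -> category.
# The real News ontology and its Amharic terms are not given in this record.
TERM_TO_CONCEPT = {
    "goal": "football",
    "striker": "football",
    "marathon": "athletics",
    "inflation": "economic_indicator",
    "export": "trade",
}
CONCEPT_TO_CATEGORY = {
    "football": "Sport",
    "athletics": "Sport",
    "economic_indicator": "Economy",
    "trade": "Economy",
}
STOP_WORDS = {"the", "a", "of", "and", "in"}  # placeholder stop-word list


def preprocess(text):
    """Tokenize on whitespace, strip punctuation, lowercase, drop stop words."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def concept_weights(tokens):
    """Map index terms to concepts; weight each concept by its term count."""
    return Counter(TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT)


def classify(text):
    """Return the category with the highest summed concept weight, or None."""
    scores = Counter()
    for concept, weight in concept_weights(preprocess(text)).items():
        scores[CONCEPT_TO_CATEGORY[concept]] += weight
    if not scores:
        return None  # no term in the document matched the ontology
    return scores.most_common(1)[0][0]
```

Under these assumptions, a document mentioning "striker" and "goal" accumulates weight on the football concept and is assigned to the Sport category, even though neither word is the category name itself. That is the point of concept-based over keyword-based categorization.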

Keywords

Ontology, Keyword-based, Concept-based text categorization, Knowledge representation

URI

http://etd.aau.edu.et/handle/12345678/21533

Collections

Information Sciences
