Automatic Amharic News Text Summarizer (Extraction)

Date

2001-06

Publisher

Addis Ababa University

Abstract

The volume of textual information produced grows day by day, while human reading capacity is almost negligible by comparison. This gap makes it hard to communicate information effectively. Managing the output also becomes very difficult: sorting, searching, and categorizing are turning out to be cumbersome. The limited carrying capacity of communication channels also calls for a large reduction in size. This research focuses on developing a mechanism for shortening Amharic news texts and producing concise summaries of them. The system tries to pinpoint the most important sentences of the original text and extract them as a summary of the news item, so the extract is much shorter and easier to handle. The proposed summarizer utilizes several statistical techniques, location heuristics, and diagnostic units to determine the parts of the text to be extracted. Selected information retrieval and text mining techniques are adopted to build a model for the proposed system. Applying the system after adjusting the weights of its diagnostic units, using four Amharic news items in 124 different ways, reveals a promising result in automating the task of generating news summaries. Human-generated summaries are used for adjusting the weights and evaluating the system. Finally, 58% recall and 70.4% precision are attained. Based on this result, further work is recommended for future improvements of this system and studies in the area of automatic Amharic text summarization.
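The general idea of extraction and of scoring an extract against a human-generated summary can be illustrated with a minimal Python sketch. This is not the method of the thesis: the whitespace tokenization, the two features (word frequency and a lead-position bonus), the weights `w_freq` and `w_loc`, and all function names are assumptions chosen only for illustration.

```python
from collections import Counter


def score_sentences(sentences, w_freq=1.0, w_loc=1.0):
    """Score sentences by mean word frequency plus a lead-position bonus.

    The features and weights are hypothetical, not those of the thesis.
    """
    # Assumption: words are separated by whitespace.
    tokens = [s.split() for s in sentences]
    freq = Counter(word for sent in tokens for word in sent)
    n = len(sentences)
    scores = []
    for i, sent in enumerate(tokens):
        # Statistical feature: mean document frequency of the sentence's words.
        freq_score = sum(freq[w] for w in sent) / len(sent) if sent else 0.0
        # Location heuristic: earlier sentences receive a larger bonus.
        loc_score = (n - i) / n
        scores.append(w_freq * freq_score + w_loc * loc_score)
    return scores


def extract(sentences, k, **weights):
    """Return the indices of the k top-scoring sentences, in document order."""
    scores = score_sentences(sentences, **weights)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def precision_recall(system, reference):
    """Compare a system extract with a human extract as sets of sentence indices."""
    system, reference = set(system), set(reference)
    overlap = len(system & reference)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall
```

Under these assumptions, precision is the share of extracted sentences that also appear in the human summary, and recall is the share of the human summary's sentences that the system recovered. Adjusting `w_freq` and `w_loc` against human summaries is a simplified analogue of the weight adjustment the abstract describes.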

Keywords

Information Science
