The Application of OCR Techniques to the Amharic Script
Date
1997-05
Publisher
Addis Ababa University
Abstract
Nowadays, it is becoming increasingly important to have information available for
examination and manipulation in digital format, and Optical Character Recognition (OCR)
is being recognized as one of the valuable instruments in this respect. OCR systems take
optical images of handwritten or printed material and, by recognizing the characters that
make up the material, automatically convert its text into digital format for
further processing and manipulation, thereby bypassing the labour-intensive, error-prone,
and time-consuming process of keying.
While the use and application of OCR systems seem to have been well developed for
languages that use scripts based on Latin, Chinese, Japanese, and Bangla, to mention but a few,
there has not yet been any effort in this direction for the Amharic language.
This study is an attempt to approach the development of an Amharic OCR system by
drawing on experience elsewhere: it investigates the extent to which OCR
algorithms suggested for other scripts would apply to recognizing Amharic characters. To
this end, algorithms suggested for use in other languages are reviewed from the published
literature. The Amharic writing system is described in terms of size, shape, style, etc.
Algorithms generally applicable to Amharic character recognition are selected.
Experimentation with step-by-step segmentation and recognition based on topological
features of Amharic characters is presented. Recommendations are also made to further the
experiment and enhance the performance and applicability of the selected algorithms.
Keywords
Information Science