OCR For Special Type of Handwritten Amharic Text ("Yekum Tsifet") Neural Network Approach
Date
2004-06
Publisher
Addis Ababa University
Abstract
Verbal and written communications, which are integral components of human society, have
been transformed by the development of the respective communication devices. With the
swift development of processing devices, it became both necessary and possible to digitize
printed information items by means of Optical Character Recognition (OCR). Although
most world languages already benefit from this technology, the application of character
recognition technology to Amharic text is a recent experience, and it is still in its infancy
where handwriting recognition is concerned.
This study is an attempt to develop a recognition engine for Amharic handwritten
text written in a special type of writing style, which is called "Yekum Tsihuf"
(የቁም ጽሁፍ). Before mechanical and electronic text processors were introduced in
Ethiopia, information used to be recorded by hand on natural materials, animal skin
being the dominant one. Those handwritten documents, written in this writing style, hold vital
information about history, tradition, religion, nature, and so on, and make an undeniable
contribution to current and future studies. The availability of this information in electronic
form would greatly help its preservation and communication.
In this study, handwritten character recognition with an Artificial Neural Network
implementation is attempted for the 231 main characters of the Amharic language.
The training and test data sets were produced by scribes who are trained to write text in
this writing style. The study used various techniques at each phase, from digitization to
recognition. Preprocessing (image binarization, character segmentation, and size
normalization) and neural network recognition were implemented in the Visual C++ .NET and
MATLAB programming environments. A segmentation rate of 95.96% was attained using a
stage-by-stage segmentation algorithm, and recognition rates ranging from 20.3% to 98.8% were
obtained for different test cases. Based on the findings and the knowledge acquired during the
experimentation, topics for future research are also identified.
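
The binarization and size normalization steps named in the abstract can be illustrated with a minimal sketch. This is not the study's code, which was written in Visual C++ .NET and MATLAB; Python is used here only for brevity. The function names, the global mean threshold, the nearest-neighbour resampling, and the 16 x 16 target grid are all assumptions made for illustration, since the abstract does not state which methods or sizes were used.

```python
def binarize(gray, threshold=None):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background."""
    if threshold is None:
        pixels = [p for row in gray for p in row]
        # Assumption: a global mean threshold; the study's method is not stated.
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def crop_to_ink(binary):
    """Trim empty rows and columns around the character."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        return binary  # blank image: nothing to crop
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def normalize_size(binary, size=16):
    """Resample to a size x size grid with nearest-neighbour lookup."""
    h, w = len(binary), len(binary[0])
    return [[binary[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_feature_vector(gray, size=16):
    """Chain the steps and flatten the grid into a network input vector."""
    grid = normalize_size(crop_to_ink(binarize(gray)), size)
    return [pixel for row in grid for pixel in row]
```

Calling to_feature_vector on a segmented character image yields a fixed-length list of 0/1 values, so characters written at different sizes present the same number of inputs to the neural network.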
Keywords
Information Science