Line Fitting to Amharic OCR: the Case of Postal Address
Date
2003-07
Publisher
Addis Ababa University
Abstract
Researchers are currently attracted to optical character recognition (OCR),
primarily because of the challenging nature of the research and secondarily
because of its industrial importance in areas such as reading machines for
the blind, postal address interpretation, bank courtesy amount processing,
and hand-filled form processing.
Research on Amharic OCR systems has been ongoing since 1997. Attempts have
been made to adapt an algorithm to the Amharic language, to incorporate
preprocessing techniques into the adopted algorithm, and to generalize the
system so that it recognizes typewritten as well as handwritten characters.
A sufficient amount of work has been done on preprocessing, such as
segmentation and noise removal. However, simplifying feature extraction and
alleviating the problems of high-dimensional input still require much
additional research before a system emerges that society can use to solve
real-world problems.
To this end, line fitting is applied to Amharic optical character
recognition, using simple geometric calculations to determine features that
represent and describe each character as uniquely and precisely as possible.
The image of a segmented character, normalized to 32x32 pixels, is divided
into 16 smaller squares of 8x8 pixels. The least squares technique is then
applied to fit a linear model to the distribution of foreground pixels in
each smaller square, and three features are extracted from each square,
giving 48 features per character.
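
The zoning and line-fitting step can be illustrated with a minimal Python
sketch. The abstract does not say which three features are taken from each
square, so the choice below is an assumption made only for illustration: two
bounded encodings of the fitted slope a, namely 2a/(1+a^2) and
(1-a^2)/(1+a^2), plus the foreground pixel density. The function names
zone_features and extract_features are hypothetical as well.

```python
import numpy as np

SIZE = 32  # side of the normalized character image (from the abstract)
ZONE = 8   # side of each smaller square (from the abstract)

def zone_features(zone):
    """Three illustrative features for one binary zone (assumed choice)."""
    rows, cols = np.nonzero(zone)        # local coordinates of foreground pixels
    n = rows.size
    density = n / zone.size              # fraction of foreground pixels
    if n < 2:
        return (0.0, 0.0, density)       # too few pixels to fit a line
    x = cols.astype(float)
    y = rows.astype(float)               # note: image rows grow downward
    dx = x - x.mean()
    dy = y - y.mean()
    sxx = np.sum(dx * dx)
    if sxx == 0.0:
        # All pixels share one column: a vertical stroke. Use the limit
        # of the two slope encodings as the slope goes to infinity.
        return (0.0, -1.0, density)
    a = np.sum(dx * dy) / sxx            # least squares slope of y = a*x + b
    f1 = 2.0 * a / (1.0 + a * a)         # bounded in [-1, 1]
    f2 = (1.0 - a * a) / (1.0 + a * a)   # bounded in (-1, 1]
    return (f1, f2, density)

def extract_features(img):
    """Split a 32x32 binary image into 16 zones and concatenate features."""
    img = np.asarray(img)
    if img.shape != (SIZE, SIZE):
        raise ValueError("expected a 32x32 image")
    feats = []
    for r in range(0, SIZE, ZONE):
        for c in range(0, SIZE, ZONE):
            feats.extend(zone_features(img[r:r + ZONE, c:c + ZONE]))
    return np.array(feats)               # 16 zones * 3 features = 48 values

# Small usage example: a horizontal, a diagonal, and a vertical stroke.
demo = np.zeros((SIZE, SIZE), dtype=int)
demo[3, 0:8] = 1                         # horizontal stroke in zone 0
for i in range(8):
    demo[i, 8 + i] = 1                   # diagonal stroke in zone 1
demo[0:8, 20] = 1                        # vertical stroke in zone 2
features = extract_features(demo)
```

Encoding the slope through bounded functions keeps near-vertical strokes from
producing very large values, which is convenient for neural network inputs;
the thesis itself may use a different feature definition.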
Finally, a feed-forward neural network trained with the backpropagation
algorithm is applied to the handwriting of three individuals, using
cross-validation as well as a separate test set, and the results are
presented in tables and confusion matrices.
Relevant conclusions are drawn, and recommendations are put forward to
indicate directions for further work in the area.
Keywords
Character recognition