
Browsing by Author "Abeto, Alemu"

  • Character Recognition of Bilingual Amharic-Latin Printed Documents
    (Addis Ababa University, 2018-11) Abeto, Alemu; Menore, Tekeba (Mr.)
    Optical character recognition (OCR) is a system that automatically converts captured images of handwritten, typewritten, or printed text documents into machine-encoded text. In Ethiopia, more than 80 languages are spoken, and they are written in either Amharic script or an adopted Latin script. In such an environment, a document often has to combine text in different languages, written in Amharic and/or Latin characters, in order to reach a larger cross-section of people. To prepare the dataset, documents were collected from different sources for both script types. Character images were collected for 231 Amharic characters and 52 English characters (capital and small letters, merged into 26 classes). In total, 49,087 character images across 257 character classes were prepared to train and test the system. A randomly selected 80% of the dataset was used to train the system, and the remaining 20% was used to test its accuracy. The steps in developing the character recognition system are data acquisition, image binarization, noise removal, skew correction, character segmentation, feature extraction, and character classification. A number of algorithms were implemented to develop the proposed OCR system. This research work discusses the process of developing an OCR system for bilingual Amharic and Latin script using a Convolutional Neural Network (CNN), which serves as both the feature extraction and the character classification model. In the experiments, a classification accuracy of 99.20% was obtained with 256 neurons and an adaptive learning rate. In the character segmentation stage, an average accuracy of 98.85% was achieved for clear sample documents and 95.86% for unclear sample documents. The overall recognition accuracy is therefore 98.06% and 95.09%, respectively.
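
The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the thesis. It assumes that merging capital and small letters halves the 52 English characters into 26 classes, and that overall recognition accuracy is the product of segmentation accuracy and classification accuracy (a character counts as recognized only if both stages succeed). Both assumptions reproduce the reported numbers, but the abstract does not state either one explicitly. The function names are hypothetical.

```python
# Figures quoted in the abstract.
AMHARIC_CLASSES = 231
ENGLISH_CHARACTERS = 52          # capital and small letters counted separately
CLASSIFICATION_ACCURACY = 0.9920
SEGMENTATION_ACCURACY = {"clear": 0.9885, "unclear": 0.9586}


def class_count(amharic=AMHARIC_CLASSES, english=ENGLISH_CHARACTERS):
    """Total classes, assuming each capital/small pair is merged into one class."""
    return amharic + english // 2


def overall_accuracy(segmentation, classification=CLASSIFICATION_ACCURACY):
    """Overall accuracy, assuming it is the product of the two stage accuracies."""
    return segmentation * classification


if __name__ == "__main__":
    print("character classes:", class_count())
    for quality, seg in SEGMENTATION_ACCURACY.items():
        percent = round(overall_accuracy(seg) * 100, 2)
        print(f"overall accuracy ({quality} documents): {percent}%")
```

Under these assumptions the script yields 257 classes and overall accuracies of 98.06% and 95.09%, matching the values reported above.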


Addis Ababa University © 2023