Restoration and Retrieval of Historical Amharic Document Images

dc.contributor.advisor: Meshesha, Million (PhD)
dc.contributor.author: Mengistu, Biruk
dc.date.accessioned: 2018-11-26T08:26:19Z
dc.date.accessioned: 2023-11-29T04:56:47Z
dc.date.available: 2018-11-26T08:26:19Z
dc.date.available: 2023-11-29T04:56:47Z
dc.date.issued: 2014-06
dc.description.abstract: Many historical document image collections are now being scanned and made available over the Internet or in digital libraries. However, effective access to these sources is limited by the lack of efficient retrieval schemes. Existing methods for searching and retrieving document images are recognition-based (Optical Character Recognition), recognition-free (Document Image Retrieval), or a combination of the two. These algorithms analyze the global or local layout structure of different document images and estimate the similarity among them. A few studies have developed recognition-free document image retrieval systems that extract information from document images relying on image features only. These systems are highly affected by the degradation of historical documents that results from paper aging, folding, or scanning. This study integrates effective image restoration techniques to improve the system's effectiveness in searching within historical document images. It also improves the system's online search process by accepting N query terms for retrieving relevant documents, in addition to providing an image viewer, thereby enhancing the interface of the Amharic Document Image Retrieval System. Several image restoration techniques are tested: dilation, erosion, and combined mathematical morphology techniques, as well as Haar, Daubechies, and Symlet wavelet techniques. These techniques are tested on historical documents as well as real-life documents. Performance analysis shows that the best result is obtained by combining mathematical morphology with Otsu thresholding.
Finally, the performance of the system is evaluated before and after integrating the selected restoration techniques. An average overall performance of 87.02% F-measure is registered on documents with low, medium, and high levels of degradation, an improvement in retrieval effectiveness of 4.65% F-measure. The performance registered in this study is a promising result for designing an applicable Amharic document image retrieval system. The major challenges are the unavailability of a standardized corpus and the limited number of historical document images in the dataset. Therefore, a standardized corpus should be prepared and used for experimentation in similar future studies. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/14510
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Retrieval of Historical Amharic Document Images [en_US]
dc.title: Restoration and Retrieval of Historical Amharic Document Images [en_US]
dc.type: Thesis [en_US]
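
The abstract reports that the best restoration result came from combining mathematical morphology with Otsu thresholding. The thesis itself is not reproduced in this record, so the following is only a minimal sketch of that general idea, not the author's implementation. The function names (`otsu_threshold`, `dilate`, `erode`, `restore`), the 3x3 structuring element, and the order of operations (closing, then opening) are all assumptions made for illustration.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold of a uint8 grayscale image.

    Pixels with value <= threshold form the dark class (assumed to be ink).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    w0 = np.cumsum(hist)               # pixel count at or below each level
    w1 = hist.sum() - w0               # pixel count above each level
    m0 = np.cumsum(hist * levels)      # intensity mass at or below each level
    valid = (w0 > 0) & (w1 > 0)        # both classes must be non-empty
    mu0 = m0 / np.where(w0 > 0, w0, 1)
    mu1 = (m0[-1] - m0) / np.where(w1 > 0, w1, 1)
    # Between-class variance (unnormalised); Otsu picks its maximum.
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, 0.0)
    return int(np.argmax(between))


def _neighbourhood(mask, pad_value):
    """Return the nine shifted views that make up each 3x3 neighbourhood."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]


def dilate(mask):
    """Binary dilation with a 3x3 square (assumed structuring element)."""
    return np.logical_or.reduce(_neighbourhood(mask, False))


def erode(mask):
    """Binary erosion with a 3x3 square (assumed structuring element)."""
    return np.logical_and.reduce(_neighbourhood(mask, True))


def restore(gray):
    """Binarise with Otsu, then clean the ink mask with morphology.

    Closing (dilate, erode) fills small holes inside strokes; opening
    (erode, dilate) then removes isolated specks. This ordering is an
    illustrative choice, not one taken from the thesis.
    """
    ink = gray <= otsu_threshold(gray)
    closed = erode(dilate(ink))
    return dilate(erode(closed))
```

`restore` takes a 2-D `uint8` array and returns a boolean mask in which `True` marks ink. A real pipeline would more likely rely on an image library's tested routines and tune the structuring element to the stroke width of the script.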

Files

Original bundle
Name: Biruk Mengistu.pdf
Size: 6.46 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text