Preprocessing of Mobile Captured Document Images

Date

2016-10-06

Publisher

Addis Ababa University

Abstract

In recent years, mobile phones have come equipped with high-resolution cameras. Cell phone cameras are convenient image acquisition devices: they are fast, versatile, mobile, and do not touch the object. Optical character recognition (OCR) of a captured business card or other captured image recognizes the printed text optically and converts it into machine-readable code. The recognized text can then be put to further use; for example, the personal information on a business card can be saved to the contact list of the mobile device. In OCR applications, however, cell phone cameras suffer from a number of limitations, such as blur, poor lighting conditions, misalignment, and geometrical distortions. Moreover, the effectiveness of the system depends heavily on the preprocessing techniques used. This study explores an effective preprocessing method that handles both the noise created during the capturing process and the noise in the document itself. We used a SONY Xperia M2 mobile phone to capture the business cards and implemented the preprocessing techniques (skew detection and correction, perspective rectification, noise removal, image binarization, and text region extraction) in the MATLAB image processing tool. The study deals only with the preprocessing step of character recognition for mobile-captured business cards, and the techniques were applied to a test data set collected by the researcher. After experimenting with different skew and perspective correction methods, a document-boundary-based correction was selected; the user supports this technique by marking the four vertices of the document image. Of the three noise removal techniques tested, the Wiener filter performed best, with an MSE of 1.92 and a PSNR of 48.99; of the three binarization techniques, Sauvola performed best, with an MSE of 0.13 and a PSNR of 57.62. In each case the selected technique had the highest PSNR and the lowest MSE.
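The MSE and PSNR figures quoted above compare a processed image against a reference image. The following is a minimal Python sketch of the two metrics, not the study's MATLAB code. It assumes 8-bit grayscale images (peak value 255) stored as nested lists; the function names and the toy pixel values are illustrative only, and the abstract does not say how the reference images were obtained.

```python
import math


def mse(reference, processed):
    """Mean squared error between two equally sized grayscale images."""
    if len(reference) != len(processed) or any(
        len(r) != len(p) for r, p in zip(reference, processed)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0.0
    count = 0
    for ref_row, proc_row in zip(reference, processed):
        for r, p in zip(ref_row, proc_row):
            total += (r - p) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count


def psnr(reference, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumes 8-bit pixels by default)."""
    error = mse(reference, processed)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2x2 images with made-up values (not data from the study).
clean = [[100, 150], [200, 250]]
noisy = [[101, 149], [202, 250]]

print(mse(clean, noisy))   # 1.5
print(psnr(clean, noisy))
```

A lower MSE and a higher PSNR both indicate that the processed image is closer to the reference, which is why the two criteria point to the same technique in the comparison above.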
For text region extraction, a modified connected component and dilation method is used. After preprocessing, the proposed approach detects text regions with 93.60% precision and 99.99% recall. The main challenge in the study is correctly distinguishing text regions from non-text regions: in some cases text regions are detected as non-text and vice versa, when the document is captured in a dark environment or when the captured document is misaligned. These cases need further study.
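The abstract does not describe how the connected component and dilation method was modified. The following Python sketch therefore shows only the plain, unmodified idea: dilate a binarized image so that neighbouring characters merge, then take the bounding box of each connected component as a candidate text region. The function names, the square structuring element, the 8-connectivity, and the toy image are all assumptions made for illustration.

```python
from collections import deque


def dilate(image, radius=1):
    """Binary dilation with a (2*radius+1) x (2*radius+1) square element."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if image[y][x]:
                for ny in range(max(0, y - radius), min(rows, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(cols, x + radius + 1)):
                        out[ny][nx] = 1
    return out


def bounding_boxes(image):
    """Return (top, left, bottom, right) for each 8-connected region."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if not image[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - 1), min(rows, cy + 2)):
                    for nx in range(max(0, cx - 1), min(cols, cx + 2)):
                        if image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binarized image (1 = ink): two "characters" one pixel apart
# and an isolated speck in the bottom-right corner.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]

print(bounding_boxes(page))             # three separate regions
print(bounding_boxes(dilate(page, 1)))  # the two characters merge into one
```

In this toy case the dilation joins the two adjacent characters into a single region while the distant speck stays separate, so a later step could treat the large region as text and the small one as a non-text candidate.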

Keywords

Mobile Captured Document
