Multilingual Text Detection and Script Recognition from Video Scene using Deeplearning

dc.contributor.advisor: Menore, Tekeba (Mr.)
dc.contributor.author: Kirubel, Gebrehiwot
dc.date.accessioned: 2020-03-06T07:06:13Z
dc.date.accessioned: 2023-11-04T15:14:39Z
dc.date.available: 2020-03-06T07:06:13Z
dc.date.available: 2023-11-04T15:14:39Z
dc.date.issued: 2019-10
dc.description.abstract: Scene text occurs frequently in videos and may carry crucial information, such as location and time. In Ethiopia, most street information is posted in Ethiopic (Geez) and Latin scripts. In this research we studied multilingual text detection, script identification, and character recognition from video scenes using deep learning neural network models. Videos captured by a digital camera are processed and keyframes are extracted using a keyframe selection algorithm. Text regions are detected with a trained convolutional neural network, and the regions found by bounding box regression are cropped out using their bounding box values. Using a Faster R-CNN with a dropout layer for text detection achieved 91% precision, 92.9% recall, and an execution time of 7.5 seconds during testing. The cropped text blocks are then classified into their script classes by a network trained through transfer learning. Script identification achieved 88.5% accuracy without a dropout layer and 93.3% accuracy with one. Following script identification, line, word, and character segmentation are performed using horizontal and vertical projection profiles; these are the preprocessing steps for optical character recognition. The final phase of this work is character recognition, which builds on the preceding text detection and script identification phases. Different numbers of epochs were considered when training the network to maximize its ability to recognize characters. The network trained for 200 epochs achieved an error of 0.0076% during testing.
This shows that increasing the number of training epochs improves character recognition performance and reduces the error. (en_US)
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/20927
dc.language.iso: en_US (en_US)
dc.publisher: Addis Ababa University (en_US)
dc.subject: Faster R-CNN (en_US)
dc.subject: Deep Learning Neural Network (en_US)
dc.subject: Optical Character Recognition (en_US)
dc.subject: Alexnet (en_US)
dc.title: Multilingual Text Detection and Script Recognition from Video Scene using Deeplearning (en_US)
dc.type: Thesis (en_US)
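
The abstract describes line, word, and character segmentation using horizontal and vertical projection profiles. The record does not include the thesis's implementation, so the following is only a minimal sketch of the general technique, assuming a binarized text block in which ink pixels are 1 and background pixels are 0. The function names (`runs_of_ink`, `segment_lines`, `segment_columns`) and the `min_gap` parameter are hypothetical and chosen for illustration.

```python
import numpy as np

def runs_of_ink(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, where the profile is non-zero.

    Runs separated by fewer than `min_gap` blank positions are merged.
    `min_gap` is an illustrative parameter, not taken from the thesis.
    """
    runs, start = [], None
    for i, on in enumerate(profile > 0):
        if on and start is None:
            start = i                      # a run of ink begins
        elif not on and start is not None:
            runs.append((start, i))        # the run ends at a blank position
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: join with previous run
        else:
            merged.append((s, e))
    return merged

def segment_lines(binary):
    # Horizontal projection profile: count ink pixels in each row.
    return runs_of_ink(binary.sum(axis=1))

def segment_columns(line_img, min_gap=1):
    # Vertical projection profile: count ink pixels in each column.
    return runs_of_ink(line_img.sum(axis=0), min_gap)

# Toy text block: two lines; the first line holds three glyphs.
binary = np.zeros((10, 12), dtype=int)
binary[1:3, 1:3] = 1
binary[1:3, 4:6] = 1
binary[1:3, 9:11] = 1
binary[5:8, 2:5] = 1

lines = segment_lines(binary)
first_line = binary[lines[0][0]:lines[0][1]]
print(lines)                                   # [(1, 3), (5, 8)]
print(segment_columns(first_line))             # characters: [(1, 3), (4, 6), (9, 11)]
print(segment_columns(first_line, min_gap=2))  # words: [(1, 6), (9, 11)]
```

In this sketch the same column profile serves for both word and character segmentation; only the gap threshold differs. Real scene text would likely need binarization, deskewing, and a threshold tuned to the script before profiles like these are reliable.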

Files

Original bundle
Name: Kirubel Gebrehiwot.pdf
Size: 1.88 MB
Format: Adobe Portable Document Format