Multilingual Text Detection and Script Recognition from Video Scene using Deeplearning

Kirubel, Gebrehiwot

dc.contributor.advisor	Menore, Tekeba (Mr.)
dc.contributor.author	Kirubel, Gebrehiwot
dc.date.accessioned	2020-03-06T07:06:13Z
dc.date.accessioned	2023-11-04T15:14:39Z
dc.date.available	2020-03-06T07:06:13Z
dc.date.available	2023-11-04T15:14:39Z
dc.date.issued	2019-10
dc.description.abstract	Scene text appears frequently in video and often carries crucial information, such as location and time. In Ethiopia, most street signage is posted in Ethiopic (Geez) and Latin scripts. This research studies multilingual text detection, script identification, and character recognition from video scenes using deep learning neural network models. Videos captured by a digital camera are processed, and keyframes are extracted with a keyframe selection algorithm. Text regions are detected by a trained convolutional neural network, and the regions found by bounding box regression are cropped out using their bounding box values. For text detection, a Faster R-CNN that includes a dropout layer achieved 91% precision, 92.9% recall, and an execution time of 7.5 sec during testing. The cropped text blocks are then classified into their script classes by a network trained through transfer learning. Script identification achieved 88.5% accuracy without a dropout layer and 93.3% accuracy with one. Following script identification, line, word, and character segmentation are performed using horizontal and vertical projection profiles; these are the preprocessing steps for optical character recognition. The final phase is character recognition, which builds on the preceding text detection and script identification phases. Different numbers of epochs were considered during training to maximize the network's character recognition performance. The network trained for 200 epochs achieved an error of 0.0076% during testing.
This shows that increasing the number of training epochs improves character recognition performance and reduces the error toward its minimum.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/20927
dc.language.iso	en_US	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Faster R-CNN	en_US
dc.subject	Deep Learning Neural Network	en_US
dc.subject	Optical Character Recognition	en_US
dc.subject	AlexNet	en_US
dc.title	Multilingual Text Detection and Script Recognition from Video Scene using Deeplearning	en_US
dc.type	Thesis	en_US
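
The abstract mentions line, word, and character segmentation by horizontal and vertical projection profiles. The sketch below illustrates that general technique under stated assumptions; it is not the thesis implementation. It assumes the cropped text block has already been binarized (1 = text pixel, 0 = background), and the names `projection_profile`, `find_runs`, `segment_lines`, `segment_columns`, the `min_gap` threshold, and the `SAMPLE` image are all hypothetical, introduced here for illustration only.

```python
from typing import List, Tuple

# Assumption: a binarized text block, 1 = text (ink) pixel, 0 = background.
Image = List[List[int]]


def projection_profile(image: Image, axis: str) -> List[int]:
    """Count ink pixels per row ('horizontal') or per column ('vertical')."""
    if axis == "horizontal":
        return [sum(row) for row in image]
    if axis == "vertical":
        return [sum(col) for col in zip(*image)]
    raise ValueError("axis must be 'horizontal' or 'vertical'")


def find_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, of non-empty stretches.

    Stretches separated by fewer than min_gap empty positions are merged,
    so a larger min_gap groups characters into words.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a stretch of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the stretch ends at a gap
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)
    return merged


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Text lines are the non-empty bands of the horizontal profile."""
    return find_runs(projection_profile(image, "horizontal"))


def segment_columns(image: Image, top: int, bottom: int,
                    min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split one text line (rows top..bottom) using its vertical profile."""
    band = image[top:bottom]
    return find_runs(projection_profile(band, "vertical"), min_gap)


# Hypothetical 5x9 sample: two text lines, the first holding two "words".
SAMPLE: Image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    for top, bottom in segment_lines(SAMPLE):
        chars = segment_columns(SAMPLE, top, bottom)             # characters
        words = segment_columns(SAMPLE, top, bottom, min_gap=3)  # words
        print((top, bottom), chars, words)
```

In this sketch, words and characters differ only in the gap threshold: narrow gaps separate characters, and wider gaps separate words. A real pipeline would have to choose that threshold from the data, and connected or overlapping glyphs may need a method beyond plain projection profiles.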

Files

Original bundle

Name: Kirubel Gebrehiwot.pdf
Size: 1.88 MB
Format: Adobe Portable Document Format

Collections

Computer Engineering