Automatic Sentence Based Image Description Generation Framework

dc.contributor.advisor: Getahun, Fekade (PhD)
dc.contributor.author: Shimelis, Yordanos
dc.date.accessioned: 2020-09-04T10:56:56Z
dc.date.accessioned: 2023-11-09T16:18:47Z
dc.date.available: 2020-09-04T10:56:56Z
dc.date.available: 2023-11-09T16:18:47Z
dc.date.issued: 2019-09-09
dc.description.abstract: Sentence-based image description generation is a challenging task involving several open problems in Natural Language Processing and Computer Vision. Most previous efforts rely on visual cues and corpus statistics, employing both concepts-to-text and text-to-text natural language generation: they produce a description by transferring text from the descriptions of similar images, or by summarizing retrieved related documents for a new image. These approaches take little advantage of the semantic information inherent in the available image descriptions and cannot build novel descriptions. We focus on generating novel descriptions for unseen images and present a generic approach that benefits from two sources simultaneously: visual data and available human-written descriptions. Our approach works on syntactically and linguistically motivated phrases extracted from those descriptions. The proposed framework has three main components: an Image Engine, a Search Engine, and a Text Engine. The Image Engine extracts features from the training images and passes them to an indexer sub-component; once indexing is complete, visual words are constructed by clustering the local descriptors. The Search Engine extracts features from an unseen image and computes a similarity measure between the unseen image and the features in the index. The Text Engine extracts syntactically and linguistically motivated phrases from the textual descriptions and builds a linguistic model, which is then associated with each image; finally, it assembles the phrases into a grammatically correct sentence. Experimental evaluation demonstrates that our design mostly generates well-formed and semantically correct descriptions. To validate the proposed approach, a Java-based prototype was developed, using LIRE and Lucene for low-level feature extraction and indexing, Stanford CoreNLP for phrase extraction, and SimpleNLG for sentence generation (illustrative sketches of the retrieval and sentence-generation steps follow the record below). Recall and precision were measured on sample test images; the experiment yields 60% recall and 75% precision.
dc.identifier.uri: http://etd.aau.edu.et/handle/12345678/22265
dc.language.iso: en
dc.publisher: Addis Ababa University
dc.subject: Text Engine
dc.subject: Image Engine
dc.subject: Search Engine
dc.subject: Image Index
dc.subject: Phrase Relevance Evaluation
dc.subject: Phrase Integration
dc.title: Automatic Sentence Based Image Description Generation Framework
dc.type: Thesis
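To make the Image Engine and Search Engine steps from the abstract concrete, here is a minimal sketch of indexing global image features with LIRE on top of Lucene and ranking an unseen image against the index. This is not the thesis's code: the feature class (CEDD), the directory paths, and the hit count are assumptions for illustration, and LIRE's API differs between releases (the calls below follow the LIRE 1.0 builders/searchers API with a matching Lucene version).

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

import net.semanticmetadata.lire.builders.GlobalDocumentBuilder;
import net.semanticmetadata.lire.imageanalysis.features.global.CEDD;
import net.semanticmetadata.lire.searchers.GenericFastImageSearcher;
import net.semanticmetadata.lire.searchers.ImageSearchHits;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;

public class ImageIndexSketch {
    public static void main(String[] args) throws Exception {
        // Image Engine: extract a global feature (CEDD, assumed here) from each
        // training image and store it in a Lucene index.
        GlobalDocumentBuilder builder = new GlobalDocumentBuilder(CEDD.class);
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("index")),
                new IndexWriterConfig(new WhitespaceAnalyzer()));
        for (File f : new File("train-images").listFiles()) { // hypothetical dataset dir
            BufferedImage img = ImageIO.read(f);
            writer.addDocument(builder.createDocument(img, f.getName()));
        }
        writer.close();

        // Search Engine: extract the same feature from an unseen image and
        // rank the indexed images by similarity to it.
        IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("index")));
        GenericFastImageSearcher searcher = new GenericFastImageSearcher(5, CEDD.class);
        BufferedImage unseen = ImageIO.read(new File("unseen.jpg")); // hypothetical query
        ImageSearchHits hits = searcher.search(unseen, reader);
        for (int i = 0; i < hits.length(); i++) {
            System.out.println(hits.score(i) + "  doc=" + hits.documentID(i));
        }
        reader.close();
    }
}
```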
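For the Text Engine's final step, a minimal SimpleNLG sketch that realises phrase fragments into a grammatically correct sentence. The phrase strings are invented for illustration; the thesis's phrase relevance evaluation and phrase integration logic is not shown.

```java
import simplenlg.features.Feature;
import simplenlg.features.Tense;
import simplenlg.framework.NLGFactory;
import simplenlg.lexicon.Lexicon;
import simplenlg.phrasespec.SPhraseSpec;
import simplenlg.realiser.english.Realiser;

public class SentenceSketch {
    public static void main(String[] args) {
        Lexicon lexicon = Lexicon.getDefaultLexicon();
        NLGFactory factory = new NLGFactory(lexicon);
        Realiser realiser = new Realiser(lexicon);

        // Phrase fragments standing in for those the Text Engine would extract
        // from retrieved descriptions (invented for illustration).
        SPhraseSpec clause = factory.createClause();
        clause.setSubject("a brown dog");
        clause.setVerb("chase");
        clause.setObject("a ball");
        clause.addComplement("on the grass");
        clause.setFeature(Feature.TENSE, Tense.PRESENT);
        clause.setFeature(Feature.PROGRESSIVE, true);

        // Prints: "A brown dog is chasing a ball on the grass."
        System.out.println(realiser.realiseSentence(clause));
    }
}
```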

Files

Original bundle
Name: Yordanos Shimelis 2019.pdf
Size: 3.37 MB
Format: Adobe Portable Document Format