Amharic Hateful Memes Detection on Social Media
Date
2024-02
Publisher
Addis Ababa University
Abstract
A hateful meme is defined as any expression that disparages an individual or a group on
the basis of characteristics such as race, ethnicity, gender, sexual orientation, nationality,
or religion. Hateful memes have grown into a significant problem for all social media
platforms. Ethiopia's government has increasingly relied on temporarily shutting down
social media sites, but such measures cannot be a permanent solution, so an automatic
detection system is needed. Today, people communicate in chat spaces and on social
media through many kinds of content, including text, image, audio, text with image, and
image with audio. Memes are a new and exponentially growing form of social media
content that blends words and images to convey ideas. If either the words or the image
is absent, the intended meaning can become unclear to the audience. Previous research
on hate speech identification in Amharic has focused primarily on textual content.
We therefore design a deep learning model that automatically filters hateful memes in
order to reduce hateful content on social media. The model consists of two fundamental
components: one for textual features and the other for visual features.
For the textual features, the text must first be extracted from memes using optical
character recognition (OCR). Because OCR works at the pixel level and Amharic is
morphologically complex, the extracted text may contain incomplete or misspelled
words, which could limit the detection of hateful memes. To work effectively with
OCR-extracted text, we employed a word embedding method that can capture the
syntactic and semantic meaning of a word. An LSTM is used to learn long-distance
dependencies between words in short texts. The visual data was encoded using an
ImageNet-trained VGG-16 convolutional neural network. In the experiments, the input
to the Amharic hateful meme detection classifier combines the textual and visual data.
The maximum precision was 80.01 percent. Compared with state-of-the-art approaches
that use memes as a feature on CNN-LSTM, an average F-score improvement of 2.9%
was attained.
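The combination of textual and visual features described in the abstract can be illustrated with a minimal sketch. It assumes that fusion is done by concatenating one text feature vector and one image feature vector, followed by a single sigmoid output unit; the abstract does not state the fusion method or the classifier head, so both are assumptions. The function names `fuse_features` and `hateful_probability` and all numeric values are hypothetical, and the short lists stand in for the LSTM and VGG-16 encoder outputs, which are not reproduced here.

```python
import math


def fuse_features(text_vec, image_vec):
    """Concatenate a text feature vector and an image feature vector.

    Assumption: fusion by simple concatenation. In the described system
    the text vector would come from the LSTM over word embeddings and
    the image vector from VGG-16; here they are plain lists of floats.
    """
    return list(text_vec) + list(image_vec)


def hateful_probability(fused, weights, bias):
    """Score a fused vector with one sigmoid unit (hypothetical head)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused vector length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy values, not taken from the thesis.
text_vec = [0.2, -0.1, 0.7]    # stand-in for the LSTM text encoding
image_vec = [0.5, 0.3]         # stand-in for the VGG-16 image encoding
weights = [1.0, 0.5, 2.0, -1.0, 0.25]
bias = -0.5

fused = fuse_features(text_vec, image_vec)
prob = hateful_probability(fused, weights, bias)
label = "hateful" if prob >= 0.5 else "not hateful"
print(len(fused), round(prob, 3), label)
```

In a trained model the weights and bias would be learned from labelled memes rather than set by hand, and the 0.5 decision threshold is likewise only an illustrative choice.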
Keywords
social media, memes, hate speech, word embedding, OCR, VGG-16, LSTM