Effect of Preprocessing on Long Short-Term Memory Based Sentiment Analysis for Amharic Language

dc.contributor.advisorSurafel, Lemma (PhD)
dc.contributor.authorTuregn, Fikre
dc.date.accessioned2020-10-09T10:00:29Z
dc.date.accessioned2023-11-04T15:14:42Z
dc.date.available2020-10-09T10:00:29Z
dc.date.available2023-11-04T15:14:42Z
dc.date.issued2020-07-04
dc.description.abstractThis paper presents the effect of preprocessing on Long Short-Term Memory (LSTM) based sentiment analysis for the Amharic language. Sentiment analysis, or opinion mining, is an approach used to analyze user-generated textual content in a way that supports decision making. User-generated textual content is found everywhere, such as in social media posts, product reviews, blogs, and forums. Developing sentiment analysis is a challenging task due to different writing styles and variation in word meanings. To analyze the sentiment of this content, several approaches use labeled lexicons. In the preprocessing step of these approaches, emojis are removed and words are stemmed. However, emojis are often used to express opinions. In this research, we propose to use emojis to automatically label texts for sentiment analysis. In addition, we investigate the impact of using unstemmed words on sentiment analysis. To evaluate the proposed labeling scheme, we conducted an experiment using 9,138 Amharic textual comments. The results show that integrating emojis with lexicons for labeling gives 0.55% higher accuracy than using only lexicons. To investigate the effect of stemming as part of the preprocessing strategy, LSTM-based Amharic sentiment analysis with and without stemming was conducted using 1,077 comments. The results show that applying stemming lowers accuracy by 6.43% with the LSTM-based model and by 0.43% with bi-gram multinomial naive Bayes. Keywords: Amharic sentiment analysis, Emoji, Natural Language Processing (NLP), Sentiment analysis, Stemming.en_US
dc.identifier.urihttp://etd.aau.edu.et/handle/123456789/22638
dc.language.isoen_USen_US
dc.publisherAddis Ababa Universityen_US
dc.subjectLong Short-Term Memoryen_US
dc.subjectAmharic Languageen_US
dc.subjectPreprocessingen_US
dc.subjectSentiment Analysisen_US
dc.titleEffect of Preprocessing on Long Short-Term Memory Based Sentiment Analysis for Amharic Languageen_US
dc.typeThesisen_US
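
The labeling idea summarized in the abstract, combining emojis with a lexicon to label comments automatically, can be illustrated with a minimal sketch. This record does not give the thesis's actual lexicon, emoji list, or scoring rule, so everything here is an assumption made for illustration: the two small polarity tables, the `label_comment` function name, the additive scoring, and the three-way positive/negative/neutral output.

```python
# Illustrative only: these tiny polarity tables are hypothetical and are
# NOT the resources used in the thesis.
EMOJI_POLARITY = {"😀": 1, "😍": 1, "👍": 1, "😡": -1, "😢": -1, "👎": -1}
WORD_POLARITY = {"ጥሩ": 1, "መጥፎ": -1}  # Amharic for "good" and "bad"


def label_comment(text: str) -> str:
    """Label a comment by summing emoji and lexicon polarity scores.

    Assumed rule: each known emoji character and each known
    whitespace-separated word contributes +1 or -1 to the score.
    """
    # Emoji evidence: scan character by character.
    score = sum(EMOJI_POLARITY.get(ch, 0) for ch in text)
    # Lexicon evidence: scan unstemmed whitespace-separated tokens.
    score += sum(WORD_POLARITY.get(tok, 0) for tok in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for comment in ["ጥሩ 👍", "መጥፎ 😡", "hello"]:
        print(comment, "->", label_comment(comment))
```

Dropping the emoji term from the score gives a lexicon-only labeler, which is the kind of baseline the abstract compares against.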

Files

Original bundle
Name:
Turegn Fikre.pdf
Size:
699.29 KB
Format:
Adobe Portable Document Format