Trilingual Sentiment Analysis on Social Media

Date

2018-03-04

Publisher

Addis Ababa University

Abstract

The widespread availability of social media platforms allows people to express sentiments and share experiences easily. Analyzing sentiment data from social media helps to understand customers' feelings, make decisions and design strategies. Social media sentiment data can contain multiple languages, as well as sentences that mix words from different languages. However, it is difficult to analyze the sentiment of such multilingual and mixed-language data. This research therefore aims to design a lexicon-based sentiment analysis system for social media data that analyzes trilingual (i.e., English, Amharic and Tigrigna) sentiment sentences. The trilingual sentiment analysis system consists of seven main components: a preprocessor, a language identifier, a morphological analyzer, a sentence constructor using root words, a sentiment word detector, a sentiment word polarity weight determiner and a sentiment classifier. The preprocessor cleans the data, tokenizes each sentiment sentence into a list of words and performs normalization. The language identifier identifies the language of the text. The morphological analyzer analyzes the morphology of the words in the sentiment text in order to handle the morphological complexity of the languages and to detect sentiment words in the sentence. The sentiment word polarity weight determiner determines the polarity weight of the sentiment words based on the sentiment lexicons. The sentiment classifier classifies sentiment sentences into positive, negative and neutral polarity classes using the polarity weight values of the sentiment words in each sentence. A prototype of the system was developed to test and evaluate its functionality. For this evaluation, 564 sentiment sentences were collected from Facebook and YouTube. To measure the accuracy of the system in sentiment classification, precision, recall and F-measure were employed as evaluation metrics, and an average precision of 87.49%, recall of 84.78% and F-measure of 85.99% were obtained.
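
The lexicon-based classification step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the lexicon entries, their weights, the zero threshold and the function names `tokenize` and `classify` are all assumptions made for illustration. Language identification and morphological analysis are omitted, so only a toy English lexicon is shown; Amharic and Tigrigna lexicons would presumably be added under their own keys in the same way.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> polarity weight.
# Entries and weights are invented for illustration, not taken from the thesis.
LEXICONS: Dict[str, Dict[str, float]] = {
    "english": {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0},
}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence, replace punctuation with spaces, split on whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in sentence.lower()
    )
    return cleaned.split()


def classify(sentence: str, language: str = "english") -> str:
    """Sum the polarity weights of known sentiment words and map the sum to a class.

    Assumption: a sum above zero is positive, below zero is negative,
    and exactly zero (including no sentiment words found) is neutral.
    """
    lexicon = LEXICONS[language]
    score = sum(lexicon.get(token, 0.0) for token in tokenize(sentence))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["This phone is great!", "Terrible and bad service.", "I bought a phone."]:
        print(text, "->", classify(text))
```

In this sketch the language is passed in by the caller; in the system described above, that choice would come from the language identifier, and the tokens would first be reduced to root words by the morphological analyzer before the lexicon lookup.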

Keywords

Social Media, Sentiment Analysis, Trilingual Sentiment Analysis, Sentiment Words, Multilingual, Sentiment Polarity

Citation