Lexicon-Stance Based Amharic Fake News Detection

dc.contributor.advisor: Sosina, Mengistu
dc.contributor.author: Ibrahim, Neji
dc.date.accessioned: 2022-05-23T08:22:07Z
dc.date.accessioned: 2023-11-04T15:14:45Z
dc.date.available: 2022-05-23T08:22:07Z
dc.date.available: 2023-11-04T15:14:45Z
dc.date.issued: 2022-05
dc.description.abstract: Because social media content is noisy and false information propagates rapidly, identifying and detecting fake news has become a challenging problem. In recent years, several studies have proposed using text representation techniques from content-based approaches to automatically detect fake news on social media. However, fake news has a distinct writing pattern, and capturing its distinguishing features may improve detection more than focusing solely on text representation. In this study, we propose combining stance-based features (page score, headline-to-article similarity, and headline-to-headline similarity) with lexicon-based features from text representation techniques to enhance detection performance. To build the detection model, we used three machine learning algorithms: Logistic Regression, Passive Aggressive, and Decision Tree. The proposed approach is evaluated on a newly collected Amharic fake news dataset from Facebook. Our experimental results show that the hybrid (lexicon-stance) features improve on the previous lexicon-based detection results by 4.1% in accuracy, 3% in precision, 4% in recall, and 4% in F1-score. In addition, the hybrid features raise the area under the curve from 0.982 to 0.995, reducing the false positive rate by 4% and improving the true positive rate by 4.4%. Furthermore, we found that, of the proposed stance features, page score contributed the most to the improvement over lexicon-based fake news detection. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/31731
dc.language.iso: en_US [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Content-based detection [en_US]
dc.subject: Stance-based detection [en_US]
dc.subject: Lexicon-based detection [en_US]
dc.subject: Text representation techniques [en_US]
dc.subject: Fake news [en_US]
dc.title: Lexicon-Stance Based Amharic Fake News Detection [en_US]
dc.type: Thesis [en_US]
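
The abstract describes combining three stance features (page score, headline-to-article similarity, headline-to-headline similarity) with lexicon-based features. The following is a minimal sketch of how such a hybrid feature vector might be assembled, not the thesis's actual implementation. Everything concrete here is an assumption: the abstract does not say which similarity measure is used, how the page score is computed, or how headline-to-headline similarity is aggregated. The sketch assumes bag-of-words cosine similarity over whitespace tokens, takes the page score as a precomputed number, and uses the maximum similarity to a set of other headlines. The function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words counts (assumed measure)."""
    # Whitespace tokenization is an assumption; real Amharic text would
    # likely need language-specific normalization and tokenization.
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[tok] * counts_b[tok] for tok in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def stance_features(headline, article, other_headlines, page_score):
    """Return [page_score, headline-to-article, headline-to-headline]."""
    h2a = cosine_similarity(headline, article)
    # Aggregating by max is an assumption; the abstract does not specify it.
    h2h = max(
        (cosine_similarity(headline, h) for h in other_headlines),
        default=0.0,
    )
    return [float(page_score), h2a, h2h]


def hybrid_features(lexicon_vector, stance_vector):
    """Concatenate lexicon-based and stance-based features into one vector."""
    return list(lexicon_vector) + list(stance_vector)
```

A vector built this way could then be passed to any of the three classifiers named in the abstract. The lexicon vector would come from whichever text representation technique is chosen, which the abstract leaves unspecified.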

Files

Original bundle
Name:
Ibrahim Neji.pdf
Size:
712.11 KB
Format:
Adobe Portable Document Format