A Comparative Analysis of Machine Learning Algorithms for Subscription Fraud Detection: The Case of ethio telecom

Date

2020-02-21

Publisher

Addis Ababa University

Abstract

With the development of affordable technologies, the number of subscribers and the revenue generated in the telecommunications industry have increased over the past few years. However, these advancements also create opportunities that attract fraudsters. One of the most common and predominant fraud types is subscription fraud, which is usually the precursor to other fraud types. Since 2013, subscription fraud has been listed among the top five predominant fraud types, and it alone causes billions of dollars in losses for telecom companies. This thesis compares the performance of three supervised machine learning algorithms, Artificial Neural Network (ANN), Support Vector Machine (SVM), and J48, each evaluated under two test options: Cross Validation (CV) and Supplied Test (ST). Before the algorithms were analyzed and compared, Call Detail Record (CDR) data were collected and preprocessed: relevant features were selected, the data were cleaned, and the data frame and feature types were shaped. J48 with the CV option was found to be the best classifier, scoring 99.3% accuracy, followed by the best scores of the other two algorithms, ANN (CV) at 97.51% and SVM (ST) at 96.0%. This result is attributed to J48's ability to learn disjunctive expressions and to its reduced-error pruning. Pruning reduces the complexity of the final classifier, which improves predictive accuracy by reducing overfitting.
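
The two evaluation options named in the abstract, Cross Validation (CV) and Supplied Test (ST), can be illustrated with a minimal sketch. This is not the thesis's pipeline: the data below are synthetic, the single `calls` feature is a hypothetical stand-in for CDR attributes, and a one-threshold decision stump replaces J48, ANN, and SVM so that the example stays dependency-free. All function names are illustrative assumptions.

```python
import random

def make_synthetic_records(n, seed=0):
    """Hypothetical stand-in for CDR data: a list of (calls, label) pairs."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        is_fraud = rng.random() < 0.3
        # Assumption for illustration only: fraudulent lines place more calls.
        calls = rng.gauss(80, 15) if is_fraud else rng.gauss(40, 15)
        records.append((calls, int(is_fraud)))
    return records

def train_stump(train):
    """Choose the threshold on the single feature with the best training accuracy."""
    best_threshold, best_correct = None, -1
    for threshold, _ in train:
        correct = sum((x > threshold) == bool(y) for x, y in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def accuracy(threshold, data):
    """Fraction of records where 'calls > threshold' matches the fraud label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

def cross_validation_accuracy(data, k=10, seed=0):
    """k-fold CV: each record is held out for testing exactly once."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(train_stump(train), test))
    return sum(scores) / k

def supplied_test_accuracy(train, test):
    """Supplied test: train once, then score on a separate, fixed test set."""
    return accuracy(train_stump(train), test)

records = make_synthetic_records(300, seed=1)
held_out = make_synthetic_records(100, seed=2)
cv_score = cross_validation_accuracy(records, k=10)
st_score = supplied_test_accuracy(records[:200], held_out)
print(f"CV accuracy: {cv_score:.3f}")
print(f"ST accuracy: {st_score:.3f}")
```

Under CV every record contributes to both training and testing across folds, whereas ST depends on how representative the one supplied test set is, which is one reason the same algorithm can score differently under the two options.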

Keywords

telecommunications, CDR, fraud detection, ANN, SVM, J48, Machine learning, accuracy, CV, Supplied Test (ST)
