SIM-Box Fraud Detection Using Data Mining Techniques: The Case of ethio telecom
Date
2018-11
Publisher
AAU
Abstract
Telecommunication fraud is one of the threats facing telecom operators, as it causes them
to lose a portion of their annual revenue. Bypass fraud is the most worrying fraud type in
today’s telecom business. The advent of new technologies has given fraudsters new techniques
to devise bypass fraud. Subscriber Identity Module box (SIM box) fraud is a popular type of
bypass fraud that has emerged with the use of Voice over Internet Protocol (VoIP) technologies.
A SIM box is used to terminate international calls by diverting them away from the legitimate
interconnect gateway route. SIM box fraud is more common among operators whose tariff
for international call termination is much higher than the local call tariff. This high tariff is
a common method of subsidizing telecom infrastructure in the developing world. However, it
creates a strong motivation for fraudsters.
Among the various fraud prevention approaches, monitoring call patterns and profiles
through Fraud Management Systems and using Test Call Generators are common. Yet both approaches
have drawbacks that make them insufficient, because they are easily overcome by
fraudsters. Therefore, more sophisticated techniques are needed. In recent years,
data mining techniques have gained popularity in fraud detection.
In this research, models were developed to classify Call Detail Records (CDRs), with the aim of proposing a
model that differentiates fraudulent from legitimate subscribers with better performance. Three
classification techniques, Random Forest (RF), Artificial Neural Network (ANN) and Support
Vector Machine (SVM), and three user profiling datasets, aggregated over 4-hour, daily and monthly periods,
were proposed. The three algorithms were applied, together with the three datasets, in building
the models. The results show that RF performed best among the three algorithms,
with an accuracy of 95.99% and fewer false positives on the 4-hour aggregated dataset.
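As an illustration of what a 4-hour aggregated user profiling dataset might look like, the following minimal sketch groups CDR rows per caller into fixed 4-hour windows and derives a few summary features. The record layout, the sample rows, the feature names and the helper functions `window_start` and `aggregate_profiles` are all assumptions made for this example. The abstract does not list the features actually used in the thesis.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical CDR rows: (caller, callee, start time, duration in seconds).
# Layout and values are invented for illustration, not taken from the thesis.
SAMPLE_CDRS = [
    ("A", "X", "2018-01-01 00:05:00", 60),
    ("A", "Y", "2018-01-01 01:10:00", 120),
    ("A", "X", "2018-01-01 05:00:00", 30),
    ("B", "A", "2018-01-01 02:00:00", 300),
]


def window_start(ts, hours=4):
    """Floor a timestamp to the start of its window (hours should divide 24)."""
    return ts.replace(hour=(ts.hour // hours) * hours,
                      minute=0, second=0, microsecond=0)


def aggregate_profiles(cdrs, hours=4):
    """Build one profile per (caller, window) from raw CDR rows."""
    acc = defaultdict(lambda: {"calls": 0, "callees": set(), "duration": 0})
    for caller, callee, start, duration in cdrs:
        ts = datetime.strptime(start, "%Y-%m-%d %H:%M:%S")
        bucket = acc[(caller, window_start(ts, hours))]
        bucket["calls"] += 1
        bucket["callees"].add(callee)
        bucket["duration"] += duration

    # Turn the running totals into flat feature dictionaries.
    profiles = {}
    for key, bucket in acc.items():
        profiles[key] = {
            "outgoing_calls": bucket["calls"],
            "distinct_callees": len(bucket["callees"]),
            "total_duration": bucket["duration"],
            "mean_duration": bucket["duration"] / bucket["calls"],
        }
    return profiles


profiles = aggregate_profiles(SAMPLE_CDRS)
```

Passing `hours=24` gives a daily aggregation with the same function. A monthly aggregation would need a different grouping key, such as the year and month of the timestamp. Each resulting profile row, once labelled, could then be fed to a classifier such as RF, ANN or SVM.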
Keywords
SIM Box Fraud, Telecom Fraud, Bypass Fraud, Fraud Detection, Data Mining, Machine Learning, Classification, Artificial Neural Network, Multi-layer Perceptron, Support Vector Machine, Random Forest