Interconnect Bypass Fraud Detection Model Using Data Mining Technique

Date

2019-08-08

Publisher

Addis Ababa University

Abstract

Interconnect bypass fraud is a scheme in which official interconnect termination routes are bypassed: international call traffic is routed over VoIP to a SIM-box device, where the calls are terminated and subsequently regenerated as local calls. According to the Communications Fraud Control Association (CFCA, 2017), it is categorized as a type of damage fraud, along with subscription fraud. The telecom industry has expanded rapidly as a result of affordable technologies and increasing demand for communication. However, this expansion has also motivated fraudsters to commit telecom fraud using a variety of methods and techniques, reducing the revenue and quality of service of telecommunication providers. This thesis focuses on predicting interconnect bypass fraud using four classification techniques: multilayer perceptron (MLP), support vector machine (SVM), random forest (RF), and J48. To achieve this objective, call detail records (CDRs) were collected from the Ethio Telecom billing system for two months, from 41 million active mobile subscribers. We applied the Cross-Industry Standard Process for Data Mining (CRISP-DM) model to the collected raw data, extracted important features from customers' CDRs, and derived additional features to characterize the behavior of interconnect bypass fraud. In addition, we preprocessed, aggregated, and formatted the datasets to suit the selected ML algorithms. Each algorithm was trained on five datasets aggregated at different granularities (4 hours, 8 hours, 12 hours, daily, and weekly) using two training modes (10-fold cross-validation and percentage split). The performance of the models was compared using confusion matrices, and we proposed the best models for interconnect bypass fraud prediction.
From our experiments, we found that the J48 and RF models achieved the highest classification accuracy on the 8-hour aggregated dataset: 99.99% each, compared with 99.84% for MLP and 95.61% for SVM.
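
The aggregation step described above can be illustrated with a minimal sketch. It assumes a hypothetical CDR layout (the fields `subscriber`, `peer`, `start`, `duration`, and `direction` are illustrative; the abstract does not list the actual schema or the derived features), groups records into fixed windows within a day, and shows how accuracy is read from a binary confusion matrix. It is not the thesis's implementation, and it covers only windows that divide a day evenly (4, 8, 12, or 24 hours), not the weekly aggregation.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_cdrs(records, window_hours=8):
    """Group CDRs per (subscriber, date, window index) and derive simple features.

    The record fields and the features below are assumptions for illustration.
    """
    buckets = defaultdict(list)
    for rec in records:
        ts = datetime.fromisoformat(rec["start"])
        window = ts.hour // window_hours  # index of the window within the day
        key = (rec["subscriber"], ts.date().isoformat(), window)
        buckets[key].append(rec)

    rows = {}
    for key, recs in buckets.items():
        outgoing = [r for r in recs if r["direction"] == "out"]
        incoming = [r for r in recs if r["direction"] == "in"]
        rows[key] = {
            "out_calls": len(outgoing),
            "in_calls": len(incoming),
            "distinct_peers": len({r["peer"] for r in outgoing}),
            "total_out_seconds": sum(r["duration"] for r in outgoing),
        }
    return rows


def accuracy(confusion):
    """Accuracy from a 2x2 confusion matrix laid out as [[TN, FP], [FN, TP]]."""
    correct = confusion[0][0] + confusion[1][1]
    total = sum(sum(row) for row in confusion)
    return correct / total
```

Each row returned by `aggregate_cdrs` would correspond to one labeled instance in an aggregated dataset; a classifier's predictions on such instances are then summarized in the confusion matrix passed to `accuracy`.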

Keywords

Telecom Fraud, Bypass Fraud, SIM-Box, Fraud Detection, Data Mining, Knowledge Discovery, CRISP-DM Process Model, Supervised Machine Learning, Multilayer Perceptron, Support Vector Machine, J48, Random Forest
