SIM-Box Fraud Detection Using Data Mining Techniques: The Case of ethio telecom








Telecommunication fraud is one of the threats facing telecom operators, as it causes them to lose a portion of their annual revenue. Bypass fraud is the most worrying fraud type in today's telecom business. The advent of new technologies has given fraudsters new techniques to devise bypass fraud. Subscriber Identity Module box (SIM box) fraud is a popular type of bypass fraud that has emerged with the use of Voice over Internet Protocol (VoIP) technologies. A SIM box is used to terminate international calls by diverting them away from the legitimate interconnect gateway route. SIM box fraud is more common among operators whose tariff for international call termination is much higher than the local call tariff. This high tariff is a common method of subsidizing telecom infrastructure in the developing world. However, it creates a strong motivation for fraudsters. Among the various fraud prevention approaches, monitoring call patterns and profiles through Fraud Management Systems and using Test Call Generators are the most common. Yet both approaches have drawbacks that make them insufficient, because they are easily overcome by fraudsters. More sophisticated techniques are therefore needed. In recent years, data mining techniques have gained popularity in fraud detection. In this research, models were developed to classify Call Detail Records (CDRs), with the aim of proposing a model that differentiates fraudulent from legitimate subscribers with better performance. Three classification techniques, Random Forest (RF), Artificial Neural Network (ANN) and Support Vector Machine (SVM), and three user profiling datasets, aggregated over 4-hour, daily and monthly windows, were proposed. These three algorithms, along with the three datasets, were applied in building the models. The results show that RF performed best among the three algorithms, with an accuracy of 95.99% and fewer false positives on the 4-hour aggregated dataset.
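
To illustrate what a 4-hour aggregated user profile might look like, the following is a minimal sketch in Python, not the method used in this work. The CDR field names (`caller`, `callee`, `start`, `duration_s`) and the profile features (`total_calls`, `distinct_callees`, `callee_ratio`, `total_duration_s`) are assumptions chosen for illustration; the abstract does not list the actual attributes. The same function produces daily profiles when called with `hours=24`.

```python
from collections import defaultdict
from datetime import datetime

WINDOW_HOURS = 4  # shortest aggregation window named in the abstract


def window_start(ts, hours=WINDOW_HOURS):
    """Floor a timestamp to the start of its aggregation window."""
    return ts.replace(hour=ts.hour - ts.hour % hours,
                      minute=0, second=0, microsecond=0)


def aggregate_cdrs(cdrs, hours=WINDOW_HOURS):
    """Build one profile per (subscriber, window) from raw CDRs.

    Field and feature names are hypothetical, chosen for illustration.
    """
    groups = defaultdict(list)
    for rec in cdrs:
        key = (rec["caller"], window_start(rec["start"], hours))
        groups[key].append(rec)

    profiles = {}
    for key, recs in groups.items():
        callees = {r["callee"] for r in recs}
        profiles[key] = {
            "total_calls": len(recs),
            "distinct_callees": len(callees),
            # Share of calls that went to a distinct number in this window.
            "callee_ratio": len(callees) / len(recs),
            "total_duration_s": sum(r["duration_s"] for r in recs),
        }
    return profiles


# Made-up sample records, for illustration only.
sample_cdrs = [
    {"caller": "A", "callee": "X", "start": datetime(2024, 1, 1, 9, 15), "duration_s": 60},
    {"caller": "A", "callee": "Y", "start": datetime(2024, 1, 1, 11, 50), "duration_s": 30},
    {"caller": "A", "callee": "X", "start": datetime(2024, 1, 1, 12, 5), "duration_s": 45},
]
profiles = aggregate_cdrs(sample_cdrs)
```

Each resulting profile is one row of features that a classifier such as RF, ANN or SVM could be trained on, given a fraud or legitimate label for the subscriber.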



SIM Box Fraud, Telecom Fraud, Bypass Fraud, Fraud Detection, Data Mining, Machine Learning, Classification, Artificial Neural Network, Multi-layer Perceptron, Support Vector Machine, Random Forest