Network Traffic Classification Using Machine Learning: A Step Towards Over-the-Top Bypass Fraud Detection

Date

2018-11-14

Abstract

Over-the-Top (OTT) bypass is a type of interconnect bypass fraud in which regular voice calls are rerouted through an OTT network and terminated as OTT calls. These calls are terminated using OTT applications that need the user’s Mobile Station International Subscriber Directory Number (MSISDN) for authentication. Detecting OTT voice call packets through network traffic classification is one subtask in detecting this fraud. In this thesis, the performance of three machine learning algorithms, Adaptive Boosting (AdaBoost) + J48, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), and Support Vector Machine (SVM), is evaluated in detecting MSISDN-based OTT packets, taking Viber, Tango, and Telegram as samples. Detecting OTT traffic and detecting voice call packets within the OTT traffic are treated as two separate classification tasks. The evaluation uses ten-fold cross-validation and validation on a separate test set, with 1.7 million labeled packets generated and captured in a controlled laboratory environment. AdaBoost + J48 achieved the best accuracy of the three on both classification tasks under ten-fold cross-validation. However, its accuracy of 48.4% in detecting voice call packets on the separate test data makes it less preferable for that task. Although SVM takes longer to train, it was the best performer (95.35% accuracy) in detecting voice call packets on the separate test data. Considering the accuracy the algorithms attained on the separate test data together with the detection rate of OTT voice call packets, SVM is preferable to the other two algorithms.
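
The two validation schemes named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the packets are synthetic, the two features (packet length and inter-arrival time) are hypothetical, and a nearest-centroid classifier stands in for the thesis's algorithms (AdaBoost + J48, RIPPER, SVM), which are not reproduced here. It only shows how a ten-fold cross-validation score and a score on a separately captured test set are computed, and why the two can differ when the test capture follows a different distribution.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_centroids(X, y):
    """Stand-in classifier: store the per-class mean of each feature."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def accuracy(centroids, X, y):
    hits = sum(predict(centroids, x) == t for x, t in zip(X, y))
    return hits / len(y)


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean accuracy over k folds, each fold held out once for testing."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = train_centroids(train_X, train_y)
        scores.append(
            accuracy(model, [X[i] for i in held_out], [y[i] for i in held_out])
        )
    return mean(scores)


def make_packets(n, shift, seed):
    """Synthetic rows [packet_length, inter_arrival_ms]; label 1 = voice, 0 = other.

    The feature values are invented for illustration, not taken from the thesis.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        base_len = 160 if label else 900
        base_gap = 20 if label else 80
        X.append([rng.gauss(base_len + shift, 60), rng.gauss(base_gap, 10)])
        y.append(label)
    return X, y


# Training capture, and a separate capture whose packet lengths are shifted.
train_X, train_y = make_packets(500, shift=0, seed=1)
test_X, test_y = make_packets(200, shift=300, seed=2)

cv_acc = cross_val_accuracy(train_X, train_y, k=10)
holdout_acc = accuracy(train_centroids(train_X, train_y), test_X, test_y)
print(f"10-fold cross-validation accuracy: {cv_acc:.3f}")
print(f"separate test set accuracy:        {holdout_acc:.3f}")
```

Cross-validation draws every held-out fold from the same capture as the training data, so it cannot reveal how a model behaves on traffic captured under different conditions. The separate test set is the check for that.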

Keywords

OTT bypass, MSISDN-based OTT, Network traffic classification, Machine learning
