Network Traffic Classification Using Machine Learning: A Step Towards Over-the-Top Bypass Fraud Detection
Date
2018-11-14
Abstract
Over-the-Top (OTT) bypass is a type of interconnect bypass fraud in which regular
voice calls are rerouted through an OTT network and terminated as OTT calls. These
calls are terminated using OTT applications that require the user’s Mobile Station
International Subscriber Directory Number (MSISDN) for authentication. Detecting
OTT voice call packets through network traffic classification techniques is
one subtask in detecting this fraud.
In this thesis, the performance of three machine learning algorithms, Adaptive Boosting
(AdaBoost) + J48, Repeated Incremental Pruning to Produce Error Reduction (RIPPER),
and Support Vector Machine (SVM), is evaluated in detecting MSISDN-based OTT
packets, taking Viber, Tango, and Telegram as samples. Detecting OTT traffic
and detecting voice call packets within the OTT traffic are treated as separate
classification tasks. The evaluation uses ten-fold cross-validation and validation
on separate test data, with 1.7 million labeled packets generated and captured in a
controlled laboratory environment.
AdaBoost + J48 achieved the best accuracy of the three algorithms on both
classification tasks under ten-fold cross-validation. However, its accuracy of 48.4%
in detecting voice call packets on separate test data makes it less preferable for
that task. Although SVM takes longer to train, it was the best performer (95.35%
accuracy) in detecting voice call packets on separate test data. Considering the
accuracy attained by the algorithms on separate test data together with the
detection rate of OTT voice call packets, SVM is preferable to the other two
algorithms.
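
The two validation techniques named in the abstract can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the packets are synthetic, the two features (packet length and inter-arrival time) are hypothetical, and a toy nearest-centroid classifier stands in for AdaBoost + J48, RIPPER, and SVM, which are not reproduced. The separate test capture is deliberately shifted to show how a model can score well under ten-fold cross-validation yet poorly on separately captured data; it does not explain the thesis's actual results.

```python
import random

def make_packets(n, voice_len_mean, seed):
    """Synthetic (hypothetical) packets: [length in bytes, inter-arrival in ms]."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        if i % 2 == 0:
            rows.append([rng.gauss(voice_len_mean, 10.0), rng.gauss(20.0, 2.0)])
            labels.append("voice")
        else:
            rows.append([rng.gauss(900.0, 100.0), rng.gauss(50.0, 30.0)])
            labels.append("other")
    return rows, labels

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def train_centroids(rows, labels):
    """Toy stand-in classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the class whose centroid is nearest in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))

def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, x) == y for x, y in zip(rows, labels))
    return hits / len(rows)

def cross_val_accuracy(rows, labels, k=10, seed=0):
    """Mean accuracy over k folds, each fold held out once for testing."""
    scores = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = train_centroids([rows[i] for i in train],
                                [labels[i] for i in train])
        scores.append(accuracy(model,
                               [rows[i] for i in held_out],
                               [labels[i] for i in held_out]))
    return sum(scores) / len(scores)

# Training capture, and a separate test capture whose voice packets are
# (by assumption) much longer than those seen during training.
train_rows, train_labels = make_packets(200, 160.0, seed=1)
test_rows, test_labels = make_packets(200, 600.0, seed=2)

cv_acc = cross_val_accuracy(train_rows, train_labels, k=10)
model = train_centroids(train_rows, train_labels)
test_acc = accuracy(model, test_rows, test_labels)

print(f"ten-fold cross-validation accuracy: {cv_acc:.3f}")
print(f"separate test data accuracy:        {test_acc:.3f}")
```

Cross-validation only ever tests on packets drawn from the same capture as the training data, so it cannot reveal a shift of this kind; a separately captured test set can.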
Keywords
OTT bypass, MSISDN-based OTT, Network traffic classification, Machine learning