Modeling Network Intrusion Detection System Based on Anomaly Approach Using Machine Learning Techniques

Date

3/5/2020

Publisher

Addis Ababa University

Abstract

With the rapid growth in the use of information technology, hacking and other unauthorized activities have increased more than ever before. As hardware and software develop, attacks are growing exponentially in both type and number. Network traffic classification has therefore become essential, given the rapid growth in the number of Internet users. Every day, individuals and organizations create new threats to attack computer networks and steal private information and data. To protect against these attacks, many organizations deploy a broad defense that includes strong firewalls, authentication systems, encryption, antivirus software, up-to-date hardware, and so on. Intrusion detection is another mechanism used to mitigate network intrusions. Many Intrusion Detection Systems have been developed to monitor networks or systems and detect suspicious activity. Most of them suffer from a low detection rate, long training time, and a relatively high false alarm rate. To overcome these problems, we propose an approach that integrates machine learning, big data, and anomaly detection to obtain better results with improved processing speed. The proposed system has three main components: training, validation, and testing. In the training component, the collected training data is preprocessed and fed to the classification model. Four classification models are used and compared: Random Forest, Neural Network, Logistic Regression, and Decision Tree. In the validation component, hyperparameters are tuned using 5-fold cross-validation with grid search for each machine learning algorithm, to find the value of each hyperparameter that improves the model's detection rate. The classification models are then trained with the best parameters to build the final model. Finally, the trained final model classifies the test data as normal or attack.
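The validation step described above, grid search scored by 5-fold cross-validation, can be sketched in a few lines. This is only an illustration in plain Python, not the Apache Spark implementation used in the work: the helper names (`k_fold_indices`, `grid_search_cv`), the toy threshold "classifier", and the sample data are all assumptions made for the example.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, non-overlapping folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not (start <= i < start + size)]
        yield train_idx, val_idx
        start += size


def grid_search_cv(fit, score, X, y, param_grid, k=5):
    """Try every hyperparameter combination and keep the best mean fold score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train_idx, val_idx in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], **params)
            fold_scores.append(
                score(model, [X[i] for i in val_idx], [y[i] for i in val_idx])
            )
        avg = mean(fold_scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score


# Hypothetical stand-in for a real classifier: a record is flagged as an
# attack (1) when its single feature exceeds the "threshold" hyperparameter.
def fit_threshold(X_train, y_train, threshold):
    return threshold  # nothing is learned; the "model" is just the threshold


def accuracy(model, X_val, y_val):
    preds = [1 if x > model else 0 for x in X_val]
    return sum(p == t for p, t in zip(preds, y_val)) / len(y_val)


# Made-up toy data: one feature per record, label 1 = attack, 0 = normal.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6, 0.15, 0.85]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

best_params, best_score = grid_search_cv(
    fit_threshold, accuracy, X, y, {"threshold": [0.05, 0.5, 0.95]}, k=5
)
print(best_params, best_score)
```

The folds here are contiguous slices for simplicity; a real pipeline would normally shuffle or stratify the records before folding, and would refit the chosen configuration on the full training data afterwards, as the abstract describes.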
All of the classification models are implemented on the Apache Spark big data framework. The experiments are carried out using the NSL-KDD dataset, which contains both normal and attack data. We split the dataset into 68% for training, 17% for validation, and 15% for testing. The results show that almost all the algorithms give high prediction results. Among the algorithms, the Neural Network achieved the best result: 99.9% accuracy, 99.8% precision, 99.7% recall, and a 99.7% F1-score.
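As a rough illustration of the 68/17/15 split and of how the four reported metrics are defined, the sketch below uses only the Python standard library. It is not the Spark code behind the reported results; the function names, the random seed, and the sample labels are assumptions chosen for the example.

```python
import random


def three_way_split(records, train=0.68, val=0.17, seed=42):
    """Shuffle records and split them into train, validation, and test lists.

    The test set receives whatever remains after the train and validation
    shares (15% with the default fractions).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up example: 100 record ids, then eight labels (1 = attack, 0 = normal).
train_set, val_set, test_set = three_way_split(range(100))
metrics = binary_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 1, 0],
    y_pred=[1, 1, 0, 0, 0, 1, 1, 0],
)
print(len(train_set), len(val_set), len(test_set), metrics)
```

Here "attack" is treated as the positive class, so recall corresponds to the detection rate and false positives correspond to false alarms. Note that 68% and 17% are what an 80/20 split of an 85% non-test portion would give, although the abstract does not say the split was produced that way.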

Keywords

Intrusion Detection System, Classification, Machine Learning, Neural Network, Anomaly Detection, Apache Spark
