Flow-Based E-mail Spam Detection

Hailu, Zelalem

dc.contributor.advisor	Libsie Phd, Mulugeta
dc.contributor.author	Hailu, Zelalem
dc.date.accessioned	2022-02-11T12:09:19Z
dc.date.accessioned	2023-11-04T12:22:27Z
dc.date.available	2022-02-11T12:09:19Z
dc.date.available	2023-11-04T12:22:27Z
dc.date.issued	2011-11
dc.description.abstract	The volume of unsolicited commercial e-mail, also known as spam, is increasing so rapidly that over 90% of all e-mail messages are spam. On average, 200 billion spam e-mails are sent each day. The problem is exacerbated by the fact that many of these messages contain some sort of malicious code for attack. In addition to wasting users' time and posing attack threats, the huge amount of spam also consumes bandwidth and storage space illegally. There have been efforts over the years to combat spam messages. The most popular ones are based on e-mail content analysis and IP address reputation. Techniques based on e-mail content analysis are falling behind because of spammers' ability to trick such filters by using words typical of legitimate e-mail in their content. The introduction of image and PDF spam is another headache for content-based filters. Filters based on IP address reputation are also not coping well with spammers because of the dynamic nature of IP addresses and the difficulty of hunting down malicious addresses before significant damage is done. Our approach is to filter out spam messages before they are delivered to the user's inbox, based on packet flow characteristics. This is a complementary approach that can be used with other techniques to reduce the number of spam messages reaching users' inboxes. Our approach is based on over 55,000 packet flow records. We have identified nine features that best differentiate spam from legitimate e-mail. Based on these attributes and a classification model with an accuracy of 99.5% and a false-positive rate of 2.6%, we have developed a ranking algorithm that scores a given flow into one of five categories. Based on these scores, a given packet flow will be accepted, rejected, or passed on for further examination by other techniques. 
In addition to the advantage of not relying on e-mail content or IP addresses to filter spam, our method also avoids the waste of resources such as bandwidth and storage space by spam messages.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/30018
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Network flow, E-mail spam, Feature selection, Classification, Ranking algorithm	en_US
dc.title	Flow-Based E-mail Spam Detection	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Zelalem Hailu.pdf
Size: 22.32 MB
Format: Adobe Portable Document Format

Collections

Computer Science