AAU-ETD
 


Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/3514

Title: KNOWLEDGE DISCOVERY FOR EFFECTIVE CUSTOMER SEGMENTATION: THE CASE OF ETHIOPIAN REVENUE AND CUSTOMS AUTHO
Authors: BELETE, Beyazen
Advisors: Ato Getachew Jemaneh
Keywords: Information science
Copyright: Jun-2011
Date Added: 30-Jul-2012
Publisher: AAU
Abstract: Customer relationship management (CRM) is a process by which an organization maximizes customer satisfaction in an effort to increase loyalty and retain customers' business over their lifetimes. Customer segmentation, the grouping of customers according to their common attributes, is a central part of CRM. Analyzing CRM data means exploring it from different angles and examining its different aspects, which requires applying several types of data mining techniques. Data mining finds and extracts knowledge hidden in corporate data warehouses. The aim of this study is to test the applicability of clustering and classification data mining techniques to support CRM activities for ERCA, using the Cios et al. (2000) KDD process model. Data on different characteristics of ERCA customers were collected from the customs ASYCUDA database. The necessary data preparation steps were then carried out, yielding a final dataset of 46,748 records. The K-means clustering algorithm was used to segment customers. During cluster modeling, experiments were run with different numbers of clusters (K = 3, 4, 5, 6) and different seed values. The cluster model at K = 5 performed best, and its output was used for the subsequent classification modeling. The classification models were built with the J48 decision tree and multilayer perceptron ANN algorithms, evaluated using 10-fold cross-validation and a percentage split (70% training, 30% testing). Among these, the J48 decision tree model evaluated with the default 10-fold cross-validation performed best, with an overall accuracy of 99.95%, and was therefore selected. The results of this research are encouraging, as very high classification accuracy was obtained.
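The two-stage workflow described in the abstract (a K-means sweep over K and seed values, followed by cross-validated classification of the resulting cluster labels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the data are synthetic 2-D points rather than ASYCUDA records, the within-cluster sum of squares (WCSS) is used to pick the best seed for each K because the abstract does not state how performance was judged, and a simple nearest-centroid classifier stands in for J48 (Weka's C4.5 decision tree) and the multilayer perceptron.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, seed, max_iter=100):
    """Plain Lloyd's K-means; returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = mean(members)
    wcss = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, wcss

def cross_val_accuracy(points, labels, folds=10, seed=0):
    """k-fold cross-validation of a nearest-centroid classifier
    (a stand-in for J48, used here only to keep the sketch short)."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = set(idx[f::folds])
        by_class = {}
        for i in idx:
            if i not in test_idx:
                by_class.setdefault(labels[i], []).append(points[i])
        class_means = {c: mean(m) for c, m in by_class.items()}
        for i in test_idx:
            pred = min(class_means,
                       key=lambda c: sq_dist(points[i], class_means[c]))
            correct += pred == labels[i]
    return correct / len(points)

# Synthetic stand-in data: five Gaussian blobs (hypothetical, not ERCA data).
data_rng = random.Random(0)
blob_centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 20)]
points = [(data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
          for cx, cy in blob_centers for _ in range(50)]

# Stage 1: sweep K and seed, keeping the lowest-WCSS run for each K.
best_per_k = {}
for k in (3, 4, 5, 6):
    runs = [kmeans(points, k, seed) for seed in (1, 2, 3)]
    best_per_k[k] = min(runs, key=lambda run: run[2])
    print(k, round(best_per_k[k][2], 1))

# Stage 2: use the K=5 cluster labels as class targets for 10-fold CV.
_, labels_k5, _ = best_per_k[5]
accuracy = cross_val_accuracy(points, labels_k5)
print(accuracy)
```

WCSS tends to fall as K grows, so it is only used here to compare seeds at a fixed K. Choosing between K values, as the study did when it settled on K = 5, needs a separate criterion that the abstract does not specify.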
Description: A Thesis Submitted to the School of Graduate Studies of Addis Ababa University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Information Science
URI: http://hdl.handle.net/123456789/3514
Appears in: Thesis - Information Science

Files in This Item:

File: BELETE BIAZEN.pdf
Size: 1.12 MB
Format: Adobe PDF

Items in the AAUL Digital Library are protected by copyright, with all rights reserved, unless otherwise indicated.

 
