Application of Data Mining Techniques to Predict Customers’ Churn at Commercial Bank of Ethiopia

Date

2013-09

Publisher

Addis Ababa University

Abstract

Data mining tools and techniques are used to solve many kinds of problems across industries, and predicting customer churn is one area where they can be applied. Customer churn, the common measure of lost customers, is a major problem in fiercely competitive industries such as banking. By minimizing the number of churning customers, companies can maximize their profit and sustainability. For this reason, customer retention is critical to a good marketing and customer relationship management strategy. This paper presents the prediction of customers who are prone to move to a competitor at the Commercial Bank of Ethiopia. Data on 13,172 customers with 9 attributes, together with their 628,634 corresponding transactions with 10 attributes, are collected from the bank. The CRISP-DM methodology is followed to conduct the data mining process. After the business is thoroughly analyzed and the goals are clearly identified, successive data preparation steps are undertaken, producing a dataset of 6,045 instances and 18 attributes. The WEKA (Waikato Environment for Knowledge Analysis) tool is used for modeling. The dataset is partitioned into several combinations of training and test sets. Because the proportion of the churn class is very small compared to the active (non-churn) class, SMOTE (Synthetic Minority Oversampling Technique) is applied to reduce the class imbalance problem. Three modeling techniques are used for predicting churn: J48, Logistic Regression, and Bagging. The models are built using cross-validation and tested for reliability on separate test sets. The models are evaluated by their F-Measure values (the harmonic mean of recall and precision). The results of the study show that J48 is the best model with a performance of 94.8%, followed by Bagging (93.9%) and Logistic Regression (76.6%).
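
As an illustration of the evaluation metric named in the abstract, the following minimal Python sketch computes the F-Measure as the harmonic mean of precision and recall from confusion-matrix counts. It is not the study's code (the study used WEKA); the function name `f_measure` and the example counts are hypothetical and chosen only for illustration.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-Measure for the positive (churn) class.

    tp: churners correctly predicted as churners
    fp: active customers wrongly predicted as churners
    fn: churners wrongly predicted as active
    """
    if tp == 0:
        # With no true positives, precision and recall are zero
        # (or undefined), so the F-Measure is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT taken from the study.
score = f_measure(tp=90, fp=10, fn=5)
print(f"F-Measure: {score:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot score well by predicting every customer as active, which is why the metric suits an imbalanced churn dataset.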

Keywords

Data Mining Techniques to Predict Customers’ Churn
