Application of Data Mining Techniques for Customer Segmentation in Insurance Business: the Case of Ethiopian Insurance Corporation


Addis Ababa University


The aim of this study is to apply data mining (DM) techniques in the insurance business to build models that can segment customers based on their value. The subject of the study is the Ethiopian Insurance Corporation (EIC), whose life insurance policyholders' data are stored in the LIFE INSIS database at the Life Addis District (LAD). To meet this objective, the six-step CRISP-DM methodology was adopted to carry out the data mining process and to address the business problem systematically and iteratively. During the business understanding phase, the business practices of EIC life insurance were assessed through interviews with business and technical experts and through document analysis. During the data understanding and preparation phases, policyholders' personal, demographic, policy coverage and transactional information was taken into account, and attributes were selected according to their relevance for developing a value-based customer segmentation model using DM techniques. Accordingly, 27845 records and 16 attributes were imported from the LIFE INSIS database into MS Excel. The data cover one year (12 months) of customer interactions, from August 2011 to August 2012. Attributes such as occupation ID, marital status, and sector were removed because they had a high proportion of missing values. Preprocessing tasks such as handling outliers and noise, data integration and data transformation were undertaken. Customers' value was computed from individual policyholders' records indicating their insured value, policy duration and the cost incurred to attract them (agent commission).
In consultation with experts, 7 attributes and 21622 records from the initial database were included in the final dataset for modeling. To build the customer segmentation models, the WEKA implementations of the K-means clustering algorithm and the J48 decision tree algorithm were selected to discover useful patterns and to analyse the data. K-means clustering was selected because it can develop models that segment customers with similar characteristics, while J48 decision tree (DT) classification was applied because of its quality and clarity in interpreting the cluster models by assigning each record to the target variable. The results also showed that DT models are straightforward, easy to integrate with business practices, and useful for understanding the revealed clusters. The experiments conducted in building the DT model revealed that attributes such as age and insured_value were automatically selected as the best predictive attributes for splitting the dataset into sub-segments that are homogeneous in terms of their value (high or low). The results of the research indicate that customer segmentation models built by combining classification and clustering data mining techniques are necessary for the LAD and the marketing department of EIC in order to identify valuable customer segments and other factors underlying variations in customers' value.
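
The cluster-then-classify idea described above can be illustrated with a minimal sketch. It is not the study's WEKA workflow: the k-means routine is a plain textbook version, a single-threshold split stands in for a full J48 tree, and the eight customer records, the function names and the min-max scaling step are all assumptions made up for illustration.

```python
import random

# Hypothetical toy records (age, insured_value); NOT data from the study.
NAMES = ("age", "insured_value")
customers = [
    (25, 20000), (30, 25000), (28, 22000), (32, 30000),
    (50, 150000), (55, 180000), (48, 160000), (60, 200000),
]

def min_max_scale(rows):
    """Rescale every column to [0, 1] (assumes no column is constant)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) for v, lo, hi in zip(r, lows, highs))
            for r in rows]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def best_split(rows, labels, names):
    """Find the single attribute/threshold that best reproduces the cluster labels.

    A one-level stand-in for a decision tree: each side of the threshold
    predicts its majority label, and accuracy is the share predicted correctly.
    """
    best = (None, None, 0.0)
    for j, name in enumerate(names):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for r, lab in zip(rows, labels) if r[j] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[j] > t]
            correct = (max(left.count(c) for c in set(left))
                       + max(right.count(c) for c in set(right)))
            acc = correct / len(rows)
            if acc > best[2]:
                best = (name, t, acc)
    return best

# Step 1: segment customers; step 2: describe the segments with a readable rule.
labels, _ = kmeans(min_max_scale(customers), k=2)
split = best_split(customers, labels, NAMES)
print("cluster labels:", labels)
print("best single split (attribute, threshold, accuracy):", split)
```

The split is searched on the unscaled records so that the resulting threshold reads in business units (years, insured amount), which mirrors the abstract's point that tree rules are easy to relate to business practice.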



Ethiopian Insurance Corporation