Predictive Data Mining Technique in Insurance: (The Case of Ethiopian Insurance Corporation)

Date

2005-06

Publisher

Addis Ababa University

Abstract

One of the important tasks faced in real-world applications is classifying particular situations or events as belonging to a certain class. Risk assessment in insurance policies is one of the many areas that use classification as a problem-solving approach. To solve problems whose solution is categorical, classification models must be built. Data mining techniques offer powerful tools for building such models.
This research study addresses the issues, techniques, and feasibility of building and deploying predictive model(s) that determine the risk exposure of individuals. The study was based on Personal Accident claims data at the Ethiopian Insurance Corporation, which classifies claims into two classes: Small claim and Big claim. To meet the objectives of the research, 1,600 records, each having 23 attributes, were collected. After preprocessing, the number of records used for this study was reduced to 1,543. Of the 23 attributes, 6 were selected for final model building in discussion with insurance experts.
A close examination of the data reveals an imbalanced class distribution, which biases the model's accuracy in favor of the dominant class, in this case the "Small" class. To solve this problem, the dataset was balanced based on "PAcc_ Within", and the results showed that accuracy improved considerably. In addition, the researcher found that the frequency of occurrence (class distribution) of an attribute's values has a great impact on the accuracy of the model. Lastly, an analysis of the economic impact of the models and separate accuracy measures for each risk class were used to compare the models and select one for deployment.
Accordingly, a model built using the KnowledgeSEEKER algorithm and balanced data partitioning based on the "PAcc_ Within" attribute was selected as the final model. This model has an overall accuracy of 96.61% and classification errors of 0.29% and 27% for the Small and Big risk claim classes, respectively.
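The abstract's point about imbalanced classes and per-class accuracy can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the labels are made up and are not the study's claims data, the helper names (`overall_accuracy`, `per_class_error`, `undersample`) are hypothetical, and plain random undersampling stands in for the study's balancing on "PAcc_ Within", whose exact procedure the abstract does not describe.

```python
import random
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all records whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_error(y_true, y_pred):
    """Map each true class to the fraction of its records that were misclassified."""
    totals = Counter(y_true)
    wrong = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {c: wrong[c] / totals[c] for c in totals}


def undersample(records, label_of, seed=0):
    """Randomly drop records from larger classes until every class
    has as many records as the smallest one (generic technique,
    not necessarily the balancing used in the study)."""
    rng = random.Random(seed)
    by_class = {}
    for r in records:
        by_class.setdefault(label_of(r), []).append(r)
    n = min(len(v) for v in by_class.values())
    balanced = []
    for v in by_class.values():
        balanced.extend(rng.sample(v, n))
    rng.shuffle(balanced)
    return balanced


# Made-up, imbalanced labels: 90 "Small" claims and 10 "Big" claims.
y_true = ["Small"] * 90 + ["Big"] * 10
# A trivial model that always predicts the dominant class.
y_pred = ["Small"] * 100

acc = overall_accuracy(y_true, y_pred)
errs = per_class_error(y_true, y_pred)
balanced = undersample(y_true, label_of=lambda r: r)

print(acc)                # high overall accuracy despite a useless model
print(errs)               # every "Big" claim is misclassified
print(Counter(balanced))  # equal class counts after undersampling
```

In this toy case the always-"Small" predictor looks accurate overall while missing every "Big" claim, which is why reporting a separate error rate per class, as the study does, gives a more honest comparison between models.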

Keywords

Information Science
