Predictive Data Mining Technique in Insurance: (The Case of Ethiopian Insurance Corporation)
Date
2005-06
Publisher
Addis Ababa University
Abstract
One of the important tasks we face in real-world applications is classifying
particular situations or events as belonging to a certain class. Risk
assessment in insurance policies is one of the many areas that use classification as
a problem-solving approach.

To solve problems whose solutions are categorical, we must build
classification models. Data mining techniques provide powerful tools for building such models
and thereby addressing these problems.

This research study addresses the issues, techniques, and feasibility of building and
deploying predictive model(s) that determine the risk exposure of individuals. The
study was based on personal accident claims data at the Ethiopian Insurance
Corporation.

Ethiopian Insurance Corporation classifies claims into two classes: Small claim and
Big claim. To meet the objectives of the research, 1,600 records, each
with 23 attributes, were collected. After preprocessing, the number of
records used for this study was reduced to 1,543. Of the 23 attributes,
6 were selected for final model building in consultation with insurance
experts.

A close examination of the distribution of the data reveals that the data has an
imbalanced distribution, which biases the accuracy of the model in favor of the
dominant class, in this case the "Small" class. To address this problem,
the dataset was balanced based on the "PAcc_Within" attribute, and the results show that
accuracy improved considerably.

In addition, the researcher found that the frequency of occurrence (class distribution)
of the values of an attribute has a great impact on the accuracy of the model.
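
The abstract does not describe how the balancing was carried out, so the following is only a minimal sketch of one common approach, random undersampling of the dominant class. It does not reproduce the "PAcc_Within"-based partitioning used in the study; the function name `undersample`, the field name `claim_class`, and the toy records are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def undersample(records, label_of, seed=0):
    """Randomly keep as many records per class as the smallest class has."""
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)
    target = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy records (not the thesis data): 90 "Small", 10 "Big".
records = [{"id": i, "claim_class": "Small"} for i in range(90)]
records += [{"id": i, "claim_class": "Big"} for i in range(90, 100)]

balanced = undersample(records, lambda r: r["claim_class"])
counts = Counter(r["claim_class"] for r in balanced)
print(counts["Small"], counts["Big"])
```

Undersampling discards data from the dominant class, which is a trade-off; oversampling the minority class is an alternative the abstract neither confirms nor rules out.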
Lastly, the economic impact of the models and separate accuracy measures
for each risk class were used to compare the models and select one for
deployment. Accordingly, a model built using the KnowledgeSEEKER
algorithm and balanced data partitioning based on the "PAcc_Within" attribute was
selected as the final model. This model has an overall accuracy of 96.61% and
classification errors of 0.29% and 27% for the Small and Big risk claim classes, respectively.
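
The gap between a high overall accuracy and a much larger error on the Big class is why a separate measure per class is useful. The sketch below shows how such measures could be computed from actual and predicted labels; the helper names and the toy labels are hypothetical and are not the thesis data or its results.

```python
from collections import Counter

def overall_accuracy(actual, predicted):
    """Fraction of all records whose predicted class matches the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def per_class_error(actual, predicted):
    """For each actual class, the fraction of its records that were misclassified."""
    totals = Counter(actual)
    wrong = Counter(a for a, p in zip(actual, predicted) if a != p)
    return {label: wrong[label] / totals[label] for label in totals}

# Hypothetical toy labels: 90 Small claims all predicted correctly,
# 10 Big claims of which 4 are wrongly predicted as Small.
actual = ["Small"] * 90 + ["Big"] * 10
predicted = ["Small"] * 90 + ["Small"] * 4 + ["Big"] * 6

accuracy = overall_accuracy(actual, predicted)
errors = per_class_error(actual, predicted)
print(accuracy, errors)
```

In this toy case the overall accuracy looks high even though a large share of the minority class is misclassified, which is the pattern a single accuracy figure would hide.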
Keywords
Information Science