Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/2834
Title: MINING INSURANCE DATA FOR FRAUD DETECTION: THE CASE OF AFRICA INSURANCE SHARE COMPANY
Contributors: Dr. Million Meshesha
TARIKU, ADANE
Keywords: Information science
Issue Date: 30-Jul-2012
Publisher: aau
Abstract: The insurance industry has historically been a growing industry. It plays an important role in ensuring the economic well-being of a country. But ever since its beginning as a commercial enterprise, the industry has faced difficulties with insurance fraud. Insurance fraud is very costly and has become a worldwide concern in recent years. Fraudulent claims account for a significant portion of all claims received by insurers and cost billions of dollars annually. Great efforts have recently been made to develop data mining models that identify potentially fraudulent claims for special investigation. This study was initiated to explore the applicability of data mining technology in developing models that can detect and predict suspected fraud in insurance claims, with particular emphasis on Africa Insurance Company. The research applies a clustering algorithm first, followed by classification techniques, to develop the predictive model. The K-Means clustering algorithm is employed to find the natural grouping of the insurance claims into fraud and non-fraud. The resulting clusters are then used for developing the classification model. The classification task is carried out using the J48 decision tree and Naïve Bayes algorithms in order to create the model that best classifies suspected fraudulent insurance claims. The experiments were conducted following the six-step Cios et al. (2000) process model. For the experiments, the collected insurance dataset was preprocessed to remove outliers, fill in missing values, select attributes, integrate data, and derive attributes. The preprocessing phase took the largest share of the study time. A total of 17810 insurance claim records are used for training the models, while a separate 2210 records are used for testing their performance.
The model developed using the J48 decision tree algorithm showed the highest classification accuracy, 99.96%. This model was then tested on the 2210-record test dataset and scored a prediction accuracy of 97.19%. The results of this study show that data mining techniques are valuable for insurance fraud detection. Future research directions are pointed out toward an applicable system in the area.
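The cluster-then-classify workflow described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the data are synthetic two-feature records invented for illustration, the function names (kmeans2, fit_gaussian_nb, predict) are hypothetical, the centroid initialisation is a simplification chosen to keep the toy example deterministic, and only a Gaussian Naive Bayes classifier is shown (the J48 decision tree is omitted).

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))


def kmeans2(points, iters=20):
    """Two-cluster k-means (Lloyd's algorithm).

    Assumption: centroids start at the first point and the point farthest
    from it, which keeps this toy example deterministic.
    """
    centroids = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(2):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, [nearest(p, centroids) for p in points]


def fit_gaussian_nb(points, labels):
    """Per-class prior, feature means and feature variances."""
    model = {}
    for c in set(labels):
        rows = [p for p, lab in zip(points, labels) if lab == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        # Small constant avoids a zero variance.
        varis = [sum((x - m) ** 2 for x in col) / len(rows) + 1e-9
                 for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(points), means, varis)
    return model


def predict(model, p):
    """Class with the highest log posterior under the Gaussian NB model."""
    def log_post(c):
        prior, means, varis = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(p, means, varis))
    return max(model, key=log_post)


# Synthetic stand-in for claim records: two numeric features per record,
# drawn from two well-separated groups (purely illustrative values).
rng = random.Random(0)
group_a = [(rng.uniform(0, 2), rng.uniform(0, 2)) for _ in range(200)]
group_b = [(rng.uniform(7, 9), rng.uniform(7, 9)) for _ in range(200)]
data = group_a + group_b
rng.shuffle(data)
train, test = data[:300], data[300:]

# Step 1: cluster the training records to obtain class labels.
centroids, train_labels = kmeans2(train)
# Step 2: train a classifier on the cluster-derived labels.
model = fit_gaussian_nb(train, train_labels)
# Step 3: score the classifier on the held-out records.
test_labels = [nearest(p, centroids) for p in test]
predicted = [predict(model, p) for p in test]
accuracy = sum(a == b for a, b in zip(predicted, test_labels)) / len(test)
print(f"held-out accuracy: {accuracy:.2%}")
```

Because the two synthetic groups are far apart, this toy setup separates them trivially; real claim data would need the preprocessing steps listed in the abstract and would not be expected to behave this cleanly.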
Description: A Thesis Submitted to the School of Graduate Studies of Addis Ababa University in Partial Fulfilment of the Requirements for the Degree of Master of Science in Information Science
URI: http://hdl.handle.net/123456789/2834
Appears in Collections:Thesis - Information Science

Files in This Item:
File: TARIKU ADANE.pdf
Size: 2.19 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.