Mining E-Filing Data For Predicting Fraud: The Case Of Ethiopian Revenue And Custom Authority
Date
2017-06-02
Authors
Publisher
Addis Ababa University
Abstract
Technology continues to advance, and these advances have side effects (loopholes) for a growing
economy and a nation's taxation system. Fraud is one of the risks in this digital tax environment.
Alongside technological advancement, a controlling and monitoring environment is therefore necessary.
In this study, experiments were conducted by strictly following the six-step process model of Cios et al.
(2000). The process starts from business understanding of the ERCA taxation system and fraud, specifically
on the e-filed data set. The data were taken from the ERCA database and understood with the help of domain
experts and the literature. In data preprocessing, inconsistencies, missing values, outliers and related
issues were handled. After that, models were constructed and their results analyzed to facilitate decision
making in business risk analysis.
A total of 2954 records were used to train the classifier models. Experiments were run with different
classification algorithms, including J48, random forest and multilayer perceptron. The results of the
various models were compared to find the best model, using 10-fold cross-validation and percentage split
(66%/34%) evaluation methods.
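The two evaluation methods named above can be illustrated with a minimal sketch. This is not the study's tooling: the function names `k_fold_indices` and `percentage_split`, the shuffling seed, and the unstratified fold assignment are all assumptions made for illustration. The sketch only shows how 2954 records would be partitioned under each scheme.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each record is tested exactly once.

    Hypothetical helper: folds are built by simple shuffling, without
    stratifying by class label.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def percentage_split(n, train_fraction=0.66, seed=0):
    """Return one (train, test) split, e.g. 66% training and 34% testing."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


if __name__ == "__main__":
    n_records = 2954  # record count reported in the abstract
    for fold_no, (train, test) in enumerate(k_fold_indices(n_records), 1):
        print(f"fold {fold_no}: train={len(train)} test={len(test)}")
    train, test = percentage_split(n_records)
    print(f"percentage split: train={len(train)} test={len(test)}")
```

Under cross-validation every record serves as a test case once, whereas the percentage split tests on only the held-out 34%.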
The study finds that the J48 classification algorithm achieves the best accuracy when cross-checked with
different testing mechanisms. J48 recorded an accuracy of 94.72%, with 2798 of 2954 test instances
correctly classified. Future research directions are also put forward toward an applicable system in the
area of the study.
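As a quick check of the reported figure, accuracy is the share of correctly classified instances. The sketch below recomputes it from the two counts given in the abstract; the function name `accuracy_percent` is a hypothetical helper, not part of the study.

```python
def accuracy_percent(correct, total):
    """Return classification accuracy as a percentage, rounded to 2 decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 2)


correct, total = 2798, 2954  # counts reported in the abstract
print(accuracy_percent(correct, total))  # 94.72
print(total - correct)                   # 156 misclassified instances
```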
Keywords
Data mining, e-filing, classification, J48, fraud