Predicting fertility rate in Ethiopia using data mining techniques

Date

2016-10

Publisher

A.A.U

Abstract

Introduction: Fertility rates are very high in Africa and some Arab countries, followed by the countries of Central and South America. Social factors that can influence fertility rates include race, level of education, religion, use of contraceptive methods, abortion, and the impact of immigration. Data mining is a collection of techniques for the efficient, automated discovery of previously unknown, valid, novel, useful, and understandable patterns in large databases.

Objective: The main objective of this study is to apply data mining to predict the fertility rate in Ethiopia, specifically at four research centers: Arbaminch DSS, Dabat DSS, Gilgel Gibe DSS, and Kilite Awelaelo DSS. The results can support policy makers, planners, and healthcare providers working on the control of the fertility rate in Ethiopia.

Methods and Material: The research followed the hybrid six-step Cios knowledge discovery process. The data were collected from a data warehouse built for this purpose, which stores data from the four research centers for the period 2007-2015. The researcher used two popular data mining algorithms, the J48 decision tree (an implementation of C4.5) and the Naïve Bayes classifier, to develop predictive models on a large dataset (68,033 cases). Both models were evaluated in two test modes, 10-fold cross-validation and a 90% percentage split, to compare their performance.

Results: The pruned J48 decision tree evaluated with 10-fold cross-validation was the best predictor, with 76.4% accuracy on the holdout dataset; the author reports this accuracy as better than any reported in the literature. The Naïve Bayes classifier with supervised discretization came second, with 69% accuracy.

Conclusion: The results support the use of data mining for predicting the fertility rate in Ethiopia. Future classification studies that use a larger HDSS dataset with epidemiological information and employ other classification algorithms, tools, and techniques could yield better results.
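The two test modes named in the abstract, 10-fold cross-validation and a 90% train/test split, can be illustrated with a minimal sketch. This is not the thesis's implementation: it uses a small hand-written categorical Naïve Bayes classifier in place of the tools the author used, and the attributes (`education`, `contraception`), the labelling rule, and all function names are hypothetical and generated synthetically, not drawn from the HDSS data.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Fit a categorical Naive Bayes model (counts only; smoothing is applied at prediction)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value -> count
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, y)][v] += 1
            values_seen[j].add(v)
    return class_counts, value_counts, values_seen


def predict_nb(model, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, v in enumerate(row):
            k = len(values_seen[j])
            score += math.log((value_counts[(j, c)][v] + 1) / (n_c + k))
        if score > best_score:
            best, best_score = c, score
    return best


def accuracy(train_idx, test_idx, rows, labels):
    """Train on train_idx, then return the fraction of test_idx predicted correctly."""
    model = train_nb([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    hits = sum(predict_nb(model, rows[i]) == labels[i] for i in test_idx)
    return hits / len(test_idx)


def k_fold_accuracy(rows, labels, k=10, seed=0):
    """Mean accuracy over k folds; each record is tested exactly once."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        scores.append(accuracy(train, folds[f], rows, labels))
    return sum(scores) / k


def split_accuracy(rows, labels, train_fraction=0.9, seed=0):
    """Accuracy on the held-out remainder after a single shuffled split."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    return accuracy(idx[:cut], idx[cut:], rows, labels)


def make_toy_data(n=500, seed=1):
    """Synthetic records with a made-up rule and 15% label noise (not HDSS data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        education = rng.choice(["none", "primary", "secondary"])
        contraception = rng.choice(["yes", "no"])
        high = education == "none" or contraception == "no"
        if rng.random() < 0.15:
            high = not high
        rows.append((education, contraception))
        labels.append("high" if high else "low")
    return rows, labels


rows, labels = make_toy_data()
cv_acc = k_fold_accuracy(rows, labels)
split_acc = split_accuracy(rows, labels)
print(f"10-fold CV accuracy: {cv_acc:.3f}")
print(f"90% split accuracy:  {split_acc:.3f}")
```

Cross-validation averages over ten held-out folds, so its estimate is usually more stable than the single 90% split, which here scores only 50 of the 500 toy records.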

Description

A Thesis Submitted to the School of Graduate Studies of Addis Ababa University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Information Science

Keywords

Data Mining Techniques, Fertility rate
