Predicting fertility rate in Ethiopia using data mining techniques
Date
2016-10
Publisher
A.A.U
Abstract
Introduction: Fertility rates are at very high levels in Africa and some Arab countries, followed
by the countries of Central and South America. Social factors that can influence fertility rates
include race, level of education, religion, use of contraceptive methods, abortion, and the impact
of immigration. Data mining is a collection of techniques for the efficient, automated discovery of
previously unknown, valid, novel, useful, and understandable patterns in large databases.
Objective: The main objective of this study is to apply data mining to predict the fertility rate in
Ethiopia, particularly for four research centers: Arbaminch DSS, Dabat DSS, Gilgel Gibe DSS, and
Kilite Awelaelo DSS. The resulting models can support policy makers, planners, and healthcare
providers working on the control of the fertility rate in Ethiopia.
Methods and Material: The methodology used for this research was the hybrid six-step Cios
Knowledge Discovery Process. The required data were collected from a data warehouse built for
this purpose, which stores data from the four research centers for the period 2007 - 2015. The
researcher used two popular data mining algorithms (the J48 decision tree, an implementation of
C4.5, and the Naïve Bayes classifier) to develop predictive models on a large dataset (68,033
cases). To compare the performance of the two models, the researcher evaluated each under two
test modes: 10-fold cross-validation and a 90% percentage split.
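The two test modes named above can be illustrated with a minimal Python sketch. It only shows how 68,033 cases would be partitioned under 10-fold cross-validation and under a 90% percentage split. It does not reproduce the tool, random seed, or stratification used in the thesis, and the function names (k_fold_indices, percentage_split) are hypothetical.

```python
import random


def k_fold_indices(n_cases, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: plain shuffled folds, no stratification by class.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Spread cases as evenly as possible: the first (n_cases % k) folds
    # each receive one extra case.
    fold_sizes = [n_cases // k + (1 if i < n_cases % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def percentage_split(n_cases, train_fraction=0.9, seed=0):
    """Return (train_idx, test_idx) for a single shuffled percentage split."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    cut = int(n_cases * train_fraction)
    return indices[:cut], indices[cut:]


N_CASES = 68033  # dataset size reported in the abstract

folds = list(k_fold_indices(N_CASES, k=10))
split_train, split_test = percentage_split(N_CASES, train_fraction=0.9)

for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"fold {fold_number}: train={len(train_idx)} test={len(test_idx)}")
print(f"90% split: train={len(split_train)} test={len(split_test)}")
```

In cross-validation every case is used for testing exactly once, and the reported accuracy is aggregated over the ten folds. The percentage split trains once on 90% of the cases and tests on the remaining 10%.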
Results: The results indicated that the decision tree (J48 algorithm), with pruning enabled under
the 10-fold cross-validation test mode, is the best predictor; it has 76.4% accuracy on the
holdout dataset (this predictive accuracy is better than any reported in the literature). The Naïve
Bayes classifier with supervised discretization came second, with 69% accuracy.
Conclusion: The results from this study confirmed the applicability of data mining for predicting
the fertility rate in Ethiopia. In the future, further classification studies that use larger HDSS
datasets with epidemiological information and employ other classification algorithms, tools, and
techniques could yield better results.
Description
A Thesis Submitted to the School of Graduate Studies of Addis Ababa University in Partial Fulfillment of
the Requirements for the Degree of Master of Science in Information Science
Keywords
Data Mining Techniques, Fertility rate