Application of Data Mining Technology to Predict Child Mortality Patterns: The Case of Butajira Rural Health Project (BRHP)
Date
2002-06
Publisher
Addis Ababa University
Abstract
Traditionally, very simple statistical techniques are used in the analysis of
epidemiological studies. The predominant technique is logistic regression, in which
the effects of predictors are linear. However, because of their simplicity, it is
difficult to use these models to discover unanticipated complex relationships, i.e.,
non-linearities in the effect of a predictor or interactions between predictors. In
particular, as the volume of data increases, the traditional methods become
inefficient and impractical. This in turn calls for the application of new methods and
tools that can help to search large quantities of epidemiological data and to discover
new patterns and relationships hidden in the data. Recently, to address the problem of
identifying useful information and knowledge to support primary health care prevention
and control activities, health care institutions have been employing the data mining
approach, which uses more flexible models, such as neural networks and decision trees,
to discover unanticipated features from large volumes of data stored in
epidemiological databases.

In the developed world in particular, data mining technology has enabled health care
institutions to identify previously unknown, actionable information in large health
care databases and to apply it to improve the quality and efficiency of primary health
care prevention and control activities. However, to the knowledge of the researcher,
no health care institution in Ethiopia has used this state-of-the-art technology to
support health care decision-making.

Thus, this research work investigated the potential applicability of data mining
technology to predict the risk of child mortality based on community-based
epidemiological datasets gathered by the BRHP epidemiological study.

The methodology used for this research had three basic steps: data collection, data
preparation, and model building and testing. The required data was selected and
extracted from the ten-year surveillance dataset of the BRHP epidemiological study.
Then, data preparation tasks (such as data transformation, derivation of new fields,
and handling of missing values) were undertaken. Neural network and decision tree data
mining techniques were employed to build and test the models. Models were built and
tested using a sample dataset of 1100 records covering both children who were alive
and children who had died.

Several neural network and decision tree models were built and tested for their
classification accuracy, and many models with encouraging results were obtained.
Judged by misclassification rate, the two data mining methods used in this research
work yielded comparable results that are sufficient for practical use. However, unlike
the neural network models, the decision tree approach produced simple rules that
nontechnical health care professionals can use to identify the cases to which each
rule applies.

In this research work, the researcher has demonstrated that an epidemiological
database can be successfully mined to identify public health and socio-demographic
determinants (risk factors) that are associated with infant and child mortality in
rural communities.
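
The decision tree step described above can be illustrated with a minimal sketch. It is
not the models built in this research: the records are synthetic, the field names
(factor_a, factor_b, factor_c), the Gini splitting criterion, the depth limit of 2, and
the 800/300 train/test split are all assumptions made for illustration. Only the sample
size of 1100 records comes from the abstract. The sketch shows how a small tree can pick
up an interaction between two predictors and how its branches read as simple IF/THEN
rules, alongside a misclassification rate on held-out records.

```python
import random

FEATURES = ["factor_a", "factor_b", "factor_c"]  # hypothetical binary predictors


def make_records(n, seed=0):
    """Generate synthetic records (NOT BRHP data) with an interaction effect."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        row = {f: rng.random() < 0.5 for f in FEATURES}
        # Assumed outcome model: risk is high only when factor_a and
        # factor_b are both absent, i.e. an interaction between predictors.
        risk = 0.6 if not row["factor_a"] and not row["factor_b"] else 0.1
        row["died"] = rng.random() < risk
        records.append(row)
    return records


def gini(rows):
    """Gini impurity of the binary outcome in a non-empty or empty group."""
    if not rows:
        return 0.0
    p = sum(r["died"] for r in rows) / len(rows)
    return 2 * p * (1 - p)


def build_tree(rows, features, depth=2):
    """Return a leaf (bool) or a node (feature, yes_subtree, no_subtree)."""
    majority = sum(r["died"] for r in rows) * 2 >= len(rows)
    if depth == 0 or not features or gini(rows) == 0.0:
        return majority

    def weighted_impurity(f):
        yes = [r for r in rows if r[f]]
        no = [r for r in rows if not r[f]]
        return (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)

    best = min(features, key=weighted_impurity)
    yes = [r for r in rows if r[best]]
    no = [r for r in rows if not r[best]]
    if not yes or not no:  # the split does not separate the records
        return majority
    rest = [f for f in features if f != best]
    return (best, build_tree(yes, rest, depth - 1), build_tree(no, rest, depth - 1))


def predict(tree, row):
    """Follow the branches until a leaf (True = died, False = alive)."""
    while isinstance(tree, tuple):
        feature, yes_branch, no_branch = tree
        tree = yes_branch if row[feature] else no_branch
    return tree


def rules(tree, path=()):
    """Flatten the tree into human-readable IF/THEN rules, one per leaf."""
    if not isinstance(tree, tuple):
        condition = " AND ".join(path) or "ALWAYS"
        return [f"IF {condition} THEN {'died' if tree else 'alive'}"]
    feature, yes_branch, no_branch = tree
    return (rules(yes_branch, path + (f"{feature}=yes",))
            + rules(no_branch, path + (f"{feature}=no",)))


def misclassification_rate(tree, rows):
    """Fraction of rows whose predicted outcome differs from the recorded one."""
    return sum(predict(tree, r) != r["died"] for r in rows) / len(rows)


if __name__ == "__main__":
    records = make_records(1100)
    train, test = records[:800], records[800:]
    tree = build_tree(train, FEATURES)
    for rule in rules(tree):
        print(rule)
    print("held-out misclassification rate:", misclassification_rate(tree, test))
```

Each printed rule corresponds to one path from the root of the tree to a leaf, which is
what makes this kind of model readable without knowledge of how it was fitted. A neural
network trained on the same records could be compared on the same held-out
misclassification rate, but it would not yield rules of this form.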
Keywords
Information Science