Applicability of Data Mining Techniques to Support Voluntary Counseling and Testing (VCT) for HIV: The Case of Center for Disease Control and Prevention (CDC)
Date
2009-01
Authors
Publisher
Addis Ababa University
Abstract
Data mining is emerging as an important tool in many areas of research and industry.
Companies and organizations are increasingly interested in applying data mining tools to
increase the value added by their data collection systems. Nowhere is this potential more
important than in the healthcare industry. As medical records systems become more
standardized and commonplace, data quantity increases, with much of it going
unanalyzed. Data mining can begin to turn some of this data into tools that help
health organizations organize data and make decisions.
Data related to HIV/AIDS are available in VCT centers. A major objective of this thesis
is to evaluate the potential applicability of data mining techniques in VCT, with the aim
of developing a model that could help make informed decisions. The research uses datasets
collected from OSSA, which is supported by CDC, and follows CRISP-DM as the knowledge
discovery process model. Its findings are presented in graphs and tables.
For the clustering task, the K-means and EM algorithms were tested using WEKA.
The clusters generated by EM were appropriate for the problem at hand, grouping similar
clients together. The results of these experiments made it possible to identify similar groups
among VCT clients. Gender, marital status, HIV test result, and education
showed patterns.
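As a rough illustration of the K-means step mentioned above, the following is a minimal pure-Python sketch. It is not the WEKA implementation used in the thesis: the function name `kmeans`, the choice of the first k points as starting centroids, and the toy records are all assumptions made here for illustration, and the values are invented rather than taken from the VCT data.

```python
# Minimal k-means sketch in pure Python (illustrative only).
# The records below are invented toy values, not data from the study.
from math import dist

def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups.

    Assumption: centroids start as the first k points so the run is
    reproducible; WEKA chooses its starting centroids differently.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return assignments, centroids

# Hypothetical client records: two attributes, each scaled to the 0-1 range.
toy_clients = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.18),
               (0.80, 0.90), (0.85, 0.88), (0.78, 0.95)]
labels, centers = kmeans(toy_clients, k=2)
```

On this toy input the first three records end up in one cluster and the last three in the other. Categorical attributes such as gender or marital status would need a numeric encoding before a distance-based method like this could be applied.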
For the classification task, decision tree (J48 and Random Tree) and neural network
(ANN) classifiers were evaluated. Although the ANN showed better accuracy than the decision tree
classifiers, the decision tree (J48) is appropriate for the datasets at hand and was used to build
the classification model. Finally, cluster-derived classification models were tested for their
cross-validation accuracy and compared with the non-cluster-generated classification model.
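The cross-validation accuracy mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a one-level decision stump stands in for J48, the helper names `train_stump` and `cross_val_accuracy` are hypothetical, the fold assignment is a simple round-robin, and the rows are invented toy records with neutral labels, not results from the thesis.

```python
# k-fold cross-validation sketch in pure Python (illustrative only).
# A one-level decision stump stands in for J48; the data are invented.
from collections import Counter, defaultdict

def train_stump(rows, attr):
    """Map each value of `attr` to the majority label seen in training."""
    by_value = defaultdict(Counter)
    for features, label in rows:
        by_value[features[attr]][label] += 1
    # Fallback for attribute values never seen during training.
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    rules = {v: counts.most_common(1)[0][0] for v, counts in by_value.items()}
    return lambda features: rules.get(features[attr], default)

def cross_val_accuracy(rows, attr, folds):
    """Mean accuracy over `folds` folds; fold i holds out rows i, i+folds, ..."""
    scores = []
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        predict = train_stump(train, attr)
        correct = sum(predict(f) == y for f, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds

# Hypothetical records: one categorical attribute and a neutral class label.
toy_rows = [
    ({"marital_status": "single" if i % 2 == 0 else "married"},
     "group_a" if i % 2 == 0 else "group_b")
    for i in range(12)
]
# One deliberately noisy record, so accuracy falls below 1.0.
toy_rows[11] = ({"marital_status": "married"}, "group_a")

accuracy = cross_val_accuracy(toy_rows, "marital_status", folds=3)
```

Running the same procedure on two models, for example one trained on cluster-derived labels and one trained on the original labels, gives the kind of accuracy comparison the abstract describes.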
The outcomes of this research will serve users in the domain area, decision makers, and
planners of HIV intervention programs such as CDC and MOH.