Applicability of Data Mining Techniques to Support Voluntary Counseling and Testing (VCT) for HIV: The Case of Center for Disease Control and Prevention (CDC)
Date
2009-01
Publisher
Addis Ababa University
Abstract
Data mining is emerging as an important tool in many areas of research and industry. Companies and organizations are increasingly interested in applying data mining tools to increase the value of their data collection systems. Nowhere is this potential more important than in the healthcare industry. As medical record systems become more standardized and commonplace, the quantity of data grows, and much of it goes unanalyzed. Data mining can turn some of this data into tools that help health organizations organize data and make decisions.
Data related to HIV/AIDS are available in VCT centers. A major objective of this thesis is to evaluate the potential applicability of data mining techniques in VCT, with the aim of developing a model that could help make informed decisions. The research uses a dataset collected from OSSA, which is supported by CDC, and follows CRISP-DM as the knowledge discovery process model. Findings are presented in graphs and tables.
For the clustering task, the K-means and EM algorithms were tested using WEKA. The clusters generated by EM were appropriate for the problem at hand, grouping similar clients together. The results of these experiments made it possible to identify similar groups among VCT clients. Gender, marital status, HIV test result, and education showed patterns.
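As an illustration of the clustering step, the following is a minimal sketch of K-means over one-hot encoded categorical records, written in plain Python. It is not the WEKA configuration used in the thesis, and EM is not shown. The field names, the toy records, and the helper names (`one_hot`, `kmeans`) are assumptions made up for this example and do not come from the OSSA dataset.

```python
import random

def one_hot(records, fields):
    """Encode categorical records as 0/1 vectors, one column per (field, value)."""
    columns = sorted({(f, r[f]) for r in records for f in fields})
    vectors = [[1.0 if r[f] == v else 0.0 for f, v in columns] for r in records]
    return vectors, columns

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=50, seed=0):
    """Plain K-means: assign each vector to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_assignment == assignment:  # converged: no record changed cluster
            break
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical toy records (invented for illustration, not thesis data).
FIELDS = ["gender", "marital_status", "education"]
records = [
    {"gender": "F", "marital_status": "single", "education": "primary"},
    {"gender": "F", "marital_status": "single", "education": "primary"},
    {"gender": "M", "marital_status": "married", "education": "secondary"},
    {"gender": "M", "marital_status": "married", "education": "secondary"},
    {"gender": "F", "marital_status": "married", "education": "secondary"},
    {"gender": "M", "marital_status": "single", "education": "primary"},
]

vectors, columns = one_hot(records, FIELDS)
assignment, centroids = kmeans(vectors, k=2)
for record, cluster in zip(records, assignment):
    print(cluster, record)
```

Each centroid coordinate can be read as the share of cluster members with that attribute value, which is one simple way to describe the groups a clustering run produces.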
For the classification task, decision tree (J48 and Random Tree) and neural network (ANN) classifiers were evaluated. Although the ANN showed better accuracy than the decision tree classifiers, the J48 decision tree was judged appropriate for the dataset at hand and was used to build the classification model. Finally, cluster-derived classification models were tested for their cross-validation accuracy and compared with a non-cluster-generated classification model.
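To make the evaluation idea concrete, the sketch below computes the gain ratio that C4.5 (the algorithm J48 implements in WEKA) uses to choose split attributes, and estimates k-fold cross-validation accuracy. As a deliberate simplification, the classifier here is a one-level decision stump rather than a full pruned J48 tree. The rows, the attribute names, and the `cluster` labels are hypothetical toy values standing in for a cluster-derived class attribute; they are not results from the thesis.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([r[target] for r in rows]) - remainder
    split_info = entropy([r[attribute] for r in rows])
    return info_gain / split_info if split_info > 0 else 0.0

def train_stump(rows, attributes, target):
    """One-level tree: split on the best attribute, predict the majority class per value."""
    best = max(attributes, key=lambda a: gain_ratio(rows, a, target))
    default = Counter(r[target] for r in rows).most_common(1)[0][0]
    by_value = defaultdict(Counter)
    for r in rows:
        by_value[r[best]][r[target]] += 1
    rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    return best, rules, default

def predict(stump, row):
    attribute, rules, default = stump
    return rules.get(row[attribute], default)  # unseen value: fall back to majority class

def cross_val_accuracy(rows, attributes, target, folds):
    """Hold out every `folds`-th row in turn, train on the rest, and pool the hits."""
    correct = 0
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        stump = train_stump(train, attributes, target)
        correct += sum(predict(stump, r) == r[target] for r in test)
    return correct / len(rows)

# Hypothetical toy rows; "cluster" plays the role of a cluster-derived class label.
ATTRIBUTES = ["gender", "marital_status", "education"]
raw = [
    ("F", "single", "primary", "c0"), ("M", "single", "secondary", "c0"),
    ("F", "married", "primary", "c1"), ("M", "married", "secondary", "c1"),
    ("F", "single", "secondary", "c0"), ("M", "married", "primary", "c1"),
    ("F", "married", "secondary", "c1"), ("M", "single", "primary", "c0"),
]
rows = [dict(zip(ATTRIBUTES + ["cluster"], r)) for r in raw]

best_attribute = train_stump(rows, ATTRIBUTES, "cluster")[0]
accuracy = cross_val_accuracy(rows, ATTRIBUTES, "cluster", folds=4)
print(best_attribute, accuracy)
```

The same cross-validation routine can be run once with cluster labels as the target and once with an original class attribute, which mirrors the comparison between cluster-derived and non-cluster-generated models.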
The outcomes of this research will serve users in the domain area, as well as decision makers and planners of HIV intervention programs, such as CDC and MOH.
Keywords
Testing (VCT) for HIV