Predicting HIV Infection Risk Factor Using Voluntary Counseling and Testing Data: a Case of African Aids Initiative International
Date
2012-06
Publisher
Addis Ababa University
Abstract
Despite a great deal of effort, the world still has neither a cure nor a vaccine for HIV/AIDS
infection. Millions of people suffer from this incurable disease. Fortunately,
researchers have succeeded in prolonging and improving the quality of life of those
infected with HIV. Nonetheless, it has become increasingly clear that the focus should be on
preventing the transmission and acquisition of HIV by educating people to bring about
behavioral change.
The widely and freely available voluntary counseling and testing (VCT) center in Addis Ababa
plays an enormous role in counseling, promotion, and checking clients' HIV status
through clinical laboratory tests. In line with this, the center has been confidentially
collecting clients' records for further investigation. The records consist of many
attributes that may have a direct or indirect bearing on HIV infection. Moreover, identifying HIV
infection risk factors, or determinant variables, provides benefits at different levels of society
(individual, community, and organizational).
These benefits are not yet known to clients; rather, the organization simply keeps their records
after they are tested. To this end, considerable effort has gone into developing models that
identify HIV infection risk factors using data mining technology.
This research was initiated to identify the determinant risk factors of HIV infection by developing
predictive models to support the voluntary counseling and testing service that African AIDS
Initiative International (AAII) provides at Addis Ababa University and its surroundings.
A six-step hybrid methodology was followed to build predictive models of HIV infection risk
factors from selected attributes. Three classification techniques, namely the J48 decision tree,
PART, and SMO algorithms, were used in experiments to build and evaluate the models.
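As an illustration of how a decision tree learner of the J48 family chooses an attribute to split on, the following is a minimal sketch of the gain ratio criterion used by C4.5, the algorithm that J48 implements. It is not the study's code: the function names, the attribute names (`prior_test`, `sex`, `result`), and the four toy records are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` with respect to `target`,
    divided by the split information of `attribute`."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted entropy that remains after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return 0.0 if split_info == 0 else gain / split_info


# Hypothetical toy records, not taken from the VCT dataset.
records = [
    {"prior_test": "no", "sex": "F", "result": "positive"},
    {"prior_test": "no", "sex": "M", "result": "positive"},
    {"prior_test": "yes", "sex": "F", "result": "negative"},
    {"prior_test": "yes", "sex": "M", "result": "negative"},
]

for name in ("prior_test", "sex"):
    print(name, gain_ratio(records, name, "result"))
```

In this toy data the first attribute separates the two classes perfectly and the second carries no information, so a tree builder using this criterion would split on the first one.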
Before experimentation, data preprocessing was performed to remove outliers, fill in
missing values, select the best attributes, and discretize and transform the data. The
preprocessing phase took a considerable share of the time spent on this work. A total of 15,396
VCT client records were used for training the models, while a separate 3,000 records were used
for testing their performance. The model developed using the PART algorithm showed the best
classification accuracy, 96.7%. Evaluated on the testing dataset, this model scored a prediction
accuracy of 95.8%. The results of this study show that data mining techniques are
valuable for predicting HIV infection risk factors. Finally, future research directions are
proposed for arriving at applicable solutions in the area of study.
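The preprocessing and evaluation steps described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the procedure used in the study: filling missing values with the most frequent value and equal-width binning are common choices assumed here, and all function names and sample values are hypothetical.

```python
from collections import Counter


def fill_missing_with_mode(values, missing=None):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in values if v is not missing]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is missing else v for v in values]


def equal_width_bins(values, n_bins):
    """Discretize numeric values into bin indices 0..n_bins-1."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1  # fall back to 1 if all values are equal
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical sample values, not taken from the VCT dataset.
ages = fill_missing_with_mode([18, 25, None, 25, 40])
print(ages)                       # [18, 25, 25, 25, 40]
print(equal_width_bins(ages, 2))  # [0, 0, 0, 0, 1]
print(accuracy(["neg", "neg", "pos", "neg"], ["neg", "pos", "pos", "neg"]))  # 0.75
```

Read this way, a prediction accuracy of 95.8% on 3,000 held-out records corresponds to roughly 2,874 correctly classified records.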
Keywords
Data Mining Technology