Predicting HIV Infection Risk Factor Using Voluntary Counseling and Testing Data: a Case of African Aids Initiative International

Aweke, Girma


Addis Ababa University


Despite considerable effort, the world still has neither a cure nor a vaccine for HIV/AIDS, and millions of people continue to suffer from this incurable disease. Researchers have succeeded in prolonging the lives, and improving the quality of life, of those infected with HIV. Nonetheless, it has become increasingly clear that the focus should be on preventing the transmission and acquisition of HIV by educating people to bring about behavioral change. The widely and freely available voluntary counseling and testing (VCT) center in Addis Ababa plays a major role in counseling, promotion, and checking clients' HIV status through clinical laboratory tests. In line with this, the center has been collecting clients' records, held in confidence, for further investigation. Each record consists of many attributes that may have a direct or indirect relationship with HIV infection. Identifying HIV infection risk factors, or determinant variables, provides benefits at different levels of society (individual, community, and organizational). These benefits are not yet known to clients; rather, the organization keeps their records after they are tested. To this end, considerable effort has been made to develop models that identify HIV infection risk factors using data mining technology. This research was initiated to identify the determinant risk factors of HIV infection by developing predictive models to support the voluntary counseling and testing service that African AIDS Initiative International (AAII) provides at Addis Ababa University and its surroundings. A six-step hybrid methodology was followed to model HIV infection risk factors among the selected attributes. Three classification algorithms, the J48 decision tree, PART, and SMO, were used in experiments to build and evaluate the models.
Before experimentation, the data were pre-processed: outliers were removed, missing values were filled in, the best attributes were selected, and the data were discretized and transformed. Pre-processing took a considerable share of the time spent on this work. A total of 15,396 VCT client records were used to train the models, and a separate 3,000 records were used to test their performance. The model developed with the PART algorithm showed the best classification accuracy, 96.7%. Evaluated on the testing dataset, this model scored a prediction accuracy of 95.8%. The results of this study show that data mining techniques are valuable for predicting HIV infection risk factors. Finally, directions for future research are proposed with a view to arriving at applicable solutions in the area of study.
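
As a rough illustration of two of the pre-processing steps named above (filling in missing values and discretization) and of how a classification accuracy figure is computed, the following minimal Python sketch may help. It is not the study's tooling or code: the attribute names, the toy records, the choice of mode imputation, and the choice of equal-width binning are all assumptions made for illustration, since the abstract does not say which specific methods were applied.

```python
from collections import Counter

# Hypothetical toy records; the attribute names and values are illustrative
# assumptions, not fields taken from the AAII VCT dataset.
records = [
    {"age": 19, "marital_status": "single", "hiv_status": "negative"},
    {"age": 34, "marital_status": None, "hiv_status": "positive"},
    {"age": 27, "marital_status": "married", "hiv_status": "negative"},
    {"age": 45, "marital_status": "married", "hiv_status": "positive"},
]


def fill_missing_with_mode(rows, attribute):
    """Replace None in a nominal attribute with its most frequent value.

    Mode imputation is an assumed strategy; the abstract only says that
    missing values were filled in.
    """
    observed = [r[attribute] for r in rows if r[attribute] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [
        {**r, attribute: mode if r[attribute] is None else r[attribute]}
        for r in rows
    ]


def discretize_equal_width(rows, attribute, bins):
    """Map a numeric attribute to bin indices 0..bins-1 of equal width.

    Equal-width binning is an assumed strategy for the discretization step.
    """
    values = [r[attribute] for r in rows]
    low, high = min(values), max(values)
    width = (high - low) / bins or 1  # fall back to 1 if all values are equal
    return [
        {**r, attribute: min(int((r[attribute] - low) / width), bins - 1)}
        for r in rows
    ]


def accuracy(actual, predicted):
    """Fraction of records whose predicted class matches the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


cleaned = fill_missing_with_mode(records, "marital_status")
binned = discretize_equal_width(cleaned, "age", bins=3)
```

Under this definition of accuracy, the reported 95.8% on 3,000 test records would correspond to roughly 2,874 correctly classified records.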



Data Mining Technology