Mining Patients’ Data for Effective Tuberculosis Diagnosis: The Case of Menelik II Hospital
Date
2012-06
Authors
Publisher
Addis Ababa University
Abstract
Tuberculosis (TB) is a common and deadly infectious disease that can occur at any age. The incidence of TB has more than doubled in Africa during the last two decades, and Ethiopia ranks 8th among the 22 countries with the highest TB burden in the world.
Meanwhile, advances in computing and information storage have made vast amounts of data available. The challenge has been to extract knowledge from this raw data, which has led to new methods and techniques, such as data mining, that can bridge the knowledge gap.
Data mining can be used to model health care problems. This research aimed to apply data mining techniques to patients’ data to establish meaningful relationships or patterns for effective TB diagnosis. In total, 7,069 records were extracted from Menelik II Hospital; the data set contains detailed patient information. The research examines whether patients’ data can be classified using various data mining techniques for predictive purposes.
The research specifically looks at the use of a clustering algorithm followed by classification as a data mining approach to help identify patient patterns and speed up the TB diagnosis process. The study applied k-means clustering with some enhancements to segment the dataset into TB-positive and TB-negative groups. The resulting clusters are then used to develop the classification model. Classification was employed to identify patterns and predict the occurrence of TB. The classification task was carried out using the J48 decision tree and Naïve Bayes classification algorithms, and different experiments were conducted. The model developed for prediction has an accuracy of 85.93%. The knowledge discovered from the J48 decision tree is presented by traversing the tree, for ease of understanding.
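The cluster-then-classify idea described above can be illustrated with a minimal sketch. It is not the study's implementation: the data below is synthetic (two made-up numeric features, not the hospital records), the clustering is plain k-means without the unspecified enhancements the abstract mentions, and a small Gaussian Naïve Bayes stands in for the classifiers. All function names (`kmeans`, `fit_gaussian_nb`, `predict_gaussian_nb`, `make_synthetic`, `run_pipeline`) are hypothetical and chosen only for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (no enhancements); returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def fit_gaussian_nb(X, y):
    """Estimate a class prior plus per-feature mean and variance for each class."""
    model = {}
    for cls in set(y):
        rows = [x for x, lab in zip(X, y) if lab == cls]
        cols = list(zip(*rows))
        means = [sum(col) / len(rows) for col in cols]
        # Small constant avoids division by zero for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
            for col, m in zip(cols, means)
        ]
        model[cls] = (len(rows) / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def make_synthetic(n_per_group=100, seed=42):
    """Synthetic stand-in data: two well-separated groups, two numeric features."""
    rng = random.Random(seed)
    group_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_per_group)]
    group_b = [(rng.gauss(6.0, 1.0), rng.gauss(6.0, 1.0)) for _ in range(n_per_group)]
    data = group_a + group_b
    rng.shuffle(data)
    return data


def run_pipeline():
    data = make_synthetic()
    # Step 1: segment the rows into two clusters.
    _, cluster_labels = kmeans(data, k=2)
    # Step 2: treat cluster ids as class labels and train on 70% of the rows.
    split = int(0.7 * len(data))
    model = fit_gaussian_nb(data[:split], cluster_labels[:split])
    # Step 3: measure agreement with the cluster labels on the held-out 30%.
    predictions = [predict_gaussian_nb(model, x) for x in data[split:]]
    correct = sum(p == t for p, t in zip(predictions, cluster_labels[split:]))
    return correct / len(predictions)


accuracy = run_pipeline()
print(f"agreement with cluster labels on held-out rows: {accuracy:.2%}")
```

Note that the score printed here measures only how well the classifier reproduces the cluster assignments on synthetic data; it says nothing about diagnostic accuracy and is not comparable to the 85.93% reported in the abstract.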
The outcome of the research can have many benefits for the organization, especially for TB diagnosis activities.
Keywords
Data for Effective Tuberculosis Diagnosis