Health Informatics
Browsing Health Informatics by Title
Acceptance and Use of E-Library Services in Ethiopian Universities: The Case of Addis Ababa and Adama Universities (Addis Ababa University, 2010-07) Ali, Abinew; Beshah, Tibebe (PhD)
The introduction of new information and communication technologies in libraries during the last two decades has fundamentally changed the concept of the library: patrons no longer need to visit the library physically, since library services and resources are no longer confined to the walls of the physical library. However, for these e-library services to be utilized effectively and efficiently, library end users must accept and use them. This study was conducted to empirically investigate the determinants of e-library end users' acceptance and use in academic libraries within the Ethiopian context. The study applied the SO-UTAUT technology acceptance model, which is appropriate in a library context. A cross-sectional survey research method was used to capture the data. A questionnaire survey was employed to collect data from postgraduate students and academic staff at both Adama University and Addis Ababa University. SPSS and the beta version of the PLS-Graph software were used to analyze the data, applying descriptive statistics and Structural Equation Modeling techniques using PLS. The study found performance expectancy to be a major determinant, making the largest contribution (36.2%) to behavioral intention to use e-library services. Moreover, behavioral intention was shown to be the core determinant (40.2% contribution) of actual usage behavior. Awareness demonstrated a significant moderating effect on the relevancy and facilitating condition constructs.
The SO-UTAUT model proved to be valid in the Ethiopian context, since it can explain 22.2% of the variance in behavioral intention, 29.9% of the variance in behavioral usage and 52.2% of the variance in expected benefits of e-library services in the users’ acceptance and use behaviors of the services.

Adoption of Electronic Medical Records among Health Professionals at Public Hospitals in Addis Ababa City Administration Health Bureau, Ethiopia (Addis Ababa University, 2012-12) Gebremariam Semere; Lamenew Workshet (PhD); Deyassa Negussie (PhD)
INTRODUCTION: Wellness and health are central to the lives of people of all age groups. Incorporating information and communication technologies such as Electronic Medical Records (EMR) into the health care industry is essential for improving patient care and safety, for integrated research, and for effective planning, monitoring and evaluation of disease. Electronic Medical Record implementation in public hospitals in Addis Ababa is at an early stage, no more than three years since its inception. Although most of the public hospitals have implemented it, adoption varies among health professionals and the system is not utilized as needed because of various factors. Identifying the factors that affect adoption will therefore help in applying proactive and corrective measures to increase the adoption of EMR among health professionals who work at the public hospitals. OBJECTIVE: This study aimed at identifying the factors that affect the behavioral intention and usage behavior of Electronic Medical Records and at determining the utilization status among health professionals working in public hospitals in the Addis Ababa City Administration Health Bureau. METHOD: A cross-sectional survey was carried out among health professionals working at public hospitals in Addis Ababa using a modified Unified Theory of Acceptance and Use of Technology (UTAUT) model.
Four hundred eight health professionals who had received training on EMR were interviewed at the five public hospitals. RESULTS: The utilization of EMR among health professionals working at the public hospitals was 51.7%. Performance expectancy, effort expectancy and social influence were factors influencing the behavioral intention of health professionals to adopt EMR, and behavioral intention was also a significant influencing factor on actual usage behavior. Facilitating conditions had no significant effect on the actual usage behavior of EMR among health professionals. CONCLUSION AND RECOMMENDATION: The utilization rate of EMR was 51.7%. Lack of experience, misunderstanding of the relative advantage, perceived complexity of the system, inadequate support from top managers, and low behavioral intent were factors associated with the behavioral intention and actual usage of EMR. This study indicates the necessity of integrating the health management information system with daily health care activities and of developing a health information policy that can scale up the utilization rate.

Afaan Oromo-Amharic Cross Lingual Information Retrieval: A Corpus Based Approach (Addis Ababa University, 2013-06) Nigussie, Eyob; Dereje, Teferi (PhD)
Ethiopia is a multilingual country with over 80 distinct languages and a population of more than 73.9 million, as authorities estimated on the basis of the 2007 census (Bloor, 1995). In multilingual countries like Ethiopia it is not uncommon to encounter language barriers while seeking information in a language other than one's mother tongue. Afaan Oromo (also known as 'Oromiffa') is one of the languages that are widely used and spoken in Ethiopia by the Oromo people, who account for up to 36.7% of the total population (Commission, 2008). Currently Afaan Oromo is an official language of Oromia regional state. On the other hand, the current official language of the Federal Democratic Republic of Ethiopia is Amharic.
However, there are people who are not fluent enough to create Amharic query terms but need Amharic documents for different reasons. An IR system capable of breaking the language barrier in information retrieval would clearly be helpful for such users. This study is therefore aimed at designing and developing a corpus-based Afaan Oromo-Amharic cross lingual information retrieval system so as to enable Afaan Oromo speakers to retrieve Amharic information using Afaan Oromo queries. The approach followed in the study is corpus based, specifically using a parallel corpus. For this study, parallel documents including news articles, the Bible, legal documents and proclamations from the customs authority were used. The system is tested with 50 queries and 50 randomly selected documents. Two experiments were conducted, the first allowing only one possible translation for each Afaan Oromo query term and the second allowing all possible translations. The retrieval effectiveness of the system is measured using recall and precision for both monolingual and bilingual runs. The first experiment returned a maximum average precision of 0.81 and 0.45 for the monolingual (Afaan Oromo queries) and bilingual (translated Amharic queries) runs, respectively. The second experiment showed better recall and precision than the first: it obtained a maximum average precision of 0.60 for the bilingual run, while the result for the monolingual run remained the same. From these results, it can be concluded that cross lingual information retrieval for two local languages, namely Afaan Oromo and Amharic, can be developed and that the performance of the retrieval system could be increased with the use of larger and cleaner corpora.
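The recall and precision measures used to evaluate the runs above can be illustrated with a minimal Python sketch. The abstract does not say exactly how "maximum average precision" was computed, so the `average_precision` function below uses one common definition and is an assumption; the document identifiers and function names are hypothetical and are not taken from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values taken at each rank where a relevant
    document appears, divided by the number of relevant documents
    (one common definition; assumed here, not confirmed by the source)."""
    relevant_set = set(relevant)
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


if __name__ == "__main__":
    # Hypothetical ranked result list and relevance judgments for one query.
    ranked = ["d3", "d1", "d7", "d2"]
    relevant = {"d1", "d2", "d5"}
    print(precision_recall(ranked, relevant))
    print(average_precision(ranked, relevant))
```

In a bilingual run the same functions would be applied to the documents retrieved with the translated query, which makes the monolingual and bilingual scores directly comparable.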
Key Words: Afaan Oromo-Amharic Cross-Lingual Information Retrieval, Information Retrieval, Afaan Oromo, Amharic

Afaan Oromo-English Cross-Lingual Information Retrieval (CLIR): A Corpus Based Approach (Addis Ababa University, 2011-06) Bekele, Daniel; Teferi, Dereje (PhD)
The goal of Cross Language Information Retrieval (CLIR) is to provide users with access to information that is in a different language from their queries. A CLIR system allows a user to issue a query in one language and retrieve documents in another. This is achieved by designing a system where a query in one language can be compared with documents in another. Afaan Oromo is one of the major languages that are widely spoken and used in Ethiopia. Despite the fact that Afaan Oromo has a large number of speakers, little research effort has gone into making English documents available to Afaan Oromo speakers. This study is, therefore, an attempt to develop an Afaan Oromo-English CLIR system which enables Afaan Oromo native speakers to access and retrieve the vast online information sources that are available in English by writing queries in their own (native) language. This study describes the development of a corpus-based CLIR system which makes use of word-based query translation for the Afaan Oromo-English language pair, and the evaluation of the system on a corpus of test documents and queries prepared for this purpose. This approach requires the availability of parallel documents; hence such documents are collected from Bible chapters, legal documents and some available religious documents. The system is evaluated with both monolingual and bilingual retrieval. In the monolingual run, the Afaan Oromo queries are given to the system and Afaan Oromo documents are retrieved, while in the bilingual run the Afaan Oromo queries are given to the system after being translated into English to retrieve English documents.
For the bilingual run, translation of Afaan Oromo queries into their English equivalents is done using a bilingual dictionary constructed from the collected parallel corpora. The performance of the system was measured by recall and precision. In the first phase of the experimentation, maximum average precision values of 0.421 and 0.304 are obtained for the Afaan Oromo and English documents respectively. The second phase of experimentation performs slightly better than the first: maximum average precision values of 0.468 and 0.316 are obtained for the Afaan Oromo and English documents respectively. Therefore, with the use of large and cleaned parallel Afaan Oromo-English document collections, it is possible to develop CLIR for the language pair.

Afan Oromo news text summarizer (Addis Ababa University, 2012-06) Debele, Girma; Yifiru, Martha (PhD)
Information overload is a global problem that requires a solution. Automatic text summarization is one of the natural language processing technologies that have attracted researchers' attention as a way to help information users. A summarizer is a computer program that summarizes a text: it removes redundant information from the input text and produces a shorter, non-redundant output text. In this study, a generic automatic text summarizer for Afan Oromo news text has been developed based upon the Open Text Summarizer (OTS). OTS summarizes texts in English, German, Spanish, Russian, Hebrew, Esperanto and other languages. For this master’s thesis, most of the work done is customizing the OTS code so that it can make use of the Afan Oromo lexicons and work for the Afan Oromo language. The summarizer basically uses a combination of term frequency and sentence position methods with language-specific lexicons in order to identify the most important sentences for an extractive summary. In this study we have developed three methods for Afan Oromo news text summarization and tested their performance both objectively and subjectively.
These three summarizers are: M1, which uses term frequency and position methods without the Afan Oromo stemmer and other lexicons (synonyms and abbreviations); M2, which combines term frequency and position methods with the Afan Oromo stemmer and language-specific lexicons (synonyms and abbreviations); and M3, which uses an improved position method and term frequency as well as the stemmer and language-specific lexicons (synonyms and abbreviations). The performance of the summarizers was measured with subjective as well as objective evaluation methods. The objective evaluation shows that the three summarizers M1, M2 and M3 registered f-measure values of 34%, 47% and 81% respectively, i.e. M3 outperformed the two other summarizers (M1 and M2) by 47 and 34 percentage points. Moreover, the subjective evaluation shows that the three summarizers’ (M1, M2 and M3) performances on informativeness, linguistic quality, and coherence and structure are (34.37%, 37% and 62.5%), (59.37%, 60% and 65%) and (21.87%, 28.12% and 75%) respectively, as judged by human evaluators. The results of the subjective and objective evaluations are consistent. Summarizer M3, which uses the combination of term frequency and improved position methods, outperforms the other summarizers, followed by M2.

Amharic Character Recognition System for Printed Real-Life Documents (Addis Ababa University, 2010-06) Teshager, Abay; Mulugeta, Wondwossen (PhD)
Optical Character Recognition (OCR) is an area of research and development where a system is made to recognize characters from printed documents. Cultural considerations and the enormous flood of printed documents motivated the development of OCR across the world. OCR development for Amharic characters, unlike that for other scripts, began in 1997 at SISA (School of Information Studies for Africa). Some progress has been made in recognizing various types of machine-printed, typewritten and handwritten Amharic documents.
However, Amharic character recognition is still an area that requires the contribution of many research works. There is a need to enhance its performance on real-life documents such as the ‘Addis Zemen’ Amharic newspaper, the Bible, the ‘Federal Negarit Gazeta’ and the fiction ‘Fiker Eskemekabir’, which have a number of artifacts (mode of writing, condition of the input page, printing process, quality of paper, presence of extraneous markings, resolution and quality of scanning etc.) that affect the performance of the recognizer. As one such contribution, OCR technology is investigated further here for degraded real-life Amharic documents. For the recognition to be successful, robust techniques for detecting and removing various noise types are investigated and validated. To test the applicability of algorithms and approaches to the problem at hand, the MATLAB Image Processing Toolbox and a neural network classifier from the MATLAB Neural Network Toolbox are used. The Wiener adaptive filtering method for noise removal, the Otsu global thresholding method for binarizing the digitized image, linear interpolation techniques for normalization and hit-and-miss morphological analysis for thinning are found to work very well for the problem of interest. The performance of the line segmenter is found to be 100%. The rate of segmentation for basic and labialized characters turns out to be 98.28% and 100% respectively for training character sets, and 98.55% and 100% respectively for testing character sets. For classifying the features generated, an artificial neural network approach is implemented. The neural network is trained with eight samples taken from real-life documents. The performance of the developed system is tested with documents taken from real-life documents. Accordingly, an average recognition rate of 96.87% is observed for the test sets drawn from the training sets, and a recognition rate of 11.40% for the new test sets.
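The Otsu global thresholding step named above can be illustrated with a short sketch. The thesis used MATLAB toolboxes; the pure-Python version below is not the author's code, only a minimal illustration of the general technique, and it assumes 8-bit grayscale pixel values in a flat list with dark pixels treated as ink. The function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Map dark pixels (<= threshold) to 1 (ink) and the rest to 0 (paper)."""
    return [1 if p <= threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Toy "image": 50 dark pixels and 50 light pixels.
    page = [10] * 50 + [200] * 50
    t = otsu_threshold(page)
    print(t, sum(binarize(page, t)))
```

Because the threshold is computed once for the whole image, uneven illumination or heavy degradation can still defeat it, which is consistent with the abstract's emphasis on noise removal before binarization.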
The segmentation algorithm used in the current study worked reasonably well for basic and labialized characters, but it fails to segment the special character |v|, punctuation marks and numbers. In general, the test results show that the performance of the system is greatly affected by the similarity in shape of Amharic characters and by the effect of noise removal applied to clean highly degraded document images. These challenges call for further exploration of shape-invariant feature extraction techniques and advanced noise detection and removal algorithms. Based on the results, further research areas are also recommended.

Amharic Document Image Retrieval without Explicit Recognition (Addis Ababa University, 2009-06) Worku, Mesfin; Meshesha, Million (PhD)
Retrieval of stored information is a key issue. Image retrieval in particular needs emphasis, because the data are complex and difficult to retrieve. There are many problems to be studied in the area of image retrieval; among these, document image retrieval is one of the issues that deserve attention. Document retrieval can use either a text-based retrieval system or an image-based retrieval system. Document image retrieval can in turn be done in two ways: recognition-based document image retrieval or document image retrieval without explicit recognition. Currently, little has been done on Amharic document retrieval systems. The Amharic text retrieval systems which are covered by the researchers considered limited Amharic documents that are available only in hardcopy format. The proposed system incorporates document images and user queries. The document image is preprocessed and segmented at word level, and the features of each word are extracted. The textual query is then rendered as an image query, preprocessed and segmented, and its features are extracted. The technique used for feature extraction is based on word shape analysis.
The extracted features of the image query are matched with the features of the document images at word level using Euclidean and cosine similarity measures. Finally, relevant document images are retrieved in ranked order in response to the given query. To verify the validity of the proposed approach, an experiment is carried out on 121 scanned Amharic documents selected from printed legal documents and news items. Retrieval effectiveness is measured using precision, recall and F-score. The experimental results confirmed the validity of the model for retrieving relevant document images from the collection of scanned document images.

Amharic Named Entity Recognition Using A Hybrid Approach (Addis Ababa University, 2014-08) Tadele, Mikiyas; Yifiru, Martha (PhD)
Named Entity Recognition (NER) is a subcomponent of information extraction (IE) that detects and classifies named entities (NE), which can be, among others, proper nouns representing person, location and organization names, as well as dates, times and measurements. NER has also been found to be vital for other NLP applications, such as Information Retrieval, Question Answering, Machine Translation and Text Summarization, to mention a few. This research reports the performance of Amharic NER (ANER) built using the hybrid approach and different feature sets to detect and classify NEs of type person, location and organization. Two state-of-the-art machine learning (ML) algorithms, namely decision tree and support vector machines (SVM), are used to investigate the performance of the hybrid ANER. This is the first research that has used these ML algorithms for ANER and also the first to explore ANER using the hybrid approach. The rule-based component of the hybrid ANER has been built using two rules that base their predictions on the presence of trigger words before and after NEs. The ML component is built using decision tree (J48) and SVM (libsvm).
The hybrid ANER integrates those two components by using the NE class predicted by the rule-based component as a feature in the ML component. We conducted different experiments to compare the performance of the hybrid approach with that of the pure ML approach using different feature sets. In our experiments, the best-performing models for both J48 and libsvm were obtained without the rule-based feature, using the POS feature together with the nominal flag feature, with an F-measure of 96.1% for J48 and 85.9% for libsvm. Based on the experimental results we concluded that the pure ML approach with the POS and nominal flag features outperformed the hybrid approach. This is because the rule-based component used in the experiment uses only trigger words. Using rules prepared by linguists and gazetteers may improve the rule-based component and consequently the hybrid ANER system. Keywords: Amharic Named Entity Recognition, Information Extraction, Decision tree, Support Vector Machine, Hybrid Named Entity Recognition System.

Amharic Question Answering for list questions: A case of Ethiopian tourism (Addis Ababa University, 2013-06) Eshetu, Brook; Teferra, Solomon (PhD)
A question answering (QA) system searches a large text collection and finds a short phrase or sentence that precisely answers a user's question. To solve a QA problem, we might first turn to traditional IR techniques, which have been applied successfully to large-scale text search problems. Alternatively, the Natural Language Processing (NLP) and Information Extraction (IE) communities have developed techniques for extracting very precise answers from text. QA research attempts to deal with a wide range of question types including fact, list, definition, how, why, hypothetical, semantically constrained, and cross-lingual questions. This research work focuses on list questions in closed-domain Amharic question answering (AQA).
It applies the hypothesis that answers to a list question share the same semantic entity class, that answers co-occurring within the sentences of the documents are related to the target and the question, and that sentences containing the answers share a similar context. In this research work, list questions are answered using five major modules. The functions of these modules are (1) determining the answer type, (2) retrieving documents, (3) extracting answer candidates from the documents, (4) computing a similarity value for each pair of candidate answers based on their co-occurrence within a sentence, and (5) selecting the final answers. The QA system is evaluated and the experimental results show that the system registered an average F-score of 57.5%. Nevertheless, the performance of the system is greatly affected by the number of documents in the document database and by the techniques applied. As a result, we managed to develop a prototype Amharic list QA system in the area of Ethiopian tourism. KEYWORDS: Amharic Question Answering, List Questions, Answer Type, Candidate Answers, Co-occurrence, Question Answering Evaluation.

Analyzing the Outbreak Surveillance and Response System in Ethiopia using Data Mining Techniques (Addis Ababa University, 2012-11) Mohammed Yimer; Abebe, Ermias (PhD); Addisse Adamu (PhD)
The aim of this research work was to show the applicability of data mining techniques for developing descriptive and predictive models on disease outbreak surveillance datasets in Ethiopia. To do that, three data mining tasks, namely classification, clustering and association rule mining, were undertaken to explore the datasets of the PHEM sectors from different perspectives. A total of 18600 records were collected and assessed from the data store of the surveillance system for the years 2004-2012 G.C.
After the preprocessing phase of the knowledge discovery process, a total of 8796 records were prepared for the data mining algorithms. Of the records filtered and prepared for model building, 4703 were from the IDSR system dataset and the remaining 4093 were taken from the PHEM dataset, covering the years 2004-2008 G.C. and 2009-2012 G.C. respectively. The researcher analyzed two classification algorithms for the prediction of epidemic typhus disease cases: the decision tree J48 classifier and the Naïve Bayes classifier. The better-performing algorithm was then taken for model development. In the experiments, the decision tree algorithm performed better at classifying the disease cases in place and time settings. The accuracy of correctly classifying epidemic typhus disease cases was 87.44% with the decision tree J48 algorithm and 83.70% with the Naïve Bayes classifier. Sensitivity and specificity tests were also done for the two classifiers. The researcher also attempted to analyze the application of association rule mining to find correlations or patterns among disease cases in the surveillance data. The attributes were selected only from the disease cases, for occurrence and non-occurrence, which were collected on a time and place basis. Here, the Apriori association rule mining algorithm was run to find interesting patterns among the occurrence and co-occurrence of correlated disease cases. The researcher set a minimum support of 20% and a minimum confidence threshold of 90% before applying the mining algorithm. The researcher took the combined (integrated) datasets, with a total of 8796 records and 9 attributes, for cluster analysis. The simple K-Means clustering algorithm was used for the combined datasets since the algorithm showed the grouping of disease cases with respect to time and place.
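The minimum support (20%) and minimum confidence (90%) thresholds mentioned above can be made concrete with a small sketch. This is not a full Apriori implementation and not the tool used in the study; it only computes support and confidence for single-item rules and applies the two thresholds. The records and the item names `disease_a`, `disease_b` and `disease_c` are invented toy data, not surveillance results.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent) / support(antecedent)."""
    s_antecedent = support(antecedent, transactions)
    if s_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / s_antecedent


def pair_rules(transactions, min_support=0.20, min_confidence=0.90):
    """Return rules x -> y over single items that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            s = support({x, y}, transactions)
            c = confidence({x}, {y}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((x, y, s, c))
    return rules


if __name__ == "__main__":
    # Invented toy records: each set lists items reported together
    # in one place and time period.
    records = [
        {"disease_a", "disease_b"},
        {"disease_a", "disease_b"},
        {"disease_a", "disease_b", "disease_c"},
        {"disease_b", "disease_c"},
        {"disease_c"},
    ]
    for x, y, s, c in pair_rules(records):
        print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

A high confidence threshold such as 90% keeps only rules whose consequent almost always accompanies the antecedent, while the support threshold discards patterns that occur too rarely to be useful.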
In general, data mining techniques were important and applicable in developing classification, clustering and association rule models for emerging and re-emerging disease cases. But the data has to be of good quality and include the important attributes or variables for better predictive and descriptive model development. Apart from their educational purpose, the results of the research were also used by domain experts for planning, preparedness, decision making, and disease control and prevention activities.

Applicability of Data Mining Techniques to Support Voluntary Counseling and Testing (VCT) for HIV: The Case of Center for Disease Control and Prevention (CDC) (Addis Ababa University, 2009-01) Asmare, Biru; Abebe, Ermias (PhD)
Data mining is emerging as an important tool in many areas of research and industry. Companies and organizations are increasingly interested in applying data mining tools to increase the value added by their data collection systems. Nowhere is this potential more important than in the healthcare industry. As medical records systems become more standardized and commonplace, data quantity increases, with much of it going unanalyzed. Data mining can begin to leverage some of this data into tools that help health organizations to organize data and make decisions. Data related to HIV/AIDS are available in VCT centers. A major objective of this thesis is to evaluate the potential applicability of data mining techniques in VCT, with the aim of developing a model that could help make informed decisions. Using the dataset collected from OSSA, which is supported by CDC, and CRISP-DM as the knowledge discovery process model, the findings of the research are presented using graphs and tables. For the clustering task, the K-means and EM algorithms were tested using WEKA. The clusters generated by EM were appropriate for the problem at hand in generating similar groups.
According to the results of these experiments, it was possible to see similar groups among VCT clients. Gender, marital status, HIV test result, and education showed patterns. For the classification task, decision tree (J48 and Random tree) and neural network (ANN) classifiers are evaluated. Although ANN shows better accuracy than the decision tree classifiers, the decision tree (J48) is appropriate for the dataset at hand and is used to build the classification model. Finally, cluster-derived classification models are tested for their cross-validation accuracy and compared with the non-cluster-generated classification model. The outcomes of this research will serve users in the domain area, decision makers and planners of HIV intervention programs such as CDC and MOH.

Application of Case-Based Reasoning for Anxiety Disorder Diagnosis (Addis Ababa University, 2012-06) Wassie, Getachew; Jemaneh, Getachew (PhD)
Medical domains have been an application domain of choice for Artificial Intelligence since its founding years in expert systems. The reason for this is the knowledge complexity presented by the domain, as well as the leading industry market share of healthcare. The success of Artificial Intelligence in different healthcare applications resulted in the emergence of case-based reasoning. The main focus of this research is the application of case-based reasoning in the domain of mental health, specifically a case-based reasoning system for anxiety disorder diagnosis. The main goal of this research is to develop a prototype case-based reasoning system that can give decision support to anxiety disorder diagnosticians at different levels of expertise. The research is motivated by the need to overcome the limitations of rule-based knowledge base systems with respect to incremental learning and specific knowledge acquisition.
To achieve the goal of this research, the literature has been thoroughly reviewed, both from the Artificial Intelligence sub-field of case-based reasoning and from mental health, more specifically anxiety disorder diagnosis. For the implementation of the prototype, successfully solved cases are acquired from Amanuel Mental Specialized Hospital. In addition, the main parameters are identified in consultation with anxiety disorder experts. The prototype is then implemented using the jCOLIBRI case-based reasoning framework. Finally, the prototype case-based reasoning system is tested to evaluate its performance. The testing is performed from two sides. The first is testing in terms of precision and recall, which registered 71% and 82% respectively. In addition, the average solution similarity using the Leave One Out and Hold Out evaluation methods reached 73% and 75.5% respectively. The second is evaluation of the system by its potential users, which achieved 83.2% performance.

Application of Case-Based Reasoning in Legal Case Management: An Experiment with Ethiopian Labor Law Cases (Addis Ababa University, 2014-06) Alem, Abebaw; Yifru, Martha (PhD)
Labor law is an important application area for AI, a field that aims to make machines (computers) simulate human-like behavior, in order to enhance consistent, reliable and timely decision making. KBS is one of the major subfields of AI and uses expert knowledge to solve a specific problem. CBR is one of the important applications of AI and/or KBS for manipulating previous knowledge (cases) in different areas. It is a problem solving paradigm that uses earlier experiences to solve new problems and is useful to humans when knowledge is incomplete and/or evidence is sparse in the domain of law.
Thus, the major goal of this study is to design a prototype case-based reasoning application for legal case management in the context of Ethiopian labor law at the Federal Supreme Court. Fifty legal texts were selected through document review and discussions from FSCE. These texts are then organized in attribute-value form and converted to a plain text file for building the case base. Attributes are represented using the CBR development tool, the jCOLIBRI framework, for implementing the four “Re’s” of CBR application tasks (Retrieve, Reuse, Revise and Retain). The performance of the prototype system is evaluated using statistical analysis (precision and recall) and user acceptance (satisfaction) techniques. The system achieved a recall of 71% and a precision of 86%. Moreover, the system was accepted by domain experts 86% of the time. It is concluded that the prototype system is applicable in judicial settings to provide fast and quality services. However, the system lacks general knowledge explanations, and a major challenge for CBR in this study arises when no similar cases are in the case base to solve a problem. Future work should therefore investigate a prototype system for the law domain that integrates CBR and RBR approaches with an explanation facility. Moreover, managing each type of law in a separate system might need more resources. So, an “all in one” CBR application for the three law parts (criminal, labor and civil laws) is recommended to manage large numbers of cases and reduce the burden.
Keywords: Case-based Reasoning (CBR), Legal Case Management, Labor Law
Item Application of Data Minig Technology to Identifay Risk Factors of Abortion Incidence and To Identify Their Association Rules: The Case of Marie Stops International Ethiopia Centers(Addis Ababa University, 2013-01) Ayele, Binyam; Jemaneh, Getachew (PhD) Background: Evidence-based information is needed to fill existing gaps and to help in programming for the reduction of maternal deaths due to unsafe abortion. The healthcare industry today generates huge amounts of complex data about patients, hospital resources, disease diagnosis, electronic patient records, and medical devices. This large amount of data is a key resource that can be analyzed and processed to extract hidden information and knowledge. Decision making in health care settings needs to be supported with more advanced technology, including computer-based information systems. Objective: This thesis investigates the potential applicability of data mining technology to identify the major factors that result in abortion and to find their associations. Methods: A hybrid data mining methodology, a six-step knowledge discovery process, was followed. The data for this research were obtained from MSIE in Addis Ababa, Ethiopia. The experiments were carried out using the Apriori association mining algorithm. Descriptive data summarization was performed on the MSIE abortion report datasets to gain an understanding of the data. Missing values, outliers, data integration, and transformation were handled at the preprocessing stage of the hybrid process model. On the basis of subjective (opinions of domain experts) and objective (support and confidence) measures of interestingness, a number of rules that have practical relevance or can add to current knowledge in the problem domain were identified.
Results: The results of this study were encouraging and strengthened the hypothesis that interesting patterns can be generated from the MSIE abortion case database by applying association rule mining, one of the data mining techniques. The results were especially promising in the eyes of domain experts. Conclusion: The results obtained in this study show promise for applying data mining to identify the risk factors of induced abortion and to support its prevention. To make the extracted knowledge usable, an attempt was made to select the best association rules. Keywords: Data mining, Induced abortion, knowledge discovery, association rule, apriori algorithm.
Item The Application of Data Mining in Credit Risk Assessment: The Case of United Bank Sc(Addis Ababa University, 2013-06) Tesfaye, Mengistu; Teferi, Dereje (PhD) Credit facilities and investments are the cornerstones of the growing economy of Ethiopia. United Bank, being one of the former private banks, has played its own role in the economy by rendering loan facilities to individuals and companies running businesses in various sectors. The bank applies internal and NBE credit policies, procedures, and strictly followed manuals at various levels of credit committees before disbursing loans to customers. However, there are total defaulters and inconsistently repaying customers, which reduce the profitability of the bank in particular and threaten the growing economy of the country in general. While fueling the fast-growing economy of the country, minimizing possible defaulters is the prime concern of the bank. Identifying customers and contracts that are more likely to be inconsistent loan payers or defaulters is therefore an important issue. This data mining research was carried out to identify patterns of good and bad, or non-performing loan (NPL), behavior from the historical data and to build a predictive model to assist the management of the bank.
This research used the last seven years of United Bank's credit data and applied various preprocessing activities to clean the data. An experiment was conducted following the CRISP-DM (2000) model using the WEKA tool. Different parameters of WEKA's J48 decision tree and Naïve Bayes classification algorithms were applied. The model developed using the J48 decision tree algorithm showed the highest classification accuracy, 96.6%. In general, the results of this study showed that applying data mining to credit data can provide valuable input to assist the decisions of credit committees and management.
Item Application of Data Mining For Predicting Adult Mortality(Addis Ababa University, 2012-06) Hailemariam, Tesfahun; Meshesha, Million (PhD) Background: The fast-growing, tremendous amount of data collected and stored in large data repositories has far exceeded the human ability to comprehend it without powerful tools. As a result, data collected in large repositories are seldom visited. This, in turn, calls for the application of data mining technology. Every year, more than 7.7 million children die before their fifth birthday. However, more than three times that number, nearly 24 million adults, die every year. Less attention has been given to adulthood, which is the most productive phase of life, with economic and social ramifications for families and countries. Objective: The general objective of this research is to construct an adult mortality predictive model using data mining techniques, so as to identify and improve adult health status using the BRHP open cohort database. Methods: The hybrid model that was developed for academic research was followed. The dataset was preprocessed for missing values, outliers, and data transformation. Decision tree and Naïve Bayes algorithms were employed to build the predictive model, using a sample dataset of 62,869 records of both living and deceased adults, through three experiments and six scenarios.
Result: Compared with Naïve Bayes, the J48 pruned decision tree achieved 97.2% accuracy in developing classification rules that can be used for prediction. If there is no education in the family and the person lives in the rural highland or lowland, the probability of experiencing adult death is 98.4% and 97.4% respectively, together with the concomitant attributes in the generated rule. The likelihood of adult survival for those who completed primary school, completed secondary school, and had further education is 98.9%, 99%, and 100% respectively. Conclusion: The study suggests that education plays a considerable role as a root cause of adult death, followed by outmigration. Further comprehensive and extensive experimentation is needed to describe adult mortality in Ethiopia more fully. Key words: BRHP data, Mortality, Adult, predictive model, J48 decision tree, Data Mining.
Item Application of Data Mining Techniques for Customers Segmentation and Prediction: The Case of Buusaa Gonofa Microfinance Institution(Addis Ababa University, 2013-01) Reganie, Belachew; Kebede, Gashaw (PhD) Identifying customers who are likely prospects for a product or service offering is an important issue. Data mining has been used extensively in customer identification to predict potential customers for a product or service. The final goal of this thesis is to build a model that helps to classify customers for the products and services of the Buusaa Gonofa microfinance institution. Since there are no predefined classes that describe the customers of the institution, the researcher used clustering techniques to arrive at an appropriate number of clusters. A predictive model was then developed to predict potential customers. This predictive model achieved an accuracy of 99.95%. For modeling purposes, data were gathered from the institution's head office.
Since irrelevant features result in poor model performance, data preprocessing was performed to determine the inputs to the model. Various data mining techniques and algorithms were used to implement each step of the modeling process and alleviate related difficulties. K-means was used as the clustering algorithm to segment customers' records into clusters with similar characteristics. Different parameters were used to run the clustering algorithm before arriving at segments that made business sense. The J48 decision tree algorithm was used for classification. In addition to the attributes that experts believe have a high impact on customer segmentation, the loan amount attribute has a large influence. In general, the result of the study was encouraging, which reinforces the possible application of data mining solutions to the microfinance industry, particularly in customer segmentation and prediction at the Buusaa Gonofa microfinance institution.
Item Application of Data Mining Techniques on Antiretroviral Therapy (Art) Data: The Case of Adama and Asella Hospitals(Addis Ababa University, 2010-06) Urgessa, Teklu; Meshesha, Million (PhD) Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS) is of global as well as national concern today, as it affects all people of the world regardless of sex, age, educational status, race, and color. In the Sub-Saharan African region in general, and Ethiopia in particular, the situation is worsening and needs special attention. Today more than 1 million people are living with HIV/AIDS in Ethiopia. The country has made considerable efforts toward preventing and controlling the disease. As a result, hundreds of thousands of people come to health facilities to get counseling and testing services through Voluntary Counseling and Testing (VCT) and Antiretroviral Therapy (ART) programs. A large amount of demographic and clinical data is recorded about individuals receiving these services.
As these data grow larger, it is highly likely that they contain hidden, implicit, and non-trivial knowledge that might not be obtained by traditional statistical analysis or by report- and query-based database functionality. There is considerable evidence that data mining (DM) helps the health care system extract non-trivial and hidden knowledge from the large volume of demographic and clinical data captured during service provision, and that this knowledge helps health administrators target resources in the right directions for prevention and control activities and helps clinicians give safe and appropriate treatment that saves lives. Therefore, the main objective of this research was to examine the applicability of data mining techniques to ART data collected at facility level, taking the ART databases of Adama and Asella Hospitals as a case, in order to identify important patterns related to determinant attributes and their values for the termination/continuity behavior of patients in ART care. Various data preprocessing activities were carried out to produce a dataset ready for model building. The researcher selected two DM functionalities: classification and association rule mining. Decision tree classification with the J48 implementation was tested in eight scenarios. Thirteen experiments with different parameters were conducted for association rule mining. The models were evaluated for each DM functionality and each scenario used to model the dataset.
The models were analyzed using different criteria: mainly the confusion matrix, accuracy measures, execution time, and tree complexity for the decision tree classification models, and the number of rules generated, support, and confidence for each scenario of association rule mining. The research showed encouraging results: data mining techniques have high potential for predicting the determinant factors/attributes of patients' termination/continuity behavior in ART care. Finally, hidden patterns (knowledge) were extracted that will provide decision support information to the bodies concerned with ART program interventions. To mention a few, the results showed that patients whose ART stage is "on ART", whose functional status is bedridden, and who began the service before 1999 E.C. are at high risk of terminating ART care. Patients whose ART stage is "on ART", whose functional status is ambulatory, who started the service before 1999, and who are above 18 years of age also have a high chance of terminating ART care. The study also revealed that young people under 18 years of age have a high chance of staying longer in ART care. Patients terminate the service in a shorter time at Asella hospital than at Adama hospital. Those who are jobless have a high chance of staying in care. The reasons for these hidden patterns are left open for future research. From comparisons among the experiments, it was learned that the data mining techniques tested in this research are applicable to the ART dataset of the cases under investigation in general, but that the generalized decision tree with pruning performed best for classification on the dataset in terms of preciseness, general insight, performance and accuracy measures, with fair execution time and the most interpretable patterns.
Many association rules were obtained; a minimum support of 30% and a confidence of 50% provided optimum rules with acceptable patterns.
Item Application of Data Mining Techniques to Customer Profile Analysis in the Ethiopian Electric Power Corporation(Addis Ababa University, 2011-06) Abebe, Hailemariam; Jemaneh, Getachew (PhD) Data mining is increasingly used in information systems as a technology to support decision-making activities within business processes. Electric power industries are being pushed to understand and quickly respond to the individual needs and wants of their customers because of the dynamic and highly competitive nature of the industry and its customers. Customer Relationship Management (CRM) is the overall process of exploiting customer data and information and using it to increase the revenue generated from existing customers and to attract new customers by building good relationships with them. To implement CRM, electric power industries can use their customer databases to gain a better understanding of their customers, and data mining techniques play a major role in extracting this important customer information from the available databases. In this research, the applicability of clustering and classification data mining techniques to implementing CRM in the Ethiopian Electric Power Corporation (EEPCo) has been explored within the CRISP-DM process model. After the business objective of the corporation was understood, customer profiles were collected, cleansed, transformed, integrated, and finally prepared for experimenting with the clustering and classification algorithms to develop a model. The final dataset prepared for experimentation consists of 50,000 customer records. The K-means clustering algorithm was used to segment customer records into clusters with similar behaviors. In the classification sub-phase, J48 decision tree and Naive Bayes algorithms were employed.
Using the final dataset, different clustering models at K values of 4, 5, and 6 with different seed values were tested and evaluated on their performance. The cluster model at a K value of 4 with seed size 1000 showed the best performance, and its output was used as input for the decision tree and Naive Bayes classification models. Different classification models using the decision tree and Naive Bayes algorithms were then tested with different parameters. Among these, a J48 decision tree model with a classification accuracy of 99.894% was selected. The results of this study were encouraging and confirmed the belief that applying data mining techniques could indeed support CRM activities at EEPCo. In the future, more segmentation and classification studies using a larger number of customer records and other clustering and classification algorithms could yield better results.
Item Application of Data Mining Techniques to Discover Cause of Under-five Children Admission to Pediatric Ward: The case of Nigist Eleni Mohammed Memorial Zonal Hospital(Addis Ababa University, 2012-06) Dileba, Temesgen; Teferi, Dereje (Associate Professor) Background: The health care system is a promising area in which to apply and take advantage of data mining. Higher priority is given to the prevention and control of preventable diseases at home or community level. However, for seriously ill children, admission should be facilitated in order to save the life of the child. Objectives: The objective of this study is to apply data mining techniques to an under-five children dataset to develop a model that supports the discovery of the causes of under-five children's admission to the pediatric ward. Methodology: The Cross-Industry Standard Process for Data Mining (CRISP-DM) process model was applied. The major processes covered were business understanding, data understanding, data preprocessing, modeling, and evaluation.
Decision tree and artificial neural network algorithms were tested for classification tasks in the Waikato Environment for Knowledge Analysis (WEKA). Exploratory data analysis techniques, graphs, and tabular formats were used for visualization, and accuracy, true positive rate, false positive rate, receiver operating characteristic, and expert opinion were used to evaluate the model. The dataset consisted of records from the integrated registration log book of the under-five outpatient department. Result: The J48 decision tree algorithm has higher accuracy (94.77%), weighted true positive rate (94.7%), weighted false positive rate (5.3%), and weighted receiver operating characteristic (0.99), and performs much faster than the multilayer perceptron. According to the interesting rules in J48, a presenting complaint of not taking any food, fluid, or breast feeding (98.32%), low weight for age without sunken eyes (92.31%), and very low weight for age not associated with restlessness or irritability (98.33%) are among the causes of under-five children's admission to the pediatric ward, without any consideration of the health information management system admission disease classification. Conclusion: Encouraging results were obtained in the classification tasks; data mining techniques are applicable to the pediatric dataset for developing a model that supports the discovery of the causes of under-five children's admission to the pediatric ward. The outcome of this study primarily serves users in the domain area, decision makers, and planners.
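Illustrative note: several abstracts in this listing report accuracy together with weighted true positive and false positive rates. The following is a minimal Python sketch, not taken from any of the theses, of how such figures can be derived from a confusion matrix. The function name `weighted_rates` and the example counts are hypothetical, and the weighting by class frequency is an assumption about how the reported "weighted" rates were averaged.

```python
from typing import List, Tuple


def weighted_rates(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (accuracy, weighted TPR, weighted FPR) for a confusion matrix.

    cm[i][j] is the number of instances of actual class i predicted as class j.
    Per-class rates are averaged with weights equal to each class's share of
    the data (an assumed convention, stated in the note above).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    w_tpr = 0.0
    w_fpr = 0.0
    for k in range(n):
        actual_k = sum(cm[k])                      # instances truly in class k
        tp = cm[k][k]                              # class k predicted as k
        fp = sum(cm[i][k] for i in range(n)) - tp  # other classes predicted as k
        negatives = total - actual_k               # instances not in class k
        tpr = tp / actual_k if actual_k else 0.0
        fpr = fp / negatives if negatives else 0.0
        weight = actual_k / total
        w_tpr += weight * tpr
        w_fpr += weight * fpr
    return correct / total, w_tpr, w_fpr


# Hypothetical counts: rows are actual classes, columns are predicted classes.
example = [[90, 10],
           [5, 95]]
accuracy, tpr, fpr = weighted_rates(example)
print(f"accuracy={accuracy:.3f} weighted TPR={tpr:.3f} weighted FPR={fpr:.3f}")
```

Under this weighting, the weighted true positive rate always equals the overall accuracy, which is consistent with abstracts that report nearly identical values for the two.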