AAU-ETD :: Browsing by Author "Meshesha, Million (PhD)"

Afaan Oromo Morphological Analysis a Hybrid Approach
(Addis Ababa University, 2021-12-13) Genna, Kedir; Meshesha, Million (PhD)
This study provides relatively detailed information on developing an Afaan Oromo morphological analysis system. A morphological analyzer decomposes words into their components, called morphemes, and annotates those morphemes with grammatical information. Although the module uses a machine-learning approach for morphological analysis, it uses a rule-based approach to segment words into their smallest components, morphemes. The developed prototype focuses on inflectional forms of nominals (nouns and adjectives) and verbs, since these two word classes are the ones that most often undergo inflection and therefore determine the inflectional characteristics of the language. The prototype was developed using the Python programming language and a Hidden Markov Model (HMM). The Viterbi algorithm is used to decode the HMM. The prototype was then trained and tested using representative data. A corpus of 4,320 nouns and 3,780 verbs was used to train the HMM, and the performance of the analyzer was tested using 480 nouns and 420 verbs. The accuracy of the analyzer for nouns and verbs is 84.6% and 82.9% respectively. The result of the experiment was quite satisfactory, and it can be improved by incorporating simple grammatical constraints and contextual information (including information encoded in the tonal system) to minimize ambiguities, a word-root database to reduce errors during morpheme identification, and additional data to improve the initial probability estimates of the model. The key limitations of this effort are limited funding opportunities, the scarcity of gold-standard and balanced annotated data sets, and the inherently multiple sources of ambiguity in the language at different levels.
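The decoding step mentioned in the abstract above can be illustrated with a minimal Viterbi sketch in Python. The tag set (`ROOT`, `SUFFIX`), the segment names (`stem_x`, `sfx_y`, `sfx_z`) and all probabilities are hypothetical toy values chosen for illustration; they are not taken from the thesis, and the thesis's actual model may be structured differently.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for a list of observations."""
    def log(p):
        # Work in log space to avoid underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s] = (log-probability of the best path ending in s at step t, previous state)
    first = observations[0]
    best = [{s: (log(start_p.get(s, 0)) + log(emit_p[s].get(first, 0)), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes path score plus transition score.
            prev, score = max(
                ((p, best[-1][p][0] + log(trans_p[p].get(s, 0))) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + log(emit_p[s].get(obs, 0)), prev)
        best.append(column)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path

# Hypothetical toy model: tag each segment of a word as ROOT or SUFFIX.
STATES = ["ROOT", "SUFFIX"]
START_P = {"ROOT": 0.9, "SUFFIX": 0.1}
TRANS_P = {"ROOT": {"ROOT": 0.2, "SUFFIX": 0.8},
           "SUFFIX": {"ROOT": 0.1, "SUFFIX": 0.9}}
EMIT_P = {"ROOT": {"stem_x": 0.7, "sfx_y": 0.1, "sfx_z": 0.2},
          "SUFFIX": {"stem_x": 0.1, "sfx_y": 0.5, "sfx_z": 0.4}}

tags = viterbi(["stem_x", "sfx_y", "sfx_z"], STATES, START_P, TRANS_P, EMIT_P)
```

In a trained analyzer the start, transition and emission tables would be estimated from the annotated corpus rather than written by hand.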
Amharic Document Image Retrieval without Explicit Recognition
(Addis Ababa University, 2009-06) Worku, Mesfin; Meshesha, Million (PhD)
Retrieval of the stored information is a key issue. Especially image retrieval needs an emphasis, because the nature of the data is complex and difficult to retrieve. There are many problems to be studied in the area of image retrieval. From these, Document Image Retrieval is one of the issues that have to be given attention. Document retrieval can use either a textual-based retrieval system or an image-based retrieval system. Document image retrieval system can also be done in two ways: recognition-based document image retrieval or document image retrieval without explicit recognition. Currently, little has been done on the Amharic document retrieval systems. The Amharic text retrieval systems which are covered by the researchers considered limited Amharic documents that are available only in hardcopy format. The proposed system incorporates document images and user queries. The document image is preprocessed, segmented at word level and the feature of each word is extracted. Then the textual query is rendered to convert into an image query, preprocessed, segmented and the feature is extracted. The technique used for feature extraction considers the word shape analysis. The extracted feature of the image query is matched with the feature of the document images, at word level using Euclidean and cosine similarity measures. Finally relevant document images are retrieved in ranked order in response to the given query. To verify the validity of the approach proposed, experiment is carried out on 121 scanned Amharic documents that are selected from printed legal documents and news items. The data retrieval effectiveness is measured using retrieval measures such as precision, recall and F-Score. The experimental results confirmed the validity of the model for retrieving relevant document images from the collection of scanned document images.
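The word-level matching step described above can be sketched as follows. This is an illustration only: the feature vectors are hypothetical lists of numbers standing in for the word-shape features, and scoring a document by its best-matching word is an assumption, since the abstract does not state how word-level matches are aggregated into a document ranking.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_documents(query_vec, doc_words):
    """Rank document ids by the best cosine match among each document's word vectors.

    doc_words maps a document id to a list of word feature vectors.
    """
    scores = {
        doc: max((cosine_similarity(query_vec, w) for w in words), default=0.0)
        for doc, words in doc_words.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Euclidean distance is a dissimilarity (smaller is better) while cosine is a similarity (larger is better), so a ranking built on Euclidean distance would sort in ascending order instead.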
Amharic Document Image Retrieval Without Explicit Recognition
(Addis Ababa University, 2009-06) Worku, Mesfin; Meshesha, Million (PhD); Inkpen, Diana (PhD)
Retrieval of the stored information is a key issue. Especially image retrieval needs an emphasis, because the nature of the data is complex and difficult to retrieve. There are many problems to be studied in the area of image retrieval. From these, Document Image Retrieval is one of the issues that have to be given attention. Document retrieval can use either a textual-based retrieval system or an image-based retrieval system. Document image retrieval system can also be done in two ways: recognition-based document image retrieval or document image retrieval without explicit recognition. Currently, little has been done on the Amharic document retrieval systems. The Amharic text retrieval systems which are covered by the researchers considered limited Amharic documents that are available only in hardcopy format. The proposed system incorporates document images and user queries. The document image is preprocessed, segmented at word level and the feature of each word is extracted. Then the textual query is rendered to convert into an image query, preprocessed, segmented and the feature is extracted. The technique used for feature extraction considers the word shape analysis. The extracted feature of the image query is matched with the feature of the document images, at word level using Euclidean and cosine similarity measures. Finally relevant document images are retrieved in ranked order in response to the given query. To verify the validity of the approach proposed, experiment is carried out on 121 scanned Amharic documents that are selected from printed legal documents and news items. The data retrieval effectiveness is measured using retrieval measures such as precision, recall and F-Score. The experimental results confirmed the validity of the model for retrieving relevant document images from the collection of scanned document images.
Applicability of Data Mining Techniques to Customer Relationship Management (CRM): The Case of Ethiopian Telecommunications Corporation's (ETC) Code Division Multiple Access (CDMA) Telephone Service
(Addis Ababa University, 2009-01) Girma, Melaku; Meshesha, Million (PhD)
In this research, the applicability of the clustering and classification techniques of data mining to CRM, in the case of the CDMA telephone service of ETC, has been explored within the framework of the CRISP-DM model. The CDMA CDR data, along with billing information and the customers' profiles, are collected, cleansed, transformed and integrated for experimenting with the clustering models. The final dataset consists of 10,090 records, on which different clustering models at K values of 6, 5, and 4 with different seed values have been tried and evaluated on their performance. The cluster model at a K value of 6 has shown a better performance. Consequently, its output is used as an input for the decision tree and ANN classification models. First, different classification models with the J48 decision tree algorithm are tried using 10-fold cross validation and a split of the dataset into 80% training and 20% testing, setting the cluster index formed by the cluster model as the dependent variable and the rest as independent variables. Among these models, a model that shows a classification accuracy of 98.97% is selected. Similarly, different classification models of the multilayer perceptron ANN algorithm are built by changing the number of hidden-layer nodes and the learning rate parameter values. A model with a classification accuracy of 98.62% is chosen. Finally, a comparison of the decision tree and ANN models in terms of overall classification accuracy, accuracy in classifying high value customers, and accuracy in classifying low value customers has been undertaken. The decision tree model has excelled in these evaluation parameters and is therefore selected as the best classifier for CRM applications. The result of this research is really encouraging, as very high classification accuracy has been obtained.
Besides, high precision and recall in classifying high and low value customers correctly have been achieved.
Application of Data Mining For Predicting Adult Mortality
(Addis Ababa University, 2012-06) Hailemariam, Tesfahun; Meshesha, Million (PhD)
Background: The fast-growing, tremendous amount of data collected and stored in large data repositories has far exceeded human ability to comprehend it without powerful tools. As a result, data collected in large repositories are seldom visited. This, in turn, calls for the application of data mining technology. Every year, more than 7.7 million children die before their fifth birthday. However, more than three times as many adults, nearly 24 million, die every year. Less attention has been given to adulthood, which is the most productive phase of life, with both economic and social ramifications for families and countries. Objective: The general objective of this research is to construct an adult mortality predictive model using data mining techniques, so as to identify and improve adult health status, using the BRHP open cohort database. Methods: The hybrid model that was developed for academic research was followed. The dataset is preprocessed for missing values, outliers and data transformation. Decision tree and Naïve Bayes algorithms were employed to build the predictive model using a sample dataset of 62,869 records of both living and deceased adults through three experiments and six scenarios. Result: In this study, compared to Naïve Bayes, the J48 pruned decision tree performs better, with 97.2% accuracy, in developing classification rules that can be used for prediction. If there is no education in the family and the person is living in rural highland or lowland, the probability of experiencing adult death is 98.4% and 97.4% respectively, with concomitant attributes in the rule generated. The chance of an adult surviving with completed primary school, completed secondary school, and further education is 98.9%, 99% and 100% respectively. Conclusion: The study suggests that education plays a considerable role as a root cause of adult death, followed by outmigration.
Further comprehensive and extensive experimentation is needed to substantially describe the loss experiences of adult mortality in Ethiopia. Key words: BRHP data, Mortality, Adult, predictive model, J48 decision tree, Data Mining.
Application of Data Mining Techniques For Effective Customer Relationship Management of Microfinances: The Case of Wisdom Microfinance
(Addis Ababa University, 2009-08) Dibaba, Wakgari; Meshesha, Million (PhD)
Application of Data Mining Techniques for Effective Customer Relationship Management of Microfinances: The Case of Wisdom Microfinance
(Addis Ababa University, 2009-08) Dibaba, Wakgari; Meshesha, Million (PhD)
The proliferation of information and communication technologies has enabled companies to deal with large quantities of data. Microfinance institutions collect, process and store huge numbers of records over time and therefore deal with voluminous amounts of data. On the other hand, microfinance institutions face problems in customer handling; the proportion of customers who stay with the same microfinance institution is very small compared to potential customers. WISDOM microfinance faces this problem: most customers churn, shifting to competitors after using the loan service only once or a few times. The existing historical data could be made actionable and usable for decision making that improves customer relationship management with the help of data mining techniques. One of the various applications of data mining is in support of customer relationship management through pattern mining and uncovering regularities. This paper reports a study of the application of data mining in microfinance that helps build a classification model to support prediction of a new borrower's status (highly privileged, moderately privileged or less privileged) during loan decision making in the organization. A classification model is built based on the borrowers' corpus data obtained from WISDOM microfinance. Essential preprocessing activities have been applied to clean the data and make it ready for experimentation. Experiments using the J48 decision tree classifier of the WEKA 3.7.0 software have then been conducted on the preprocessed dataset with different attribute and parameter settings in order to arrive at the optimal model. The classification model with the best accuracy level (78.502%) and a relatively small number of leaves and tree size is constructed to predict the new customer class label (highly privileged, moderately privileged or less privileged).
Application of Data Mining Techniques on Antiretroviral Therapy (ART) Data: The Case of Adama and Asella Hospitals
(Addis Ababa University, 2010-06) Urgessa, Teklu; Meshesha, Million (PhD)
Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS) is of global as well as national concern today, as it affects all people of the world regardless of sex, age, educational status, race and color. In the Sub-Saharan African region in general, and Ethiopia in particular, the situation is worsening and needs special attention. Today more than 1 million people are living with HIV/AIDS in Ethiopia. The country has made a lot of effort towards preventing and controlling the disease. As a result, hundreds of thousands of people come to health facilities to get counseling and testing services through Voluntary Counseling and Testing (VCT) and Antiretroviral Therapy (ART) programs. A lot of demographic and clinical data is recorded about individuals using the services. As this data grows larger and larger, it is highly likely that there will be hidden, implicit and non-trivial knowledge within the data, which might not be obtained by traditional statistical analysis or by report- and query-based database functionalities. There is various evidence that Data Mining (DM) helps the health care system to extract non-trivial and hidden knowledge from the large volume of demographic and clinical data captured during the provision of services, and that this knowledge helps health administrators to target resources in the right directions for preventive and controlling activities, and helps clinicians to give safe and appropriate treatment and save lives. Therefore, the main objective of this research was to examine the applicability of data mining techniques to ART data collected at facility level, taking the case of the Adama and Asella Hospitals ART databases, to identify important patterns related to determinant attributes and their values for the termination/continuity behavior of patients on ART care service.
Various data preprocessing activities were carried out to produce a dataset ready for model building. The researcher selected two DM functionalities (classification and association rule mining). Decision tree classification with the J48 implementation was tried in eight scenarios. Thirteen experiments with different parameters were made for association rule mining. The models were evaluated for each DM functionality and each scenario used to model the dataset. Analysis of the models was based on different criteria: mainly the confusion matrix, accuracy measures, execution time and tree complexity for the decision tree classification models, and the number of rules generated, support and confidence for each scenario of association rule mining. The research showed encouraging results: data mining techniques have high potential for predicting determinant factors/attributes for the termination/continuity behavior of patients in ART care. Finally, hidden patterns (knowledge) were extracted that will provide decision support information for concerned bodies, for ART program interventions. To mention a few, the results showed, for example, that patients whose ART stage is on ART, whose functional status is bedridden, and who began the service before 1999 E.C. are at high risk of terminating ART care. Patients whose ART stage is on ART, whose functional status is ambulatory, who started the service before 1999 and whose age is above 18 years have a high chance of terminating ART care. The study also revealed that young people under 18 years of age have a high chance of staying longer in ART care service. Patients terminate the service in a shorter time at Asella hospital than at Adama hospital. Those who are jobless have a high chance of staying in care. The reasons for these hidden patterns are left open for future research.
From comparisons among the experiments, it was learned that the data mining techniques tried in this research are applicable to the ART dataset of the cases under investigation in general, but the generalized decision tree with pruning performed best for classification on the dataset in terms of preciseness, general insight, performance and accuracy measures, with fair execution time and the most interpretable patterns. Many association rules were obtained; a minimum support of 30% and confidence of 50% provided optimum rules with acceptable patterns.
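The support and confidence thresholds mentioned above can be made concrete with a small Python sketch. The four records below are made-up toy data for illustration only, not figures from the study; the attribute names merely echo the kind of attributes the abstract mentions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical toy records, each a set of attribute=value items.
RECORDS = [
    {"status=bedridden", "outcome=terminated"},
    {"status=bedridden", "outcome=terminated"},
    {"status=ambulatory", "outcome=continued"},
    {"status=bedridden", "outcome=continued"},
]

rule_support = support(RECORDS, {"status=bedridden", "outcome=terminated"})
rule_confidence = confidence(RECORDS, {"status=bedridden"}, {"outcome=terminated"})
# A rule is kept only if it clears both thresholds (30% and 50% in the abstract).
keep_rule = rule_support >= 0.30 and rule_confidence >= 0.50
```

An association rule miner such as Apriori applies the same two measures, but it enumerates candidate itemsets level by level instead of checking one hand-picked rule.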
Application of Data Mining Techniques to Predict Antiretroviral Therapy Initiation Time: The Case of Adama and Ambo Hospitals, Oromia Regional State
(Addis Ababa University, 2013-10) Dejene, Getachew; Meshesha, Million (PhD)
Background: AIDS patients receive antiretroviral treatment (ART), which they need to take every day for the rest of their lives. To maintain treatment efficacy, it is necessary to start the treatment at a suitable time. Although the debate regarding when to start antiretroviral therapy has been going on for over two decades, consensus on this question has been hard to achieve. This lack of clarity continues in the current era, with major guidelines recommending very different treatment strategies. Objective: The purposes of this research are to assess the applicability of different data mining techniques to predict the initiation time for ART, to identify attributes that are associated with the initiation time of ART, and to develop a model that can be used to predict the initiation time for ART using data obtained from the Adama and Ambo ART clinics. Method: To undertake this study, a hybrid data mining process model has been employed. The study used 11,440 instances, ten predicting attributes and one outcome variable to run the experiments. The Apriori algorithm is used to extract association rules, while classification algorithms such as the J48 decision tree, PART rule induction and Naïve Bayes were implemented to build predictive models. Result: Experimental results show that the model developed using AdaBoostM1 with pruned PART registers the highest accuracy, 95.62%, as compared to Naïve Bayes and J48. The findings of the study show that the Sex, Age, OACD4, OAWHO Stage, Family planning and Occupation attributes are the best predictors of ART initiation time. Conclusion: The study comes up with a predictive model that assists practitioners in predicting whether pre-ART patients should start the treatment "immediately", "Early" or "Delayed".
The Application of Information Retrieval Techniques to Amharic Documents on the Web
(Addis Ababa University, 2001-07) Amsalu, Saba; Teferi, Dereje (PhD); Meshesha, Million (PhD)
The World Wide Web is an escalating mass of interconnected data that stretches from computer to computer across the world. Information retrieval systems on the Web provide users with relevant information without human intervention, saving time, labor and money. The Web contains documents of diverse content in different languages. Making those documents accessible to users has become a difficult task with the fast growth of the Web. Hence developing information retrieval systems to cope with inherent features of Web data has been a research area of the time in information science. In this study an attempt is made to explore the possibilities of applying some information retrieval techniques for Amharic documents on the Web. To back the research, a literature review of related works has been made. Different information retrieval techniques and algorithms used on other languages have been reviewed to determine the possibilities of applying them to Amharic documents on the Web. A database that stores Amharic Web page data, suffix list and index files has been designed. A Web page submission form was developed to allow the submission of Web page data into the database. Designing an Amharic query input interface was also part of the research. Automatic indexing and searching techniques have been applied on a collection of 313 Web pages of Amharic documents taken from Walta Information Center news publications. Word and stem inverted index options were explored. An Amharic search interface was then created to handle Amharic data on the Web using ColdFusion Studio and ColdFusion Server 4.0 on Windows NT 4.0 Operating System and Internet Information Server (IIS). The searching algorithm that was implemented is the Extended Boolean model, which is a Boolean model with a vector functionality that allows retrieved documents to be ranked.
To measure the performance of the prototype system, retrieval experiments have been conducted for twenty-two queries and an average recall-precision graph is drawn. Using terms with suffixes and prefixes removed resulted in a better performance than using words. Finally, conclusions are drawn based on the test results obtained and recommendations are made as to what further research could be done for the development of Amharic information retrieval systems on the Web.
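How a Boolean query can yield a ranking is easiest to see in code. The sketch below uses the p-norm formulation of the extended Boolean model, which is one well-known variant; the abstract does not say which formulation the thesis implemented, so treat the formulas, the default `p=2.0`, and the names `or_score`, `and_score` and `rank` as assumptions made for illustration.

```python
def or_score(weights, p=2.0):
    """p-norm score for an OR query: grows as any term weight approaches 1."""
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)

def and_score(weights, p=2.0):
    """p-norm score for an AND query: approaches 1 only when all weights do."""
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)

def rank(docs, terms, combine=and_score, p=2.0):
    """Rank documents for a query over terms.

    docs maps a document id to {term: weight in [0, 1]}; missing terms count as 0.
    Returns (doc id, score) pairs, best first.
    """
    scores = {doc: combine([weights.get(t, 0.0) for t in terms], p)
              for doc, weights in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Unlike a strict Boolean AND, a document that matches only one of two query terms still receives a non-zero score here, which is what makes ranked output possible.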
Application of Knowledge Based System for Woody Plant Species Identification
(Addis Ababa University, 2009-04) Alemu, Dejen; Meshesha, Million (PhD)
Finding the correct identity of trees is the beginning of any inventory and management activities as well as any studies regarding the tree species. Identification of plant species in Ethiopia is conducted only in the National Herbarium. At present, the centre is not supported by information systems, which makes the identification process and dissemination of information inefficient and difficult. The need for KBS for technical information transfer and efficacy in dendrology can be identified by recognizing the problems in using the current system for technical information transfer and by proving that KBS can help to overcome the problems addressed and are feasible to develop. This study attempts to design a prototype KBS for woody plant species identification. Compared to the existing way of identification, we come up with new knowledge/rules with minimum features that register comparable performance. By using this system, users can get access to expert knowledge and will be able to identify woody plant species as taxonomists do. Using a taxonomic KBS in different forestry research centers, rather than relying on high-paid taxonomists, will reduce the costs of scientific research and will allow many researchers to conduct their research more independently (without going to the National Herbarium for identification). This research is conducted in a step-wise manner. After problem selection, the knowledge acquisition process is conducted. In this process, a key informant interview is held with experts (two taxonomists and one researcher). In addition to the key informant interview, manuals and books used in woody plant species identification are also consulted. The knowledge extracted from the experts and relevant documents that is used to solve a problem is modeled using a hierarchical or laddering technique.
Based on the final knowledge modeled in decision laddering, domain knowledge is represented using production rules in Prolog to construct the knowledge base. The system is developed to load the knowledge base and to infer from it based on the user's input facts. The Prolog built-in backward inferring mechanism is used for the identification of the species. The user interface is designed in VB.NET. Finally, the system is tested and evaluated by the users. The result shows that the system identifies the woody plant species correctly and can be applied in woody plant species identification. Key words: knowledge based system, Prolog, tree species identification, knowledge acquisition, knowledge modeling, and KBS evaluation.
Application of Knowledge Based System for Woody Plant Species Identification
(Addis Ababa University, 2009-04) Alemu, Dejen; Meshesha, Million (PhD)
Finding the correct identity of trees is the beginning of any inventory and management activities as well as any studies regarding the tree species. Identification of plant species in Ethiopia is conducted only in the National Herbarium. At present, the centre is not supported by information systems, which makes the identification process and dissemination of information inefficient and difficult. The need for KBS for technical information transfer and efficacy in dendrology can be identified by recognizing the problems in using the current system for technical information transfer and by proving that KBS can help to overcome the problems addressed and are feasible to develop. This study attempts to design a prototype KBS for woody plant species identification. Compared to the existing way of identification, we come up with new knowledge/rules with minimum features that register comparable performance. By using this system, users can get access to expert knowledge and will be able to identify woody plant species as taxonomists do. Using a taxonomic KBS in different forestry research centers, rather than relying on high-paid taxonomists, will reduce the costs of scientific research and will allow many researchers to conduct their research more independently (without going to the National Herbarium for identification). This research is conducted in a step-wise manner. After problem selection, the knowledge acquisition process is conducted. In this process, a key informant interview is held with experts (two taxonomists and one researcher). In addition to the key informant interview, manuals and books used in woody plant species identification are also consulted. The knowledge extracted from the experts and relevant documents that is used to solve a problem is modeled using a hierarchical or laddering technique. Based on the final knowledge modeled in decision laddering, domain knowledge is represented using production rules in Prolog to construct the knowledge base.
The system is developed to load the knowledge base and to infer from it based on the user's input facts. The Prolog built-in backward inferring mechanism is used for the identification of the species. The user interface is designed in VB.NET. Finally, the system is tested and evaluated by the users. The result shows that the system identifies the woody plant species correctly and can be applied in woody plant species identification. Key words: knowledge based system, Prolog, tree species identification, knowledge acquisition, knowledge modeling, and KBS evaluation.
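Goal-driven (backward) inference over production rules, as described above, can be sketched in a few lines. The thesis used Prolog's built-in mechanism; the Python version below is only an illustration of the idea, and the rule base (`species_a`, `has_thorns`, and so on) is entirely hypothetical rather than drawn from the thesis's knowledge base.

```python
def prove(goal, rules, facts, _seen=None):
    """Return True if goal follows from facts via rules, working backward from the goal.

    rules is a list of (head, [conditions]) pairs; facts is a set of known atoms.
    """
    _seen = _seen or set()
    if goal in facts:
        return True
    if goal in _seen:
        # Already trying to prove this goal higher up: stop to avoid looping.
        return False
    _seen = _seen | {goal}
    # The goal holds if some rule concludes it and all of that rule's conditions hold.
    return any(
        all(prove(cond, rules, facts, _seen) for cond in conditions)
        for head, conditions in rules
        if head == goal
    )

# Hypothetical rule base: identifiers are placeholders, not real taxonomy.
RULES = [
    ("species_a", ["leaf_compound", "has_thorns"]),
    ("has_thorns", ["spines_on_branches"]),
    ("species_b", ["leaf_simple", "has_thorns"]),
]
FACTS = {"leaf_compound", "spines_on_branches"}

identified = [s for s in ("species_a", "species_b") if prove(s, RULES, FACTS)]
```

Starting from a candidate species and checking only the conditions it depends on mirrors how the user would be asked just the questions relevant to the current hypothesis.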
Applications of Expert Systems in Species Selection: The Case of Forestry Research Center (FRC)
(Addis Ababa University, 2001-07) Abduselam, Samir; Bini, Tesfaye (PhD); Meshesha, Million (PhD)
The Forestry Research Center (FRC) has the objective of providing users with scientific knowledge by disseminating research output, so that the production and productivity of tree species are improved and conservation activities are better managed. At the moment, FRC is experiencing difficulties in fulfilling this objective. Among the factors behind these difficulties are isolated research, poor use of information and communications technology (ICT), a lack of sufficient and accessible experts in the field of species selection, and a lack of systematic research compilation techniques. In an attempt to help FRC address such problems, particularly with respect to information dissemination, the present study explores the potential of applying expert systems technology to the species selection task. In particular, a prototype species selection expert system, SPEX, is developed by working closely with experts in the research center who have years of experience and skills in the area of species selection. The knowledge acquired from these experts is modeled using a hierarchical structure that represents the concepts and parameters involved in species selection. Based on the model, the knowledge is represented using production rules. These rules are then implemented in the knowledge-pro expert system shell. Backward chaining is used in inferring the rules and extracting recommendations.
The Certainty Factor (CF) for each species is calculated on a priority basis to measure the belief of the human expert in the selected species. SPEX has also been tested by the experts and users from the Forestry Research Center (FRC). Based on the encouraging results obtained, they have established its acceptability and accuracy. The experts finally recommended that a way should be devised to build a complete species selection expert system that includes all tree species researched in the center.
Automatic Amharic Text News Classification: A Neural Networks Approach
(Addis Ababa University, 2009-10) Kelemework, Worku; Meshesha, Million (PhD)
Text classification is one of the methods used to organize massively available textual information in a meaningful context to maximize utilization of information. Automatic text classification is the preferred method for accomplishing classification in large volumes of information. Research works on automatic classification is flourishing in the context of other languages; whereas, research on automatic Amharic text classification is in its infancy stage and very few attempts have been made till now. This study puts forward its own contribution for automatic Amharic text classification. Before the classifier is constructed, preprocessing has been done on the data to make it ready for the learning algorithm including changing various Amharic characters with the same sound to one common form; stemming word variants; and removing stop words, punctuation marks and numbers. And Document Frequency (DF) threshold is applied to select features of news items. Two weighting schemes, Term Frequency (TF) and Term Frequency by Inverse Document Frequency (TF*IDF), are used so as to weight the features in news documents to construct news by features matrix, which is fed to the learning algorithm. This study considers one of the neural networks learning methods called Learning Vector Quantization (LVQ), to see its suitability for automatic Amharic text news classification. In the course of this study, it is found that TF weighting scheme outperforms TF*IDF weighting scheme by 3.54% on average. Using the TF weight method, 94.81%, 61.61% and 70.08% accuracies are obtained at three, six and nine categories experiments respectively with an average of 75.5% accuracy. For similar experiments, the application of TF*IDF weight method resulted in 69.63%, 78.22% and 68.03% accuracies with an average of 71.96% accuracy. Previous research works on Amharic text classification show that, accuracy decreases consistently with the increase in categories.
The result of this study shows that accuracy does not depend on the number of news items and categories considered; rather, representing each category with enough number of subclasses determines accuracy. Therefore, further works focusing on finding the optimum number of subclasses is the major direction of research with regard to Amharic text news classification using LVQ.
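The abstract names Learning Vector Quantization but does not describe the training procedure. As a rough illustration only, the following sketches the textbook LVQ1 update rule; it is not taken from the thesis. The function names, the learning rate, the number of epochs and the toy two-feature vectors are all assumptions made for this example. Giving a class several prototypes corresponds loosely to the "subclasses" mentioned above.

```python
def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, prototypes[i][0])),
    )


def train_lvq1(samples, prototypes, lr=0.1, epochs=20):
    """Textbook LVQ1. samples: list of (vector, label); prototypes: list of (vector, label)."""
    protos = [[list(v), y] for v, y in prototypes]  # copy so the input is untouched
    for _ in range(epochs):
        for x, y in samples:
            i = nearest_prototype(x, protos)
            w, label = protos[i]
            # Pull the winning prototype toward x if the labels agree, push it away otherwise.
            sign = 1.0 if label == y else -1.0
            protos[i][0] = [wj + sign * lr * (xj - wj) for wj, xj in zip(w, x)]
    return protos


def predict(x, protos):
    """Label of the nearest prototype."""
    return protos[nearest_prototype(x, protos)][1]


# Hypothetical two-feature document vectors (for example, weights of two terms).
samples = [
    ([3.0, 0.0], "sport"),
    ([4.0, 1.0], "sport"),
    ([0.0, 3.0], "business"),
    ([1.0, 4.0], "business"),
]
initial = [([2.0, 1.0], "sport"), ([1.0, 2.0], "business")]
protos = train_lvq1(samples, initial)
print(predict([5.0, 0.0], protos), predict([0.0, 5.0], protos))
```

In practice the input vectors would be rows of the news-by-features matrix, and the number of prototypes per category would be tuned.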
Automatic Amharic Text News Classification: A Neural Networks Approach
(Addis Ababa University, 2009-09) Kelemework, Worku; Meshesha, Million (PhD)
Text classification is one of the methods used to organize massively available textual information in a meaningful context to maximize utilization of information. Automatic text classification is the preferred method for accomplishing classification in large volumes of information. Research on automatic classification is flourishing in the context of other languages, whereas research on automatic Amharic text classification is in its infancy and very few attempts have been made so far. This study puts forward its own contribution to automatic Amharic text classification. Before the classifier is constructed, the data is preprocessed to make it ready for the learning algorithm: various Amharic characters with the same sound are changed to one common form; word variants are stemmed; and stop words, punctuation marks and numbers are removed. A Document Frequency (DF) threshold is then applied to select features of news items. Two weighting schemes, Term Frequency (TF) and Term Frequency by Inverse Document Frequency (TF*IDF), are used to weight the features in news documents and construct a news-by-features matrix, which is fed to the learning algorithm. This study considers one of the neural network learning methods, Learning Vector Quantization (LVQ), to assess its suitability for automatic Amharic text news classification. In the course of this study, it is found that the TF weighting scheme outperforms the TF*IDF weighting scheme by 3.54% on average. Using the TF weighting method, accuracies of 94.81%, 61.61% and 70.08% are obtained in the three-, six- and nine-category experiments respectively, with an average accuracy of 75.5%. For similar experiments, the TF*IDF weighting method resulted in accuracies of 69.63%, 78.22% and 68.03%, with an average accuracy of 71.96%. Previous research on Amharic text classification shows that accuracy decreases consistently as the number of categories increases.
The result of this study shows that accuracy does not depend on the number of news items and categories considered; rather, representing each category with a sufficient number of subclasses determines accuracy. Therefore, further work on finding the optimum number of subclasses is the major direction of research for Amharic text news classification using LVQ.
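The feature selection and weighting steps named in the abstract (DF threshold, TF, TF*IDF) can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation. The function name `build_matrices`, the `min_df` parameter, the `log(N / df)` form of inverse document frequency and the English placeholder tokens are all assumptions.

```python
import math
from collections import Counter


def build_matrices(docs, min_df=2):
    """docs: list of token lists, assumed already normalized, stemmed and stop-word filtered."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # DF thresholding: keep only terms found in at least min_df documents.
    vocab = sorted(t for t, c in df.items() if c >= min_df)
    tf_rows, tfidf_rows = [], []
    for doc in docs:
        counts = Counter(doc)
        # TF weighting: raw count of each selected term in the document.
        tf_rows.append([counts[t] for t in vocab])
        # TF*IDF weighting: count scaled by log(N / df), one common IDF variant.
        tfidf_rows.append([counts[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, tf_rows, tfidf_rows


# Placeholder tokens; real input would be preprocessed Amharic news items.
docs = [
    ["sport", "team", "goal"],
    ["sport", "goal", "goal"],
    ["market", "price", "sport"],
]
vocab, tf, tfidf = build_matrices(docs, min_df=2)
print(vocab)
print(tf)
```

Each row of `tf` or `tfidf` is one news item and each column one selected feature, which matches the news-by-features matrix described above. A term that occurs in every document gets a TF*IDF weight of zero under this IDF variant.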
An Automatic Sentence Parser for Oromo Language Using Supervised Learning Technique
(Addis Ababa University, 2002-06) Megersa, Diriba; Getachew, Mesfin (PhD); Meshesha, Million (PhD)
The goal of Information Retrieval has been to reduce human language complexities and, as a result, serve users in the most efficient way. Decisive in achieving this end is Natural Language Processing (NLP). NLP has many components serving this purpose. Parsing is one such component, improving precision and recall, which is the goal of Information Retrieval systems. Moreover, parsing is also used in the effort towards machine translation, which is at the heart of Natural Language Processing. Today, different kinds of parsers have been developed for languages that have had relatively wider use nationally and/or internationally since the 1960s. Unfortunately, Oromo has not captured the advantage of such systems, despite being the working language of the State Government of Oromiya and one of the major languages in Ethiopia and Africa (Abebe 2002), for there are no systems (parsers of any sort) that parse written texts in this language. This study is, therefore, an attempt to develop a simple automatic sentence parser for the Oromo language. In the study, the chart algorithm was used with some modification. A module for a morphological analyzer, which splits words into their root form and corresponding morphemes, was also developed in order to facilitate the preparation of texts in a file to be parsed with appropriate lexical categories. In addition, the unsupervised learning algorithm was designed to guide the parser in predicting unknown and ambiguous words in a sentence. Grammar rules, a lexicon, morphological rules and lexicon information were also designed on the basis of the review made of the linguistic properties of Oromo grammatical categories. This system, in fact, is the first of its kind for this language. The study adopts an intelligent (rule-based + learning module) approach to develop a prototype, which is a simple Oromo parser for the language. The thesis, in short, describes the processes of automated sentence parsing of free texts.
That is, it is aimed at developing a prototype and conducting an experiment with it. The results obtained (95% on the training set and 88.5% on the test set) using the small set of manually parsed sentences encourage further research to be launched, especially with the aim of developing a full-fledged Oromo sentence parser.
Automatic Thesaurus Construction for Amharic Text Retrieval
(Addis Ababa University, 2009-07) Mekonnen, Andargachew; Meshesha, Million (PhD)
Thesauri have been used for literary composition since their inception in 1852, but nowadays their primary use is in information retrieval. They are among the crucial components of retrieval systems and are typically used to enhance indexing operations and to expand queries during searching. Even though Amharic has been a written language for a couple of centuries and huge volumes of Amharic electronic documents have accumulated, not much has been done towards the development of effective and efficient Amharic retrieval systems. In this research, a thesaurus is generated automatically for text retrieval in order to help the development of an effective and efficient Amharic retrieval system. The automatic thesaurus generation system is based on the WORDSPACE model, which is derived from the inverted file index by applying the Random Projection algorithm for dimensionality reduction. A Nearest Neighbor clustering algorithm is employed to generate the thesaurus automatically from the constructed WORDSPACE model. An encouraging result is obtained in the experimentation of the system on Amharic Bible documents. During experimentation, the accuracy of the automatically generated thesaurus is evaluated; the result on a random sample of ten terms shows that the system has an accuracy of 58%. To further investigate its applicability for Amharic information retrieval, the thesaurus is integrated into an IR system for query expansion. The retrieval system is tested with and without the thesaurus in order to show the improvement in retrieval effectiveness. Performance analysis shows that recall is higher when the thesaurus is used: the average recall values are 73.34% and 37.29% with and without thesaurus-based query expansion, respectively. Keywords: Amharic Thesaurus, WORDSPACE, Information Retrieval (IR)
Automatic Thesaurus Construction for Amharic Text Retrieval
(Addis Ababa University, 2009-07) Mekonnen, Andargachew; Meshesha, Million (PhD)
Thesauri have been used for literary composition since their inception in 1852, but nowadays their primary use is in information retrieval. They are among the crucial components of retrieval systems and are typically used to enhance indexing operations and to expand queries during searching. Even though Amharic has been a written language for a couple of centuries and huge volumes of Amharic electronic documents have accumulated, not much has been done towards the development of effective and efficient Amharic retrieval systems. In this research, a thesaurus is generated automatically for text retrieval in order to help the development of an effective and efficient Amharic retrieval system. The automatic thesaurus generation system is based on the WORDSPACE model, which is derived from the inverted file index by applying the Random Projection algorithm for dimensionality reduction. A Nearest Neighbor clustering algorithm is employed to generate the thesaurus automatically from the constructed WORDSPACE model. An encouraging result is obtained in the experimentation of the system on Amharic Bible documents. During experimentation, the accuracy of the automatically generated thesaurus is evaluated; the result on a random sample of ten terms shows that the system has an accuracy of 58%. To further investigate its applicability for Amharic information retrieval, the thesaurus is integrated into an IR system for query expansion. The retrieval system is tested with and without the thesaurus in order to show the improvement in retrieval effectiveness. Performance analysis shows that recall is higher when the thesaurus is used: the average recall values are 73.34% and 37.29% with and without thesaurus-based query expansion, respectively. Keywords: Amharic Thesaurus, WORDSPACE, Information Retrieval (IR)
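To make the pipeline in this abstract more concrete, the sketch below projects term vectors (one entry per document, as could be read off an inverted index) into a lower-dimensional space with a random matrix, then looks up the nearest terms by cosine similarity. It is an assumption-laden illustration rather than the thesis code. The +1/-1 projection matrix, the function names, the seed and the English placeholder terms are all hypothetical choices.

```python
import math
import random


def random_projection(term_vectors, k, seed=0):
    """Project each term's document-occurrence vector down to k dimensions."""
    rng = random.Random(seed)
    n_docs = len(next(iter(term_vectors.values())))
    # Random matrix with entries from {-1, +1}: one simple choice among several.
    R = [[rng.choice((-1.0, 1.0)) for _ in range(k)] for _ in range(n_docs)]
    return {
        term: [sum(vec[d] * R[d][j] for d in range(n_docs)) for j in range(k)]
        for term, vec in term_vectors.items()
    }


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_terms(term, space, top_n=2):
    """The top_n other terms most similar to `term` in the projected space."""
    scores = [
        (cosine(space[term], vec), other)
        for other, vec in space.items()
        if other != term
    ]
    return [other for _, other in sorted(scores, reverse=True)[:top_n]]


# Placeholder terms with occurrence counts over six hypothetical documents.
term_vectors = {
    "king": [1, 1, 1, 0, 0, 0],
    "ruler": [1, 1, 1, 0, 0, 0],
    "bread": [0, 0, 0, 1, 1, 1],
}
space = random_projection(term_vectors, k=4)
print(nearest_terms("king", space, top_n=1))
```

Terms that occur in the same documents receive identical projected vectors, so they rank as close neighbours. With real data the reduced dimension `k` would be far smaller than the number of documents, which is where the saving comes from.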
Coffee Disease Detection using Convolutional Neural Network an Image Processing Approach
(Addis Ababa University, 2021-11-19) Taddese, Haymanot; Meshesha, Million (PhD)
Coffee is one of the most important products in Ethiopia. It contributes greatly to the Ethiopian economy, since it earns foreign currency for the country and is the source of daily income for farmers. Therefore, controlling coffee diseases and ensuring the quality of coffee products is a major issue for the country. Currently, diseases are identified manually by experts through visual inspection, which is challenging because experts are not available everywhere in the production areas; moreover, other researchers have not considered Cercospora leaf spot and coffee berry disease. The aim of this research is therefore to detect common coffee diseases using digital image processing and deep learning techniques. In this study, we consider the most common coffee diseases: Cercospora leaf spot, coffee phoma disease and coffee berry disease. Convolutional Neural Networks have shown their efficiency and accuracy in image processing, in representing images and creating patterns to identify coffee diseases. This research proposes a Convolutional Neural Network technique to detect diseases of coffee leaves and coffee beans. The study follows an experimental research methodology. A dataset of 552 coffee leaf and coffee bean images was captured with an HD camera and a Motorola phone in popular coffee production areas of Ethiopia, such as farms in the Jimma (Agaro) and Bonga (Kefa) zones, and 5334 coffee images were collected from the databases of the Jimma Agricultural Research Center (JARC) and the Bonga Agricultural Research Center (BARC). We used four classes for classification, namely Cercospora leaf spot, coffee phoma disease, coffee berry disease and healthy coffee. The total number of images used for experimentation is 5886, of which 80% is used for training and the remaining 20% for testing. Experimental results show that the proposed model detects the diseases with 96.1% accuracy. This is a promising result towards designing a model that can be used for automatic coffee disease detection.
As future work, we recommend extending the model to cover other parts of the coffee plant, such as stems and roots, with a larger number of images.
Design of a Drug Information Management System
(Addis Ababa University, 2017-07) Tesfaye, Micheal; Meshesha, Million (PhD); Deyessa, Nigussie (PhD)
Background - A drug information management system (DIMS) is a system that provides drug information for doctors, pharmacists and non-health people. The drug information includes drug categories, trade name, generic name, manufacturing company, drug contents, indications, dosage, contraindications, special precautions, adverse drug reactions and drug interactions. In addition to drug information, users will be able to set reminders on drug intake schedules, get information on events related to medications and get news related to the medication industry. Objective - This project attempts to develop a drug information management system as a website and a smartphone application. Methods - To develop the drug information management system, we used an object-oriented system analysis and design methodology. UML tools were used for the analysis and design phases; for development, CSS and PHP were embedded with HTML, with a MySQL database. The developed modules were integrated and tested to deliver the functionality of the system. Results - System analysis was done using functional, object and dynamic models. The functional model is described by use case diagrams, the object model by a class diagram, and the dynamic model by sequence diagrams. Both performance and usability tests were conducted to measure the performance of the system. These tests help to describe to what extent the system is usable. The result of system usability testing shows 92% willingness to use the system as a tool to get drug-related information. Conclusion - Generally, the prototype system serves as a drug information reference system. The initial feedback from health professionals and non-health people has been extremely positive. Hence the prototype system has achieved good performance and meets the objectives of the project.
However, in order to make the system applicable in the domain area, some adjustments are needed, such as including more languages and the prices of the drugs in the current Ethiopian market.
