Computer Science



Recent Submissions

Now showing 1 - 20 of 363
  • Item
    Coreference Resolution for Amharic Text Using Bidirectional Encoder Representation from Transformer
    (Addis Ababa University, 3/4/2022) Bantie, Lingerew; Assabie, Yaregal (PhD)
    Coreference resolution is the process of finding all expressions that refer to the same entity in a text; such expressions are called mentions. The task of coreference resolution is to cluster all mentions of the same entity in a text based on word indices. Coreference resolution is used in several Natural Language Processing (NLP) applications, such as machine translation, information extraction, named entity recognition and question answering, to increase their effectiveness. In this work, we propose coreference resolution for Amharic text using Bidirectional Encoder Representations from Transformers (BERT). BERT is a contextual language model that generates semantic vectors dynamically according to the context of the words. The proposed system has a training phase and a testing phase. The training phase includes preprocessing (cleaning, tokenization and sentence segmentation), word embedding, feature extraction (Amharic vocabulary, entities and mention pairs) and the coreference model. The testing phase likewise has its own steps: preprocessing (cleaning, tokenization and sentence segmentation), coreference resolution and Amharic mention prediction. Word embedding is used in the proposed model to represent each word as a low-dimensional vector; it is a feature learning technique that yields new features across domains for coreference resolution in Amharic text. The necessary information is extracted from the word embeddings, the processed data and Amharic characters. After extracting the important features from the training data, we build a coreference model. In this model, BERT obtains basic features from the embedding layer by extracting information from both the left and right context of a given word. To evaluate the proposed model, we conducted experiments using an Amharic dataset prepared from various reliable sources for this study. The commonly used evaluation metrics for the coreference resolution task are MUC, B3, CEAF-m, CEAF-e and BLANC. Experimental results demonstrate that the proposed model outperforms the state-of-the-art Amharic model, achieving F-measure values of 80%, 85.71%, 90.9%, 88.86% and 81.7%, respectively, on these metrics.
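    To make the mention-pair step concrete, the sketch below scores a candidate pair of mentions by the cosine similarity of their BERT vectors. It is a minimal illustration, not the thesis's trained model: the multilingual checkpoint, the hand-picked token spans and the 0.85 threshold are all assumptions.

    ```python
    # A minimal sketch of mention-pair scoring with BERT embeddings, assuming a
    # multilingual checkpoint as a stand-in for the thesis's Amharic-trained model.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModel.from_pretrained("bert-base-multilingual-cased")

    def mention_vector(hidden, span):
        """Mean-pool the contextual vectors of the tokens inside a mention span."""
        start, end = span
        return hidden[0, start:end].mean(dim=0)

    def coreferent(text, span_a, span_b, threshold=0.85):
        """Score a mention pair; spans are (start, end) token indices (illustrative)."""
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
        sim = torch.cosine_similarity(
            mention_vector(hidden, span_a), mention_vector(hidden, span_b), dim=0
        )
        return sim.item() >= threshold  # cluster the pair if similar enough
    ```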
  • Item
    Deep Learning Based Emotion Detection Model for Amharic Text
    (Addis Ababa University, 8/26/2021) Tesfu, Eyob; Belay, Ayalew (PhD)
    Emotions are so important that whenever we need to make a decision, we want to know others' emotions. This is true not only for individuals but also for organizations. Due to the rapid growth of the Internet, people express their emotions through social media networks, reviews, blogs and so on. The need to find relevant sources, extract related sentences carrying emotion, summarize them and organize them into a useful form is growing rapidly, and emotion detection can play an important role in satisfying these needs. Emotion detection categorizes emotional sentences into predefined categories such as sadness, anger, disgust and happiness based on the emotional terms that appear within a comment. Given the rapid growth of Amharic usage in social media, manually identifying the emotions of millions of users and aggregating them towards a rapid and efficient decision is quite a challenging task. In this research work, an emotion detection model is proposed for determining the emotion expressed in Amharic texts or comments. We propose a deep learning based emotion detection model for Amharic text using a CNN with word embedding. The proposed model includes several tasks. The first is text pre-processing, which consists of the pre-processing steps commonly used in many natural language processing applications. We pre-process the Amharic text and train a word embedding model over the documents; the embedding provides contextually similar words for every word in the training set. We then implement our CNN model for emotion classification. Common evaluation metrics such as accuracy, recall, F1 score and precision were used to measure the performance of the proposed model. A prototype of the model was developed and used to test system performance on the collected Amharic text comments. With four categories (sadness, anger, disgust and happiness), the model achieves an accuracy of 71.11%; with two categories (positive and negative), it does better, achieving 87.46%. We also evaluated an RNN model for comparison with our CNN model.
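    The sketch below shows what a CNN-over-word-embeddings classifier of this kind typically looks like in Keras; the vocabulary size, sequence length and layer sizes are illustrative assumptions, with the four emotion classes taken from the text.

    ```python
    # A hedged sketch of a CNN text classifier over word embeddings; sizes are
    # illustrative assumptions, not the thesis's actual configuration.
    import tensorflow as tf
    from tensorflow.keras import layers

    VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 20000, 100, 4  # sadness, anger, disgust, happiness

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(SEQ_LEN,)),               # padded sequences of word ids
        layers.Embedding(VOCAB_SIZE, 128),              # word-embedding layer
        layers.Conv1D(128, 5, activation="relu"),       # n-gram feature detectors
        layers.GlobalMaxPooling1D(),                    # keep strongest feature per map
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # emotion probabilities
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()
    ```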
  • Item
    Semantic Role Labeling for Amharic Text Using Deep Learning
    (Addis Ababa University, 8/17/2021) Meresa, Bemnet; Assabie, Yaregal (PhD)
    Semantic Role Labeling (SRL), the task of automatically finding the semantic roles of each argument corresponding to each predicate in a sentence, is one of the essential problems in the research field of Natural Language Processing (NLP). SRL is a shallow semantic analysis task and an important intermediate step for many NLP applications, such as Question Answering, Machine Translation, Information Extraction and Text Summarization. Feature-based approaches to SRL are based on parsing output, often use lexical resources, and require heavy feature engineering; errors in the parsing output can also propagate to the SRL output. Neural SRL systems, in contrast, can learn intermediate representations from raw text, bypassing manual feature extraction. Recent SRL studies using deep learning have shown improved performance over feature-based systems for English, Chinese and other languages. Amharic exhibits typical Semitic behaviors that pose challenges to the SRL task, such as rich morphology and multiple subject-verb-object word orders. In this work, we approach the problem of SRL for the language using deep learning. The input is a raw sentence whose words are represented using a concatenation of word-, character- and fastText-level neural embeddings to capture the morphological, syntactic and semantic information of the words, requiring no intermediate feature extraction. We use a bi-directional Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) to capture bi-directional context (for argument identification) and long-range dependencies (for argument boundary identification), and a conditional random field with Viterbi decoding to implement the SRL system. The system was trained on 8000 instances and tested on 2000 instances, and achieved an accuracy of 94.96% and an F-score of 81.2%. We manually annotated the sentences with their corresponding semantic roles; future work can improve the quality of the data and experiment with contextual embeddings as feature representations for better performance.
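    As a rough illustration of the architecture, the sketch below pairs a BiLSTM emission layer with Viterbi decoding over a learned transition matrix (a simplified stand-in for the CRF layer); the embedding/hidden sizes and role inventory are assumptions.

    ```python
    # A condensed sketch of a BiLSTM tagger with Viterbi decoding; illustrative only.
    import torch
    import torch.nn as nn

    class BiLSTMSRL(nn.Module):
        def __init__(self, vocab_size, num_roles, emb_dim=100, hidden=128):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
            self.to_roles = nn.Linear(2 * hidden, num_roles)              # emission scores
            self.trans = nn.Parameter(torch.zeros(num_roles, num_roles))  # transition scores

        def emissions(self, tokens):          # tokens: (1, seq_len) tensor of word ids
            out, _ = self.lstm(self.emb(tokens))
            return self.to_roles(out)[0]      # (seq_len, num_roles)

        def viterbi(self, tokens):
            """Best-scoring role sequence under emission + transition scores."""
            em = self.emissions(tokens)
            score, back = em[0], []
            for t in range(1, em.size(0)):
                total = score.unsqueeze(1) + self.trans + em[t]  # (prev_role, cur_role)
                score, idx = total.max(dim=0)
                back.append(idx)
            best = [int(score.argmax())]
            for idx in reversed(back):        # follow back-pointers to recover the path
                best.append(int(idx[best[-1]]))
            return list(reversed(best))

    tagger = BiLSTMSRL(vocab_size=5000, num_roles=10)
    print(tagger.viterbi(torch.randint(0, 5000, (1, 6))))  # one role id per token
    ```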
  • Item
    Open Source ESB Based Application Integration Case of Ethiopian Revenue and Customs Authority
    (Addis Ababa University, 12/6/2016) Tesfaye, Mihret; Getahun, Fekade (PhD)
    Nowadays, integration and interoperability have become key issues for organizations that work together. The Enterprise Service Bus (ESB) has become the preferred integration architecture for heterogeneous systems, facilitating integration between disparate applications running on different hardware and software platforms. The aim of this work is to assess the vehicle declaration services provided by the Ethiopian Revenue and Customs Authority that require integration, and to study the workflows of the existing system. The study provides an ESB product evaluation matrix; four open source ESB products were evaluated against it and the most appropriate product was selected for implementation. After discussing core ESB concepts, features and benefits, proprietary and open source ESB products are described briefly. The evaluation matrix was prepared by reviewing a variety of research papers by different professionals and organizations, and the products were evaluated based on the matrix. Based on the comparison results, WSO2 ESB was selected for developing the integration scenario. The design and development of the scenario using WSO2 ESB are described in detail, and the integrated system was evaluated by performing functional testing. The functional testing indicated a successful outcome for all test sets.
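    A weighted evaluation matrix of this kind reduces to a simple weighted sum per product. The sketch below illustrates the computation; the criteria, weights, ratings and the three non-WSO2 product names are placeholders, not the thesis's actual matrix values.

    ```python
    # An illustrative weighted-scoring pass over a hypothetical ESB evaluation matrix.
    weights = {"connectivity": 0.3, "documentation": 0.2,
               "community": 0.2, "performance": 0.3}

    ratings = {  # 1-5 ratings per product against each criterion (hypothetical)
        "WSO2 ESB":          {"connectivity": 5, "documentation": 4, "community": 4, "performance": 5},
        "Mule ESB":          {"connectivity": 4, "documentation": 4, "community": 4, "performance": 4},
        "Apache ServiceMix": {"connectivity": 4, "documentation": 3, "community": 3, "performance": 4},
        "Open ESB":          {"connectivity": 3, "documentation": 3, "community": 2, "performance": 3},
    }

    def weighted_score(product):
        return sum(weights[c] * ratings[product][c] for c in weights)

    best = max(ratings, key=weighted_score)  # the study's matrix favored WSO2 ESB
    print({p: round(weighted_score(p), 2) for p in ratings}, "->", best)
    ```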
  • Item
    Amharic Sentence Generation from Interlingua Representation
    (Addis Ababa University, 12/27/2016) Yitbarek, Kibrewossen; Assabie, Yaregal (PhD)
    Sentence generation is a part of Natural Language Generation (NLG), the process of deliberately constructing a natural language text in order to meet specified communicative goals. The major requirement of sentence generation in a natural language is producing full, clear, meaningful and grammatically correct sentences. A sentence can be generated from different possible sources, including a representation that does not depend on any human language, i.e., an Interlingua. Generating a sentence from an Interlingua representation has numerous advantages: since the representation is unambiguous, universal and independent of both the source and target languages, generation (like analysis) needs to be target-language-specific only. Among the different Interlinguas, the Universal Networking Language (UNL) is commonly chosen in view of its various advantages over the others. Various works have generated sentences from UNL expressions for different languages of the world, but to the best of our knowledge none exist for Amharic. In this thesis, we present an Amharic sentence generator that automatically generates an Amharic sentence from a given input UNL expression. The generator accepts a UNL expression as input and parses it to build a node-net. The parsed UNL expressions are stored in a data structure that can be easily modified in the subsequent processes. A UNL-to-Amharic word dictionary is also prepared, containing the root forms of Amharic words. The Amharic equivalent root word and attributes of each node in a parsed UNL expression are fetched from the dictionary to update the head word and attributes of the corresponding node. The translated Amharic root words are then locally reordered and marked based on Amharic grammar rules. When the nodes are ready for morphology generation, the proposed system uses Amharic morphology data sets to handle the generation of noun, adjective, pronoun and verb morphology. Finally, function words are inserted into the inflected words so that the output matches a natural language sentence. The proposed system was evaluated on a dataset of 142 UNL expressions. Subjective tests of adequacy and fluency were performed, and a quantitative error analysis was carried out by calculating the Word Error Rate (WER). The analysis shows that the proposed system generates 71.4% intelligible sentences and 67.8% sentences faithful to the original UNL expression. The system achieved a fluency score of 3.0 and an adequacy score of 2.9 (both on 4-point scales), with a word error rate of 28.94%. These scores can be improved further by improving the rule base and lexicon.
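    The node-net building step can be pictured as parsing UNL relation lines into nodes and labeled edges. The toy sketch below assumes a simplified rel(head, dependent) syntax; real UNL expressions carry restrictions and richer attribute lists.

    ```python
    # A toy sketch of node-net construction from simplified UNL relation lines.
    import re

    REL = re.compile(r"(\w+)\((.+?),\s*(.+?)\)")

    def parse_unl(lines):
        nodes, edges = {}, []
        for line in lines:
            m = REL.match(line.strip())
            if not m:
                continue
            rel, head, dep = m.groups()
            for uw in (head, dep):                       # register each universal word
                nodes.setdefault(uw, {"word": uw.split(".")[0]})
            edges.append((rel, head, dep))               # labeled edge of the node-net
        return nodes, edges

    nodes, edges = parse_unl(["agt(read.@entry.@past, boy.@def)",
                              "obj(read.@entry.@past, book)"])
    print(edges)  # [('agt', 'read.@entry.@past', 'boy.@def'), ('obj', ..., 'book')]
    ```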
  • Item
    Integrated Caching and Prefetching on Dynamic Replication to Reduce Access Latency for Distributed Systems
    (Addis Ababa University, 7/13/2021) Binalf, Yilkal; Libsie, Mulugeta (PhD)
    Distributed computing is a rapidly developing IT technology in which systems connect to one another via the network to improve performance. Thanks to distributed systems technology, workers from all over the world can collaborate within a single company, and customers can access data and receive service as if they were in the same location. However, as the number of users and organizations requesting and delivering these services grows, access latency becomes a problem; response time latency is one of the major problems of distributed systems. We therefore developed the integrated Caching and Prefetching on Dynamic Replication (CPDR) algorithm, which reduces access latency in distributed computing environments. The Cacher, Prefetcher and Replicator are the three main components of the developed system. The Cacher contains one further component, the Notifier, which holds the Prefetcher's status and saves time when the Prefetcher is not active and the requested data is unavailable. Furthermore, the Cacher, Prefetcher and Replicator each have a manager component containing the algorithms for controlling cache storage, prefetching data, replicating data and determining data placement. We evaluated our algorithm under various scenarios, covering the minimum and maximum capacity of the computing environment as well as different requirements of incoming jobs, against caching, prefetching, dynamic replication, caching combined with prefetching, caching combined with dynamic replication, and prefetching combined with dynamic replication. The proposed algorithm outperforms these counterparts in both response time and storage utilization.
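    The sketch below illustrates one plausible reading of the lookup path: an LRU Cacher, a Notifier exposing the Prefetcher's status, and a replica fallback. The component interfaces are assumptions made for illustration, not the thesis's actual design.

    ```python
    # A simplified sketch of a cache-then-replica lookup with a prefetch notifier.
    from collections import OrderedDict

    class Cacher:
        """LRU cache managed by the Cacher's manager component (simplified)."""
        def __init__(self, capacity=128):
            self.store, self.capacity = OrderedDict(), capacity

        def get(self, key):
            if key in self.store:
                self.store.move_to_end(key)        # LRU touch on a hit
                return self.store[key]
            return None

        def put(self, key, value):
            self.store[key] = value
            self.store.move_to_end(key)
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)     # evict least recently used

    def lookup(key, cacher, notifier, fetch_from_replica, prefetch):
        value = cacher.get(key)
        if value is None:                          # miss: fetch from the nearest replica
            value = fetch_from_replica(key)
            cacher.put(key, value)
        if notifier.prefetcher_active:             # Notifier exposes Prefetcher status
            prefetch(key)                          # queue likely-next items in background
        return value
    ```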
  • Item
    Automatic Soybean Quality Grading Using Image Processing and Supervised Learning Algorithms
    (Addis Ababa University, 10/12/2021) Hassen, Muhammed; Assabie, Yaregal (PhD)
    Soybean is one of the most important oilseed crops of the world, requiring temperatures of 25 to 30°C for growth and proper nodulation. Due to its high protein content and nutritional quality, soybean is widely used in food preparation, animal feed and industry: it is an input for food products like soy milk and for industrial products such as paper, plastics and cosmetics. Soybean in Ethiopia is traded through the Ethiopian Commodity Exchange, both domestically and for export. Determining the quality grade of soybean is crucial in the trading process; it improves the production of quality soybeans and helps producers stay competitive in the market. At the Ethiopian Commodity Exchange this process is done manually, which makes it inefficient, inconsistent and vulnerable to subjectivity. As a solution, this thesis proposes automated quality grading of soybean using image processing techniques and supervised learning algorithms. Image acquisition, image pre-processing, image segmentation, predicting the soybean type and determining the grade are the major steps followed. For pre-processing, a median filter removes noise and a modified unsharp masking technique sharpens the acquired soybean image. For segmentation, a modified Otsu threshold method is applied to the color image. Nineteen characteristic parameters are extracted from each sample: 7 morphological, 6 color and 6 texture features. Three supervised learning classifiers are applied and compared: a support vector machine, an artificial neural network and a convolutional neural network. Experimental results show that a one-dimensional convolutional neural network outperforms the others, with an accuracy of 93.71% on test data collected from the Ethiopian Commodity Exchange. We conclude that the CNN is superior to the other supervised learning algorithms, and that using aggregated features is better than using a single type of feature.
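    A minimal sketch of the pipeline's core steps, median filtering, Otsu segmentation and an SVM over extracted features, is given below; the two features shown are a small illustrative subset of the nineteen used in the study.

    ```python
    # A hedged sketch of the grading pipeline using OpenCV and scikit-learn.
    import cv2
    import numpy as np
    from sklearn.svm import SVC

    def features(path):
        img = cv2.imread(path)
        img = cv2.medianBlur(img, 5)                      # suppress salt-and-pepper noise
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu segmentation
        area = float(np.count_nonzero(mask))              # simple morphological feature
        mean_bgr = cv2.mean(img, mask=mask)[:3]           # simple colour features
        return [area, *mean_bgr]

    # Usage sketch: one feature row per sample image, grades assigned by experts.
    # X_train = [features(p) for p in image_paths]; y_train = expert_grades
    # clf = SVC(kernel="rbf").fit(X_train, y_train)
    # print(clf.predict([features("new_sample.jpg")]))
    ```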
  • Item
    Automatic Fraud Detection Model from Customs Data in Ethiopian Revenues and Customs Authority
    (Addis Ababa, Ethiopia, 2013-03) Muhammed, Meriem; Hailemariam, Sebsibe (PhD)
    Customs, one of the three wings of the Ethiopian Revenues and Customs Authority (ERCA), is established to secure national revenues by controlling imports and exports as well as collecting governmental tax and duties. This research focuses on the identification, modeling and analysis of various conflicting issues that Ethiopian customs faces. One of the major problems identified during problem understanding is controlling and managing the fraudulent behavior of foreign traders. Declarants engage in various types of fraudulent activities, which creates the need for serious inspection of declarations; at the same time, the huge number of declarations per day demands a significant amount of human resources and time. Recognizing this critical problem of the government, ERCA adopted the Automated System for Customs Data (ASYCUDA). ASYCUDA attempts to minimize the problem by recommending a risk level for each declaration using a selectivity method that uses five parameters from the declarants' information. The fundamental problem with ASYCUDA risk leveling is that it restricts the variables used to assign the risk level, which may direct a declaration into the incorrect channel. This research proposes a machine learning approach to model the fraudulent behavior of importers through the identification of appropriate parameters from the observed data, to improve the quality of service at Customs, ERCA. The proposed automated fraud detection models predict the fraud behavior of importing cargos, minimizing the problems associated with ASYCUDA risk leveling. The models were built with machine learning techniques using past data collected from ERCA customs records. The analysis was done on inspected cargo records comprising 74,033 instances and 24 attributes. Four prediction models were proposed. The first is a fraud prediction model, which predicts whether an incoming cargo is fraudulent or not. The second is a fraud category prediction model, which identifies the specific fraud category among the ten identified categories. The third is a fraud level prediction model, which classifies the fraud level as high or low. The last is a fraud risk level prediction model, which classifies the risk level of importing cargos as high, medium or low. Moreover, following recommendations from the IEEE literature, four machine learning approaches were tested for each of the prediction models: C4.5, CART, KNN and Naive Bayes. Based on the results obtained through various experimental analyses, C4.5 was found to be the best algorithm for building all of the prediction models: the accuracies obtained in the first, second, third and fourth scenarios using C4.5 are 93.4%, 84.4%, 89.4% and 86.8%, respectively. The next best algorithm, Classification and Regression Trees (CART), achieved accuracies of 92.9%, 80.1%, 89.4% and 85.3% for the four scenarios, respectively. Both C4.5 and CART perform better for fraud prediction and fraud level classification than for fraud category and risk level prediction, while the Naive Bayes statistical approach performed very poorly. Key words: Fraud prediction, fraud category prediction, fraud level prediction, fraud risk level prediction, classification, machine learning algorithm, ASYCUDA.
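    The sketch below shows the shape of such a tree-based fraud prediction model in scikit-learn. C4.5 is not available there, so a CART-style DecisionTreeClassifier with an entropy criterion stands in; the toy features and labels are placeholders for the 24-attribute ERCA data.

    ```python
    # A minimal decision-tree sketch of the fraud prediction model (illustrative data).
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # X: one row per inspected cargo declaration; y: 1 = fraudulent, 0 = clean
    X = [[12000, 3, 1], [400, 1, 0], [98000, 7, 1], [150, 1, 0]]  # toy placeholder data
    y = [1, 0, 1, 0]

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    clf = DecisionTreeClassifier(criterion="entropy")  # information gain, as in C4.5
    clf.fit(X_tr, y_tr)
    print(accuracy_score(y_te, clf.predict(X_te)))
    ```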
  • Item
    Design and Implementation of Afaan Oromo Spell Checker
    (Addis Ababa, Ethiopia, 2013-06) Olani, Gaddisa; Midekso, Dida (PhD)
    Developing language applications or localizing software is a resource-intensive task that requires the active participation of stakeholders with various backgrounds (i.e., from linguistic and computational perspectives). With a constant increase in the amount of electronic information and the diversity of languages used to produce it, these challenges get compounded. Various researches in the fields of computational linguistics and computer science have been carried out, while still many more are on their way, to alleviate such problems; a spell checker is one potential candidate. The use of computers for document preparation is one of the many tasks undertaken by different organizations, and typing texts into word processing tools may result in spelling errors. Hence, text processing application software includes spell checkers. Integrating a spell checker into a word processor reduces the amount of time and energy spent finding and correcting misspelled words. However, such tools are not available for the Afaan Oromo language, a Lowland East Cushitic language of the Afro-Asiatic family spoken in Ethiopia. In this thesis, we describe the design and implementation of an Afaan Oromo spell checker. A morphology-based computational model (i.e., dictionary look-up with morphological rules) was employed to design and develop the Afaan Oromo Spell Checker (AOSC). Algorithms that take the morphological properties of Afaan Oromo into consideration were developed from scratch and applied, as there are no previous such attempts. The proposed system was evaluated using two datasets of different sizes. The experimental results show that the lexicon size and the rules in the knowledge base play a vital role in recognizing valid input words, flagging invalid words and generating correct suggestions for misspelled words. In general, the algorithms and techniques used in this study obtained good performance compared to resource-rich languages like English. The results encourage further research in the area, especially with the aim of developing a full-fledged Afaan Oromo spell checker.
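    The dictionary-look-up-with-rules idea can be sketched as below: strip a suffix to find a stem, flag unknown words, and rank nearby dictionary entries as suggestions. The tiny lexicon and suffix list are invented placeholders; real Afaan Oromo morphology is far richer.

    ```python
    # A toy sketch of dictionary look-up with morphological rules plus suggestions.
    import difflib

    LEXICON = {"barata", "mana", "guddaa"}       # tiny stand-in dictionary
    SUFFIXES = ["oota", "icha", "tti", "n"]      # hypothetical inflectional suffixes

    def is_valid(word):
        if word in LEXICON:
            return True
        for suf in SUFFIXES:                     # morphological rule: stem + suffix
            if word.endswith(suf) and word[: -len(suf)] in LEXICON:
                return True
        return False

    def suggest(word, n=3):
        return difflib.get_close_matches(word, LEXICON, n=n)  # edit-distance-style ranking

    for w in ["barata", "manatti", "mena"]:
        print(w, "OK" if is_valid(w) else f"misspelled? try {suggest(w)}")
    ```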
  • Item
    Protocol of a Systematic Mapping Study of Requirements Engineering Approaches for Big Data Project Development
    (Addis Ababa University, 2021) Regane, Belachew; Beecham, Sarah; Lemma, Dagmawi; Power, Norah
  • Item
    Flow-Based E-mail Spam Detection
    (Addis Ababa University, 2011-11) Hailu, Zelalem; Libsie, Mulugeta (PhD)
    The volume of unsolicited commercial e-mail, also known as spam, is increasing so rapidly that over 90% of all e-mail messages are spam; an average of 200 billion spam e-mails are sent each day. The problem is exacerbated by the fact that many of these spam messages contain some sort of malicious code. In addition to wasting users' time and posing attack threats, the huge amount of spam also illegitimately consumes bandwidth and storage space. There have been efforts over the years to combat spam; the most popular are based on e-mail content analysis and IP address reputation. Techniques based on content analysis are falling behind because spammers can trick such filters by using legitimate-looking words in their content, and the introduction of image and PDF spam is another headache for content-based filters. Filters based on IP address reputation are also not coping well because of the dynamic nature of IP addresses and the difficulty of hunting down malicious addresses before significant damage is done. Our approach is to filter out spam messages before they are delivered to the user's inbox, based on packet flow characteristics. This is a complementary approach that can be used alongside other techniques to reduce the number of spam messages reaching users' inboxes. Our approach is based on over 55,000 packet flow records. We identified nine features that best differentiate spam from legitimate e-mail. Based on these attributes and a classification model with an accuracy of 99.5% and a false-positive rate of 2.6%, we developed a ranking algorithm that scores a given flow into one of five categories. Based on these scores, a given packet flow is accepted, rejected, or passed on for further examination by other techniques. In addition to not relying on e-mail content or IP addresses to filter spam, our method also avoids the wastage of resources such as bandwidth and storage space by spam messages.
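    The ranking step can be pictured as banding a classifier's spam probability into five categories that drive the accept/reject/inspect decision. The sketch below is illustrative; the band thresholds and category names are assumptions, not the thesis's actual scoring rules.

    ```python
    # An illustrative five-category flow-ranking step on top of a trained classifier.
    def rank_flow(spam_probability):
        """Band a model's spam probability into one of five categories."""
        bands = [(0.2, "accept"), (0.4, "likely-ham"),
                 (0.6, "uncertain"), (0.8, "likely-spam")]
        for upper, label in bands:
            if spam_probability < upper:
                return label
        return "reject"

    def decide(flow_features, model):
        p = model.predict_proba([flow_features])[0][1]   # e.g. a scikit-learn classifier
        label = rank_flow(p)
        if label == "accept":
            return "deliver to inbox"
        if label == "reject":
            return "drop"
        return "pass to content / IP-reputation filters"  # complementary techniques

    print(rank_flow(0.05), rank_flow(0.5), rank_flow(0.97))  # accept uncertain reject
    ```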
  • Item
    Implementing Enhanced AODV Protocol to Prevent Black hole Attack in Mobile Ad hoc Networks
    (Addis Ababa University, 2011-11) Gebremeskel, Solomon; Ejigu, Dejene (PhD)
    Mobile ad-hoc networks are self-configuring networks of mobile devices that can be established without any network infrastructure. The fact that mobile ad-hoc networks lack central administration and use wireless links for communication makes them very susceptible to various malicious attacks. The black hole attack is one of the severe security threats in ad-hoc networks and can easily be performed by exploiting vulnerabilities of on-demand routing protocols such as AODV. We implemented a solution to prevent black hole attacks imposed by both single and multiple black hole nodes. Intrusion Detection using Anomaly Detection (IDAD) works on the general principle of Intrusion Detection Systems (IDS): an IDAD system distinguishes the anomalous activities of an adversary from the normal activities of non-malicious mobile nodes. The identification process involves comparing the communication attributes of each mobile node participating in a given ad-hoc network. The most distinguishing characteristic of IDAD is that it works on a no-peer-trust principle. Unlike existing black hole prevention techniques that rely on the cooperation of mobile nodes to announce the presence of an intrusion, IDAD enables each mobile node to protect itself from an intruder. The prevention mechanism was implemented using the Network Simulator version 2 (NS2); a Java parser program and Tracegraph were used to analyze the simulation results. The post-analysis results show that the implemented prevention method maximizes network performance by effectively preventing black hole attacks against mobile ad-hoc networks while minimizing the generation of control (routing) packets.
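    One way to picture a no-peer-trust anomaly check is the sketch below, which flags the abnormally inflated destination sequence numbers that black hole nodes typically advertise in route replies. This is a common heuristic used here only for illustration; the thesis's IDAD attribute set is richer, and the margin value is an assumption.

    ```python
    # A schematic sketch of a per-node anomaly check on AODV route replies (RREPs).
    class IDADNode:
        def __init__(self, margin=50):
            self.last_seen_seq = {}      # destination -> highest legitimate seq number
            self.margin = margin

        def accept_rrep(self, dest, advertised_seq, sender):
            expected = self.last_seen_seq.get(dest, 0)
            if advertised_seq > expected + self.margin:
                print(f"anomaly: {sender} claims seq {advertised_seq} for {dest}")
                return False             # treat sender as a suspected black hole
            self.last_seen_seq[dest] = max(expected, advertised_seq)
            return True                  # install the route as usual

    node = IDADNode()
    node.accept_rrep("10.0.0.7", 12, "n3")    # plausible reply: accepted
    node.accept_rrep("10.0.0.7", 9000, "n9")  # inflated seq number: rejected
    ```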
  • Item
    Web Element Locator Algorithm for Dynamic Web Application Testing Using Machine Learning
    (Addis Ababa University, 8/10/2021) Bayew, Mikiy; Kifle, Mesfin (PhD)
    Software testing is the phase of the software development life cycle concerned with detecting and discovering errors in software; it ensures the correctness, completeness and quality of the software that has been developed. Automated testing is a good strategy for reducing testing effort for web applications. When testing web applications, testers write web element locators that find web elements on a page via queries in test scripts. Developers apply changes to their web applications to meet new requirements, add new functionality, fix bugs, and so on. As a result, web element attributes such as the identifier (ID), name and classes of a dynamic web application keep changing. A simple modification of the application programming interface (API) affects locators, making it impossible to select the desired web elements and causing tests to fail. The main reason for test case breakage is the failure of element locators on the dynamic web application (DWA), which is a major disruption for test automation users. To repair these fragile tests, engineers must debug and rewrite the test cases, because the locators used to select the web elements may no longer be valid in the updated version; this adds time to testing dynamic web applications. In this research, to improve the accuracy and performance of DWA testing, we propose a web element locator algorithm (WELA) using machine learning that automatically identifies web elements. The algorithm covers the structural, logical and presentation changes of web elements that may break a locator. It identifies similar patterns based on web element features to adjust the locator according to the change, making tests more reliable and maintainable by reducing the time and effort required to maintain locators. The testing process begins after the correct web elements have been identified. An experiment on ten web applications assessed the effectiveness of WELA in terms of accuracy and performance. The results are promising: the proposed approach effectively repairs 97% of broken web test scripts and generates tests with the relatively shortest execution time on evolved versions of a DWA.
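    The locator-repair idea can be sketched as scoring candidate DOM elements by weighted attribute overlap with the element's recorded features and relocating the best match. The weights and threshold below are assumptions made for illustration, not WELA's learned values.

    ```python
    # A hedged sketch of attribute-overlap locator repair over candidate elements.
    def similarity(recorded, candidate, weights=None):
        weights = weights or {"id": 3.0, "name": 2.0, "class": 1.5, "text": 1.0, "tag": 0.5}
        score = sum(w for attr, w in weights.items()
                    if recorded.get(attr) and recorded.get(attr) == candidate.get(attr))
        return score / sum(weights.values())

    def relocate(recorded, dom_elements, threshold=0.15):
        best = max(dom_elements, key=lambda el: similarity(recorded, el))
        return best if similarity(recorded, best) >= threshold else None  # true breakage

    recorded = {"id": "btn-save", "tag": "button", "text": "Save"}
    dom = [{"id": "btn-save-v2", "tag": "button", "text": "Save"},
           {"id": "btn-cancel", "tag": "button", "text": "Cancel"}]
    print(relocate(recorded, dom))  # picks the renamed Save button via tag/text overlap
    ```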
  • Item
    Design and Implementation of a Web Based E-learning Support System (for Technical and Vocational Education and Training (TVET))
    (Addis Ababa, Ethiopia, 2013-08) Hailemariam, Dawit; Midekso, Dida (PhD)
    Outcome-based TVET training is important in today's development of the economic sector. Young people and technologies should be ready for a country to fit into the dynamically complex learning/training and working environment. The case of Ethiopia is no exception: the country has seen steady growth in TVET training since its implementation in 2001/02. Compared to developed countries, TVET training in Ethiopia has not yet reached the required level of development. This implies that more work should be carried out to meet present and future needs for skilled and well-trained manpower, as required by the various training and education fields and the labor market. The purpose of this work was to study the design and implementation of a web based eLearning support system for TVET training programs. System requirements were rigorously collected from five TVET institutions in Addis Ababa, and related literature and software products used in other countries were consulted. The design and implementation of a prototype system were done in accordance with the identified functional and non-functional requirements. The prototype was tested with data from the five TVET institutes in Addis Ababa. It was found that the system enables trainers and trainees to register, download appropriate learning materials, obtain recent information about TVET training, see results (progress charts) for a unit of competency, receive notifications by mobile, download and submit assignments, download teacher guides, curricula and TVET strategy documents, and generate various reports.
  • Item
    Context Aware Pervasive Healthcare System for HIV/AIDS Patients (CAPHS)
    (Addis Ababa University, 2011-11) Mequanint, Alemitu; Ejigu, Dejene (PhD)
    These days, healthcare services are benefiting from the application of pervasive computing systems. Advances in wireless technologies, such as intelligent mobile devices and wearable networks, can improve communication among patients, physicians and other healthcare workers, and enable the delivery of accurate medical information anytime, anywhere, thereby reducing errors and improving access. However, the transmission of vital signs, the frequency of that transmission, network communication cost, context refinement, and the management of large contexts are still open problems in pervasive healthcare systems. In this work, we propose a context aware pervasive healthcare system for HIV/AIDS (Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome) patients (CAPHS). The architecture of the system consists of three constituents: the patient unit, the healthcare unit and the doctor/nurse unit. The patient unit consists of a group of body sensors that detect vital sign data from patients and transmit it to the patient's smart phone via Bluetooth. After preprocessing, the smart phone transmits vital sign context information to the healthcare unit via the Internet for further processing and reasoning if there is any abnormality. This preprocessing determines the frequency of vital sign transmission and reduces the network communication cost of transmitting vital sign data. At the same time, it increases the efficiency of the healthcare unit by reducing its context refinement work. The healthcare unit is rich in ART (Antiretroviral Therapy) ontology knowledge that we developed. For better management of large contexts, we use a hybrid context management approach in which the high-level schema ontology is stored in OWL/RDF (Web Ontology Language/Resource Description Framework) format and the ontology instances are stored in an ordinary relational database. The doctor/nurse unit is the thinnest of all units in terms of the number of components; it communicates with the healthcare unit via Internet/SMS so that physicians can remotely monitor their patients. We have developed a prototype implementation of the system, and the results are promising.
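    The preprocessing on the patient's phone can be pictured as forwarding only out-of-range readings, which directly reduces transmission frequency and network cost. The sketch below is schematic, and its normal ranges are illustrative only, not clinical values.

    ```python
    # A schematic sketch of patient-unit preprocessing: transmit only abnormal readings.
    NORMAL_RANGES = {"heart_rate": (60, 100), "temperature": (36.1, 37.5), "spo2": (95, 100)}

    def preprocess(reading):
        """Return the subset of a reading that the healthcare unit should see."""
        abnormal = {sign: value for sign, value in reading.items()
                    if sign in NORMAL_RANGES and not
                    (NORMAL_RANGES[sign][0] <= value <= NORMAL_RANGES[sign][1])}
        return abnormal or None          # None: nothing to transmit this cycle

    print(preprocess({"heart_rate": 72, "temperature": 36.8, "spo2": 98}))   # None
    print(preprocess({"heart_rate": 124, "temperature": 38.9, "spo2": 97}))  # two alerts
    ```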
  • Item
    Automatic Construction of Amharic Semantic Networks (ASNet)
    (Addis Ababa University, 2013-03) Tefera, Alelign; Assabie, Yaregal (PhD)
    Semantic networks are becoming a popular topic these days. Even though this popularity is mostly related to the idea of the semantic web, it is also related to natural language applications. Semantic networks allow search engines to search not only for the keywords given by the user but also for related concepts, and to show how those relations are made. Knowledge stored as semantic networks can be used by programs that generate text from structured data. Semantic networks are also used for document summarization, by compressing data semantically, and for document classification using the knowledge stored in them. As a result, semantic networks have become key components of many NLP applications. In this thesis, we focus on the construction of semantic networks for Amharic text. We developed an Amharic WordNet as the initial knowledge base for the system and extracted, from free text, the intervening word patterns between pairs of concepts in the WordNet for a specific relation. For each pair of concepts whose relationship is contained in the Amharic WordNet, we search the corpus for text snapshots between the concepts. Each returned text snapshot is processed to extract all the patterns of n-gram words between the two concepts. We used the WordSpace model to extract semantically related concepts, and the extracted text patterns to identify the relations among these concepts. "Part-of" and "type-of" relations are very popular and are frequently found between concepts in any corpus, so we designed our system to extract these two relations. The system was tested in three phases with different datasets from the Ethiopian News Agency and the Walta Information Center. The accuracy of the system in extracting pairs of concepts with "type-of" and "part-of" relations is 68.5% and 71.7%, respectively.
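    The pattern-extraction step can be sketched as collecting the intervening word sequences between a known concept pair across a corpus; the collected patterns then vote on relations for unseen pairs. The sentences below are English placeholders for the Amharic data.

    ```python
    # A small sketch of collecting intervening n-gram patterns between concept pairs.
    from collections import Counter

    def intervening_patterns(sentences, pair):
        a, b = pair
        patterns = Counter()
        for sent in sentences:
            words = sent.split()
            if a in words and b in words:
                i, j = words.index(a), words.index(b)
                if i < j:
                    patterns[" ".join(words[i + 1:j])] += 1  # text snapshot between pair
        return patterns

    corpus = ["a wheel is part of a car", "an engine is part of a car"]
    print(intervening_patterns(corpus, ("wheel", "car")))  # Counter({'is part of a': 1})
    ```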
  • Item
    Concept-based Amharic Documents Similarity (CADS) Measure
    (Addis Ababa University, 2013-12) Abera, Addisalem; Getahun, Fekade (PhD)
    Similarity measures are significant in NLP applications such as search engines, information extraction and document classification. Such NLP applications have been implemented for the Amharic language; however, most of them rely on simple matching techniques or probabilistic methods to measure similarity. These approaches do not always accurately capture conceptual relatedness as measured by humans, and the studies that do consider the semantic nature of a document do not handle word ambiguity. In this research, we propose Concept-based Amharic Document Similarity (CADS), built on AmhWordNet. The objective is an effective document similarity measure that considers issues such as polysemy, synonymy and the semantic relationships between words. The main components of the proposed system are AmhWordNet and the Concept-based Similarity Measure (CSM). CSM consists of Word Sense Disambiguation (WSD), Concept Tree Extraction and Semantic Similarity Measure modules. AmhWordNet is used as input during concept tree extraction and in the WSD module. The extracted concept tree, together with the WSD module, is used to find the semantic similarity between words; word similarities are then used to compute sentence similarity, and document similarity is finally computed from the sentence similarities. The performance of CADS is evaluated using precision, recall and F-measure. CADS without WSD (CADS WoWSD), Pointwise Mutual Information (PMI), Jaccard and Cosine similarity measures were also implemented so that the five systems could be compared. According to the results of our experiments, the proposed system performs better than the existing ones.
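    Concept-tree-based word similarity is often computed from path distance to the closest common ancestor, lifted to sentences by best-match averaging. The sketch below illustrates this general idea with a tiny invented taxonomy standing in for AmhWordNet; it is not the CADS formula itself.

    ```python
    # A toy sketch of path-based word similarity over a concept (hypernym) tree.
    parent = {"dog": "animal", "cat": "animal", "animal": "entity", "car": "entity"}

    def path_to_root(word):
        path = [word]
        while path[-1] in parent:
            path.append(parent[path[-1]])
        return path

    def word_sim(a, b):
        pa, pb = path_to_root(a), path_to_root(b)
        shared = set(pa) & set(pb)
        if not shared:
            return 0.0
        lca = min(shared, key=lambda c: pa.index(c) + pb.index(c))  # closest common ancestor
        return 1.0 / (1 + pa.index(lca) + pb.index(lca))

    def sentence_sim(s1, s2):
        return sum(max(word_sim(w, v) for v in s2) for w in s1) / len(s1)

    print(word_sim("dog", "cat"))                 # 1/3: siblings under "animal"
    print(sentence_sim(["dog"], ["cat", "car"]))  # best-match averaging over words
    ```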
  • Item
    Development of Unsupervised Telecom Customers Clustering Model using Customer Detail Records (CDR) the Case of Ethiotelecom
    (Addis Ababa University, 3/15/2017) Banchalem Abebaw; Libsie, Mulugeta (PhD)
    IPv6 is the new version of the Internet Protocol developed by the IETF, designed especially to resolve IPv4 issues such as scalability, performance and reliability. Although the new version is ready for use, it will clearly take years to transition fully from IPv4 to IPv6, so the two versions will have to be used together for a long time. We therefore have to investigate the transition mechanisms that can be used during the transition period to achieve a transition with minimum complication. ethio telecom has an IPv4-only core network: for any LAN segment in Ethiopia to have IPv6-enabled service over ethio telecom's existing core network, a mechanism must be chosen and applied to make IPv6 services available. This thesis analyzes IPv6 Provider Edge (6PE) as an IPv4-to-IPv6 transition mechanism. An attempt is made to evaluate the suitability of the mechanism for the current ethio telecom core network, and comparisons are made with other introductory mechanisms. Configurations and logical scenarios were designed to demonstrate 6PE, using the Graphical Network Simulator (GNS3) to simulate the scenarios. The route learning and route distribution of the IPv6 LANs were evaluated through the tests done with GNS3. The outcomes of the thesis provide insight for choosing an appropriate mechanism for ethio telecom, along with ideas about network capacity planning and step-by-step migration to IPv6.
  • Item
    Hate Speech Detection Framework from Social Media Content the Case of Afaan Oromoo Language
    (Addis Ababa University, 12/2/2021) Guta, Lata; Gizaw, Solomon (PhD)
    Hate speech on social media has unfortunately become a common occurrence in the Ethiopian online community, largely due to advances in mobile computing and the Internet. The connectivity and availability of social media platforms allow people to interact and exchange experiences easily. However, the anonymity and flexibility afforded by the Internet have made it easy for users to communicate aggressively. Hate speech affects society in many ways: it harms the mental health of targeted audiences, damages social interaction, and leads to violence and destruction of property. Regularly identifying text containing hate speech is a difficult task for humans; it is tedious and time consuming. To address the newly emerged propagation of hate speech on social media sites, recent studies employ machine learning algorithms and feature engineering techniques to detect hate speech messages automatically. For the Afaan Oromoo language, there is prior work on sentiment analysis using a machine learning approach, but it addresses opinion classification rather than hate/neutral classification. In this research, a new Afaan Oromoo hate speech dataset was collected from Facebook and labeled into binary classes. TF-IDF, n-gram and word2vec features are used as inputs to the machine learning models. We evaluate the models using an 80%/20% train-test split, comparing them with accuracy, precision, recall and F1-score performance metrics. The model based on a linear SVM with TF-IDF combined with n-grams performs slightly better than the other models: the Support Vector Machine (SVM) achieves the highest accuracy of 96%, which is a promising result.
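    The best-performing configuration reported above, TF-IDF over word n-grams feeding a linear SVM, maps directly onto a scikit-learn pipeline, as the sketch below shows; the four example comments are invented placeholders, not items from the thesis's dataset.

    ```python
    # A minimal sketch of a TF-IDF + linear-SVM hate/neutral classifier.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    texts = ["nagaa fi jaalala", "insulting attack text",
             "baga gammaddan", "threatening slur"]      # invented placeholder comments
    labels = ["neutral", "hate", "neutral", "hate"]     # binary classes, as in the study

    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),  # TF-IDF combined with word n-grams
        LinearSVC(),
    )
    model.fit(texts, labels)
    print(model.predict(["another insulting attack text"]))
    ```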
  • Item
    Integrated Information Architecture in Support of Road Safety Organizations: The Case of Ethiopia
    (Addis Ababa, 2013)