Computer Science
Browsing Computer Science by Title
Now showing 1 - 20 of 380
Item A Framework for Detecting Multiple Cyberattacks in IoT Environment (Addis Ababa University, 2025-02-25) Yonas Mekonnen; Mesfin Kifle (PhD)
The Internet of Things (IoT) refers to the growing trend of embedding ubiquitous and pervasive computing capabilities into everyday objects through sensor networks and Internet connectivity. The growth and expansion of newly evolved cyberattacks and the heterogeneous nature of attack patterns have become a global concern and make it challenging to apply single-layer attack detection techniques to the IoT. This research identifies the lack of a detection framework as the major gap in detecting multiple cyberattacks, such as denial-of-service, distributed denial-of-service, and Mirai attacks, while accounting for multiple parameters at the same time. The proposed framework contains three modules: a data acquisition and preprocessing module, responsible for capturing and preprocessing network data to make it ready for model construction; an attack detection module, the core engine that orchestrates the detection of cyberattacks; and a notification module that displays the results in a dashboard. The study varied multiple parameters, including attack classes, network packet patterns, and three scaler types (no scaler, MinMax, and Standard); regardless of the other parameters used, the MinMax scaler, followed by the Standard scaler, gave better detection performance than models trained with no scaler. The proposed framework was trained and evaluated with CNN, hybrid, FFNN, and LSTM models, which achieved detection accuracies of 91.42%, 82.75%, 78.38%, and 74.83%, respectively; the CNN model performed best, followed by the hybrid and FFNN models.

Item A Hybrid Deep Learning-Based ARP Attack Detection and Classification Method (Addis Ababa University, 2023-12) Yeshareg Muluneh; Solomon Gizaw
The Address Resolution Protocol (ARP) is the most crucial protocol for mapping Internet Protocol (IP) addresses to Media Access Control (MAC) addresses, and vice versa, in local area network communication. ARP, however, is an unauthenticated, stateless protocol that lacks security features. It is therefore vulnerable to many attacks and can easily be exploited to gain unauthorized access to sensitive data by transmitting bogus ARP messages that poison the ARP caches of hosts within the local area network. These attacks may result in a loss of data integrity, confidentiality, and availability of an organization's information. Many researchers have attempted to detect ARP attacks using different methods; however, some of these approaches are not time-efficient, require considerable human effort and involvement, and incur high communication overhead. Other works use machine learning and deep learning methods, which offer better solutions for detecting ARP attacks, but those approaches still have a significant false alarm rate of 13%, a low attack detection rate, and a classification accuracy of 87%. This thesis work aims to solve those problems using a hybrid deep learning-based ARP attack detection and classification method. In this work, we used a Sparse Autoencoder for extracting important features and reducing the dimensionality of the input data, and a Convolutional Neural Network for attack detection and classification, to achieve the highest attack detection rate and classification accuracy with a minimized false alarm rate. To evaluate the performance of the proposed model, we used the open-source NSL-KDD benchmark dataset for training and testing. The results obtained from the implementation were measured against a single Convolutional Neural Network model using different evaluation metrics. The proposed approach scores an attack detection rate of 98.97%, a classification accuracy of 99.26%, and a minimum false alarm rate of 0.74%.
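Neither thesis publishes code here; as a rough, illustrative sketch of the hybrid pipeline the ARP abstract describes (a sparse autoencoder feeding a CNN classifier), the following uses TensorFlow/Keras with toy dimensions and random stand-in data in place of NSL-KDD. Layer sizes, epochs, and the five-class assumption are placeholders, not the thesis configuration.

```python
# Sketch of a sparse-autoencoder + CNN pipeline, loosely following the
# architecture described in the abstract. Dimensions and data are toy
# placeholders, not the thesis' actual NSL-KDD configuration.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

n_features, n_classes = 41, 5          # NSL-KDD has 41 features; 5 classes assumed
x = np.random.rand(256, n_features)    # stand-in for preprocessed traffic records
y = np.random.randint(0, n_classes, 256)

# 1) Sparse autoencoder: an L1 activity penalty encourages sparse codes.
inp = keras.Input(shape=(n_features,))
code = layers.Dense(16, activation="relu",
                    activity_regularizer=regularizers.l1(1e-5))(inp)
out = layers.Dense(n_features, activation="sigmoid")(code)
autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x, x, epochs=3, verbose=0)

# 2) CNN classifier on the compressed representation.
encoder = keras.Model(inp, code)
codes = encoder.predict(x, verbose=0)[..., np.newaxis]  # (N, 16, 1) for Conv1D
clf = keras.Sequential([
    layers.Conv1D(32, 3, activation="relu", input_shape=(16, 1)),
    layers.GlobalMaxPooling1D(),
    layers.Dense(n_classes, activation="softmax"),
])
clf.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
clf.fit(codes, y, epochs=3, verbose=0)
```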
Item A Model for Recognition and Detection of the Counterfeit of Ethiopian Banknotes using Transfer Learning (Addis Ababa University, 2024-06) Hailemikael Tesfaw; Ayalew Belay
Paper currency recognition systems play a pivotal role in various sectors, including banking, retail, and automated teller machines (ATMs). This paper presents a novel approach to the design and development of a paper currency recognition system using customized deep learning techniques. The proposed system uses image-processing algorithms to extract features from currency images, followed by customized convolutional neural network models for classification and counterfeit detection. The system is trained on a diverse dataset of currency images to ensure robustness and accuracy in recognizing various denominations and currencies. We implemented customized feature-learning architectures; to obtain the best accuracy and efficiency, we used ReLU and Softmax activations, the Adam optimizer, and sparse categorical cross-entropy as the loss function for both models as the training strategy. The data was collected from the National Bank of Ethiopia, Commercial Bank of Ethiopia, NIB International Bank, and Bank of Abyssinia. In the experiments, the alex_customed-design network recorded an accuracy of 99.82%.

Item Accessing Databases Using Amharic Natural Language (Addis Ababa University, 10/6/2020) Legesse, Beniyam; Assabie, Yaregal (PhD)
Nowadays, the day-to-day activities of human beings are highly dependent on information distributed in every part of the world. One major source of this information is the database, a collection of related data. To extract information from a database, one must formulate a Structured Query Language (SQL) query that is understood by the database engine. SQL is not known to everyone, as it requires studying and remembering its syntax and semantics; only professionals who have studied SQL can formulate queries to access a database. Human beings, on the other hand, communicate with each other using natural language, so it would be easier to access the content of a database using natural language, which in turn contributes to the field of natural language interfaces to databases. Since in many private and public organizations people perform their day-to-day activities in Amharic, and many of them are not skilled in formulating SQL queries, it would be better if users could directly extract information from a database using the Amharic language. This research accepts questions written in Amharic natural language and converts them to equivalent SQL queries. A dataset consisting of input words tagged with the appropriate output variables was prepared. Features that represent the Amharic questions are identified and given to a classifier for training. A stemmer, a morphological analyzer, and a preprocessor prepare the input question in the format required by the classifier. To identify appropriate query elements, the Query Element Identifier uses a dictionary prepared by applying the concept of semantic free grammar. The query constructor then builds the required SQL query from the identified query elements. A prototype called the Amharic Database Querying System was developed to demonstrate the idea raised by this research; testers from different departments with different mother tongues evaluated the performance of the system.
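As an illustration of the final query-construction step the abstract above describes, the toy sketch below maps classifier-labeled tokens to an SQL string. The tag names, the helper build_sql, and the example mapping are all invented for illustration; the thesis' actual tag set and dictionary are not reproduced here.

```python
# Toy illustration of the query-construction step: tokens already labeled
# as query elements (tags invented for illustration) become an SQL query.
def build_sql(tagged_tokens):
    cols = [w for w, t in tagged_tokens if t == "COLUMN"]
    table = next(w for w, t in tagged_tokens if t == "TABLE")
    # Pair each condition column with the value token that follows it.
    conds = [(w, v) for (w, t), (v, _) in zip(tagged_tokens, tagged_tokens[1:])
             if t == "COND_COLUMN"]
    sql = f"SELECT {', '.join(cols)} FROM {table}"
    if conds:
        sql += " WHERE " + " AND ".join(f"{c} = '{v}'" for c, v in conds)
    return sql

# e.g. an Amharic question mapped (by the classifier) to elements like:
tagged = [("salary", "COLUMN"), ("employee", "TABLE"),
          ("name", "COND_COLUMN"), ("Abebe", "VALUE")]
print(build_sql(tagged))  # SELECT salary FROM employee WHERE name = 'Abebe'
```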
Item Afaan Oromo Automatic News Text Summarizer Based on Sentence Selection Function (Addis Ababa University, 2013-11) Berhanu, Fiseha; Hailemariam, Sebsibe (PhD)
The existence of the World Wide Web and advances in digital devices have caused an information explosion. Readers are overloaded with lengthy texts where a shorter version would suffice, and this abundance of information requires efficient tools to handle. An automatic text summarizer is one of the tools used to shorten lengthy documents and alleviate this kind of problem. This work focuses on developing an efficient extractive Afaan Oromo automatic news text summarizer through the systematic integration of features: sentence position, keyword frequency, cue phrases, a sentence length handler, the occurrence of numbers, and expressions of time, date, and month in sentences. The data that supported system development, including abbreviations, synonyms, stop words, suffixes, numbers, and names of times, dates, and months, was collected from both secondary and primary sources. In addition, 350 English cue phrases were collected and translated into 729 Afaan Oromo cue phrases. For validation and testing, 33 different newspaper topics were collected; 20 of them were used for validation, while the remaining 13 were employed for testing. A total of 110 respondents participated in the preparation of the validation and testing data corpus. Besides, the open-source C# version of Open Text Summarizer was selected as the tool to develop the system. The system was evaluated on seven experimental scenarios, both subjectively and objectively. The subjective evaluation focuses on the structure of the summary, such as referential integrity and non-redundancy, coherence, and informativeness; the objective evaluation uses metrics such as precision, recall, and F-measure. The result of the subjective evaluation is 88% informativeness, 75% referential integrity and non-redundancy, and 68% coherence. Owing to the added features, techniques, and experiments applied in this work, the system achieved an F-measure of 87.47%, outperforming the previous work by 26.95%. Keywords: Afaan Oromo, Automatic News Text Summarizer, Cue Phrase, Sentence Selection Function

Item Afaan Oromo List, Definition and Description Question Answering System (Addis Ababa University, 4/14/2016) Fita, Chaltu; Midekso, Dida (PhD)
Information is very important in our day-to-day activities. Technology plays an important role in satisfying human information needs through the Internet, where people ask questions and a system provides answers to their queries. With search engines, for instance, a user submits a query and the engine displays links to relevant web pages for that query. Question answering (QA) systems emerged as a better solution for getting the required information to the user with the help of information extraction techniques. QA systems have been developed for English, Amharic, Afaan Oromo, and other languages. The existing Afaan Oromo QA system was developed for answering factoid questions, where the answer is a named entity. In this thesis, a QA system is developed for answering list, definition, and description questions, which address more complex information needs. Document preprocessing, question analysis, document selection, and answer extraction are the components used to develop the QA system. Tokenization, case normalization, short word expansion, stop word removal, stemming, lemmatization, and indexing are the preprocessing tasks. Question classification is done using a rule-based approach. The subcomponents of document selection are document retrieval, used for retrieving relevant documents, and document analysis, used for filtering the retrieved documents. The answer extraction component has a sentence tokenizer for tokenizing the sentences returned by document analysis, and independent subcomponents for definition-description and list questions. The definition-description answer extractor (DDAE) contains a sentence extractor that extracts sentences from the sentence tokenizer; an answer selection algorithm selects the top six sentences from the scored and ranked sentences, and finally a sentence ordering algorithm orders them. The list answer extractor (LAE) extracts candidate answers through rules and gazetteers and then selects the answer. The system was tested using evaluation metrics. We used the percentage ratio to evaluate question classification, which classified 98% of questions correctly. The performance of document selection and answer extraction was tested using precision, recall, and F-score: the document selection component scored an F-score of 0.767, and the answer extraction component was evaluated with an average F-score of 0.653. Keywords: Afaan Oromo List, Definitional and Descriptional Question Answering, Rule Based Question Classification, Document Filtering, Sentence Extraction, Answer Selection
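A minimal sketch of a rule-based question classifier in the spirit of the component described above. The cue phrases and regex patterns are illustrative placeholders, not the thesis' actual Afaan Oromo rules.

```python
# Minimal rule-based question classifier. The cue phrases below are
# invented placeholders standing in for the thesis' real rules.
import re

RULES = [
    ("list",        [r"\btarreessi\b"]),                        # hypothetical "list ..." cue
    ("definition",  [r"\bjechuun maali\b", r"\bmaal jechuu\b"]),  # hypothetical "what does ... mean" cues
    ("description", [r"\bibsi\b", r"\bakkamitti\b"]),           # hypothetical "describe/how" cues
]

def classify_question(question):
    q = question.lower()
    for label, patterns in RULES:
        if any(re.search(p, q) for p in patterns):
            return label
    return "unknown"
```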
Item Afaan Oromo Named Entity Recognition Using Hybrid Approach (Addis Ababa University, 2015-03) Sani, Abdi; Midekso, Dida (PhD)
Named Entity Recognition and Classification (NERC) is an essential and challenging task in Natural Language Processing (NLP), particularly for a resource-scarce language like Afaan Oromo (AO). It seeks to classify words that represent names in text into predefined categories such as person name, location, organization, date, and time. This paper presents some attempts in this direction. Researchers have mostly applied machine learning to Afaan Oromo Named Entity Recognition (AONER), while none has used hand-crafted rules or a hybrid approach for the Named Entity Recognition (NER) task. This thesis work presents an AONER system using a hybrid approach, which contains machine learning (ML) and rule-based components. The rule-based component has parsing, filtering, grammar rules, whitelist gazetteers, blacklist gazetteers, and exact matching subcomponents; the ML component has an ML model and a classifier. We used the General Architecture for Text Engineering (GATE) developer tool for the rule-based component and Weka for the ML part. Using the algorithms and rules we developed, we identified named entities (NEs) in Afaan Oromo texts, such as names of persons, organizations, locations, and miscellaneous entities. Feature selection and rules are important factors in the recognition of Afaan Oromo named entities. Various rules were developed, including prefix, suffix, clue word, context, and first name and last name rules. We used an AONER corpus of size 27,588 developed by Mandefro [1]; from this corpus, 23,000 items were used for training and 4,588 for testing. We obtained an average result of 84.12% precision, 81.21% recall, and 82.52% F-score. Keywords: Named Entity Recognition, Named Entities, GATE Developer, Weka, Afaan Oromo
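The following sketch illustrates the flavor of such a rule-based component: a whitelist gazetteer lookup plus a clue-word rule (an honorific preceding a capitalized token suggests a person name). The honorifics, gazetteer entries, and example sentence are tiny illustrative samples, not the thesis' actual resources.

```python
# Simplified stand-in for a rule-based NER component: gazetteer lookup
# plus a clue-word rule. Lists are small illustrative samples only.
PERSON_CLUES = {"obbo", "aadde"}              # honorifics used as clue words
LOCATION_GAZETTEER = {"finfinnee", "adaamaa"}

def tag_tokens(tokens):
    tags = ["O"] * len(tokens)
    for i, tok in enumerate(tokens):
        low = tok.lower()
        if low in LOCATION_GAZETTEER:
            tags[i] = "LOCATION"
        elif i > 0 and tokens[i - 1].lower() in PERSON_CLUES and tok[:1].isupper():
            tags[i] = "PERSON"
    return list(zip(tokens, tags))

print(tag_tokens("Obbo Gammachuu gara Finfinnee deeme".split()))
```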
Item Afaan Oromo Named Entity Recognition Using Neural Word Embeddings (Addis Ababa University, 10/26/2020) Kasu, Mekonini; Assabie, Yaregal (PhD)
Named Entity Recognition (NER) is one of the canonical examples of sequence tagging: it assigns a named entity label to each word in a sequence. This task is important for a wide range of downstream applications in natural language processing. Two previous attempts have been made at Afaan Oromo NER, which automatically identifies and classifies proper names in text into predefined semantic types such as person, location, organization, and miscellaneous. However, those works relied heavily on hand-designed features. We propose a deep neural network architecture for Afaan Oromo NER based on context encoder and decoder models, using Bi-directional Long Short-Term Memory (BiLSTM) and Conditional Random Fields (CRF), respectively. In the proposed approach, we first generated neural word embeddings automatically using skip-gram with negative subsampling from an unsupervised corpus of 50,284 KB. The generated word embeddings represent words as semantic vectors, which are used as input features for the encoder and decoder models. Likewise, character-level representations are generated automatically using a BiLSTM from a supervised corpus of 768 KB; because of the character-level representations, the proposed model is robust to out-of-vocabulary words. For this study, we manually prepared an annotated dataset of 768 KB for Afaan Oromo NER and split it into 80% for training, 5% for testing, and 15% for validation. In total, we prepared 12,963 named entities, of which 10,370 (80%) were used for training, 648 (5%) for testing, and 1,944 (15%) for validation. Experimental results show that BiLSTM-CRF combined with pre-trained word embeddings, character-level representations, and regularization (dropout) performs better than models such as BiLSTM alone, or BiLSTM-CRF with only character-level representations or only word embeddings. Using the BiLSTM-CRF model with pre-trained word embeddings and character-level representations significantly improved Afaan Oromo NER, with an average F-score of 93.26% and an accuracy of 98.87%.

Item Afaan Oromo Search Engine (Addis Ababa University, 2010-11) Guta, Tesfaye; Midekso, Dida (PhD)
Among the sources of information used in the day-to-day activities of human beings, the Web is a repository of a huge amount of information, and this information may be presented in different languages. Retrieving information from the Web requires search engines. There are general-purpose search engines such as Google, Yahoo, and MSN, which are mainly designed for the English language; their shortcomings are exposed when they are applied to non-English languages such as Afaan Oromo, as they lack the specific characteristics of such languages. This research work presents the design and a prototype of a search engine for Afaan Oromo texts. The search engine mainly consists of three components optimized for Afaan Oromo: a crawler, an indexer, and a query engine. The crawler downloads documents, which are then filtered for Afaan Oromo by the categorizer subcomponent of the crawler. Next, documents identified as Afaan Oromo are preprocessed and stored in an index for later retrieval. Finally, queries supplied through an interface to the query engine component are preprocessed and checked for matches in the index, and matched documents are displayed through the interface in ranked order. Performance evaluation of the search engine was conducted using a selected set of documents and queries: according to the precision-recall measures employed, 76% precision on the top 10 results and an average precision of 93% were obtained. Experiments on some specific features of the language against the design requirements were also made. Keywords: Information Retrieval, Search Engine, Categorizer, Afaan Oromo
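As a bare-bones illustration of the indexer and query-engine split described above, the sketch below builds an in-memory inverted index and answers conjunctive queries; crawling, Afaan Oromo-specific preprocessing, and ranking are omitted, and the sample documents are toy stand-ins.

```python
# Bare-bones index-and-search loop mirroring the indexer / query-engine
# split (no crawling; documents are given in memory).
from collections import defaultdict

def build_index(docs):
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    terms = query.lower().split()
    hits = [index.get(t, set()) for t in terms]
    return set.intersection(*hits) if hits else set()

docs = {1: "barnoota afaan oromoo", 2: "oduu guyyaa har'aa", 3: "afaan oromoo oduu"}
index = build_index(docs)
print(search(index, "afaan oromoo"))   # {1, 3}
```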
Item Afaan Oromo Text Summarization Using Word Embedding (Addis Ababa University, 11/4/2020) Tashoma, Lamesa; Assabie, Yaregal (PhD)
Nowadays we are overloaded with information as technology grows, which makes it a problem to identify which information is worth reading. Automatic text summarization has emerged to solve this problem: a computer program summarizes text by removing redundant information from the input and produces a shorter, non-redundant output text. This study deals with the development of a generic automatic summarizer for Afaan Oromo text using word embedding. Language-specific lexicons such as stop words and a stemmer are used to develop the summarizer. A graph-based PageRank algorithm is used to select summary-worthy sentences from the document, and cosine similarity is used to measure the similarity between sentences. The data used in this work was collected from both secondary and primary sources: the Afaan Oromo stop word list, suffixes, and other language-specific lexicons were gathered from previous work on Afaan Oromo, and to develop the Word2Vec model we gathered Afaan Oromo texts from sources such as the Internet, organizations, and individuals. For validation and testing, 22 different newspaper topics were collected; 13 of them were used for validation, while the remaining 9 were employed for testing. The system was evaluated on three experimental scenarios, both subjectively and objectively. The subjective evaluation focuses on the structure of the summary, such as informativeness, coherence, referential clarity, non-redundancy, and grammar; the objective evaluation uses metrics such as precision, recall, and F-measure. The result of the subjective evaluation is 83.33% informativeness, 78.8% referential integrity and grammar, and 76.66% structure and coherence. This work achieved 0.527 precision, 0.422 recall, and 0.468 F-measure on the data we gathered, while the overall performance of the summarizer reached 0.648 precision and 0.626 recall, a 0.058 F-measure improvement over previous works, when compared on the same data used in their work.
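A compact sketch of the graph-based extractive approach described above: sentences become graph nodes, cosine similarities become edge weights, and PageRank ranks the sentences. For brevity it uses TF-IDF vectors via scikit-learn and networkx; the thesis itself uses Word2Vec embeddings and Afaan Oromo-specific lexicons, and the sample sentences are toy stand-ins.

```python
# Graph-based extractive summarization: build a sentence-similarity
# graph and rank sentences with PageRank. TF-IDF stands in here for the
# thesis' Word2Vec sentence representations.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def summarize(sentences, top_n=2):
    vectors = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(vectors)           # pairwise sentence similarities
    graph = nx.from_numpy_array(sim)           # weighted similarity graph
    scores = nx.pagerank(graph)
    ranked = sorted(range(len(sentences)), key=scores.get, reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]  # keep original order

sents = ["Oduu guyyaa har'aa.", "Barnoonni bu'uura guddina ti.", "Oduu biyya keessaa."]
print(summarize(sents, top_n=1))
```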
Item Afaan Oromo Word Sense Disambiguation Using Wordnet (Addis Ababa University, 11/2/2017) Tesfaye, Birhane; Assabie, Yaregal (PhD)
All human languages have words that can mean different things in different contexts. In the natural language processing community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) of a given word in a text or discourse, where this meaning is distinguishable from the other senses potentially attributable to that word. One of the several approaches proposed in the past is Michael Lesk's 1986 algorithm, which is based on two assumptions: first, when two words are used in close proximity in a sentence, they must be talking about a related topic; and second, if one sense of each of the two words can be used to talk about the same topic, then their dictionary definitions must share some common words. For example, when the words "pine" and "cone" occur together, they are talking about evergreen trees, and indeed one meaning of each of these two words has the words "evergreen" and "tree" in its definition. Thus we can disambiguate neighboring words in a sentence by comparing their definitions and picking those senses whose definitions share the most words. The main drawback of this algorithm is that dictionary definitions are often very short and simply do not contain enough words for the algorithm to work well. Banerjee (2002) dealt with this problem by adapting the Lesk algorithm to the semantically organized lexical database WordNet, which, besides storing words and their meanings like a normal dictionary, also connects related words together. To this end, we developed a WSD system that identifies the sense of an ambiguous Afaan Oromo word using information from the Afaan Oromo WordNet. The system identifies the sense by checking the different types of sense relationships between words that help identify the sense of a word. A conventional WordNet organizes nouns, verbs, adjectives, and adverbs into sets of synonyms called synsets, each expressing a different concept. In contrast to the structure of a conventional WordNet, we used a clue-word-based model of WordNet: the words related to each sense of a polysemous word are referred to as its clue words, and these clue words are used to disambiguate the correct meaning of the polysemous word in a given context using knowledge-based WSD algorithms. A clue word can be a noun, verb, adjective, or adverb, which overcomes a limitation of the English WordNet, which has a limited number of cross-POS relations (relations that span more than one part of speech). The performance of the system was tested using 50 randomly selected polysemous Afaan Oromo words, and the WSD based on the clue-word WordNet achieved 92%.
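A simplified gloss/clue-word overlap in the Lesk style discussed above: pick the sense whose clue words share the most words with the context. The polysemous entry ("bank" as a placeholder) and its clue words are invented toy data, not entries from the actual Afaan Oromo WordNet.

```python
# Simplified Lesk-style disambiguation using clue-word overlap.
# The sense inventory below is a toy placeholder.
def lesk(word, context_words, sense_clues):
    best, best_overlap = None, -1
    for sense, clues in sense_clues[word].items():
        overlap = len(set(context_words) & set(clues))   # shared words
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best

sense_clues = {
    "bank": {                                            # placeholder polysemous word
        "finance": ["maallaqa", "baankii", "herrega"],   # money, bank, account
        "river":   ["laga", "bishaan", "qarqara"],       # river, water, edge
    }
}
print(lesk("bank", ["laga", "qarqara", "deeme"], sense_clues))  # "river"
```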
Item Afaan Oromo Wordnet Construction Using Sense Embedding (Addis Ababa University, 2021-10-01) Henok Desalegn; Yaregal Assabie
One of the primary goals of natural language processing is to create a high-quality WordNet that can be used in many domains. The main area in which WordNet construction methods typically fall short is the representation of polysemy. A word is polysemous when it has multiple meanings (e.g., the word bank used in a financial context versus an ecological context). Current methods fail to handle this at all when they build WordNets automatically from word embedding models that train just one embedding for all meanings of a word. Words have different meanings (i.e., senses) depending on the context, and disambiguating the correct sense is an important and challenging task for natural language processing. Contextualized models represent the meanings of words in context, which enables them to capture some of the vast array of linguistic phenomena that occur above the word level. In this study, we propose automatic Afaan Oromo WordNet construction using sense embedding. The proposed model includes several tasks: we preprocess an Afaan Oromo text document and train on it with the sense-embedding spaCy library (sense2vec) and Facebook's fastText library to generate a sense embedding model. The embedding provides contextually similar words for every word in the training set, and the trained sense vector model captures different patterns. After training, we take the trained model as input and discover the patterns used to extract WordNet relations, using a POS-tagged Afaan Oromo corpus to model the WordNet. The resulting WordNet built with fastText and sense2vec showed that words that are similar or analogous to each other appear close together in the vector space; related Afaan Oromo words were found near each other, with morphological relatedness taking the highest share. The sense embedding also learned the vector representation "moti (king) - dhira (man) + dubara (woman)", resulting in a vector close to the word "gifti (queen)". Out-of-vocabulary words were also handled. We obtained a Spearman's correlation score of Rs = 0.74 for each relation type, and multi-class text classification on the model attained a 92.6% F1-score, with results fluctuating depending on the parameters.
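A sketch of the embedding step using gensim's fastText implementation (the thesis uses Facebook fastText and sense2vec). The two-sentence corpus is a toy stand-in, so the analogy query is only meaningful on real data.

```python
# Train a tiny fastText model and query the king - man + woman analogy
# mentioned in the abstract. The corpus here is a toy stand-in.
from gensim.models import FastText

sentences = [["moti", "dhira", "biyya", "bulcha"],
             ["gifti", "dubara", "biyya", "bulchiti"]]
model = FastText(sentences, vector_size=50, window=3, min_count=1, sg=1, negative=5)

# On a real corpus one would expect: moti - dhira + dubara ≈ gifti
print(model.wv.most_similar(positive=["moti", "dubara"], negative=["dhira"], topn=3))
```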
Item Aflatoxins, Heavy Metals, and Safety Issues in Dairy Feeds, Milk and Water in Some Selected Areas of Ethiopia (Addis Ababa University, 2/3/2018) Mesfin, Rehrahie; Assefa, Fassil (PhD)
The production of wholesome milk is controlled by the quality and safety of the feed supply. Aflatoxins and heavy metals are among the major factors that affect the quality of feeds and water sources; they are transferred to, and eventually bio-accumulate in, livestock and humans via meat, milk, and milk products. Monitoring dairy production inputs with technical tools, and gathering appropriate information on the perception, experience, and indigenous knowledge of stakeholders along the feed and milk chains, are relevant to assessing how the processing, storage, and distribution of feeds and water sources ensure the safety of milk and milk products. The objective of this study was to determine aflatoxin B1 (AFB1) in feeds, aflatoxin M1 (AFM1) in milk, and the heavy metals cadmium (Cd), lead (Pb), arsenic (As), and chromium (Cr) in feed, water, and milk samples from West Shoa, East Shoa, and Hawassa, Ethiopia. A total of 205 samples, consisting of 115 concentrate feeds, 45 roughage feeds, and 45 milk samples, were collected for the detection and quantification of aflatoxin using Enzyme-Linked Immunosorbent Assay (ELISA). A total of 90 samples (30 feed, 30 water, and 30 milk) were collected for the determination of heavy metals using a Graphite Furnace Atomic Absorption Spectrophotometer (GFAAS). Stakeholders' perception and experience in handling feeds and water sources were evaluated by interviewing peri-urban farmers, feed processors, feed retailers, and urban dairy producers using semi-structured questionnaires and field observations. The results showed that half of the feed samples (81) were free from aflatoxin, and the remaining 79 samples were within the EU standard of 5 μg/kg and the USA standard of 20 μg/kg. The pattern of aflatoxin contamination showed that concentrate feeds were more contaminated (7.67 ± 0.80 μg/kg) than roughage feeds (0.41 ± 0.14 μg/kg); hay (0.72 ± 0.25 μg/kg) was more contaminated than straw (0.05 ± 0.05 μg/kg); and oilseed-cake-based concentrate feeds were more contaminated (13.09 ± 1.12 μg/kg) than concentrate feeds without oilseed cake (2.78 ± 0.66 μg/kg). The average AFB1 of feeds in Bishoftu (9.76 μg/kg) was significantly higher (p<0.05) than at the sampling sites in Holetta (6.33 μg/kg) and Hawassa (1.19 μg/kg). The AFB1 of feeds handled by dairy producers (9.35 ± 1.04 μg/kg) was significantly higher (p<0.05) than of those handled by feed retailers (6.91 ± 1.09 μg/kg) and feed manufacturers (7.50 ± 1.43 μg/kg). The AFM1 of milk ranged from 0 to 0.146 μg/L with an average of 0.054 μg/L; 29% of the milk samples did not contain aflatoxin, 58% had AFM1 levels within the EU permitted limit of 0.05 μg/L, and 42% of the samples were below the USA recommended limit of 0.5 μg/L. The AFB1 and AFM1 levels of samples collected from the study locations were in the order Hawassa < Holetta < Bishoftu. With regard to heavy metals, the data showed that their concentrations in teff straw in Holetta and Bishoftu were 1543.54 ± 318.70 μg/kg and 1486.92 ± 279.73 μg/kg, respectively; the overall concentration of heavy metals in teff straw was in the order Cr > As > Pb > Cd. The water samples taken from the Mojo areas (East Shoa) showed the highest heavy metal levels (43.64-86.89 μg/L), with a very high concentration of Cr (300.56 μg/L). In general, the average concentration of heavy metals in livestock water in East Shoa (Akaki to Mojo) (28.08 ± 7.02 μg/L) was significantly higher (p<0.05) than in water collected from West Shoa (Holetta/Welmera) (1.96 ± 0.28 μg/L), and the levels were in the order Cr > As > Pb > Cd. With the exception of the pH of water from Lake Mojo (10.37) and the Gelan dye factory (8.9), the water samples collected from the Bishoftu and Holetta areas were within the legal pH limit of 6.5-8.5 for livestock drinking. The overall concentration of heavy metals in cow milk samples was in the order Cr > Cd > Pb > As. The concentrations of Cd and As in milk were within the permissible limits; however, 60% and 73% of the milk samples from Holetta and Bishoftu, respectively, exceeded the permissible limit for Pb, and all milk samples in both study locations exceeded it for Cr, indicating poor milk quality due to environmental pollution. The stakeholder interviews showed that 91% of the farmers sometimes encountered mold formation in roughage feeds due to a lack of good harvesting and stacking practices. Most farmers admitted to feeding lightly moldy feeds to their livestock after diluting them with uncontaminated ones; most respondents (67%) used extremely moldy feeds for firewood, while 33% of the interviewees dumped them in landfills. Farmers recognized two causes of water contamination associated with health and production problems in livestock: all farmers from East Shoa (100%) were aware of the effect of industrial effluent as the most important hazard for dairy production, whereas 66% of the farmers from East Shoa and 34% of the respondents from West Shoa identified leech problems in water bodies in the dry season. Farmers also had indigenous knowledge for tackling the leech problem: 69% of the farmers used a bucket to selectively scoop water from the water body to keep leeches from being consumed by animals, whereas 50% of the respondents treated animals with chopped tobacco and onion. The majority of feed processors (64%), feed retailers (82%), and dairy producers (56%) reported that they did not use pallets for placing their concentrate feeds, implying a probability of mold contamination during prolonged storage. Among the respondents, 88% of feed processors, all feed retailers, and most dairy producers (96%) recognized wheat bran as the feed ingredient most susceptible to mold. The majority of feed processors (67%), feed retailers (73%), and dairy producers (58%) stored their concentrate feeds for a short period of about one month, and the majority of feed processors (74%), feed retailers (87%), and most dairy producers (91%) did not encounter mold formation in their concentrate feed because of the small amounts of feed they held and the short storage time. To overcome mold formation in concentrate feeds, 64% of the feed processors left enough space between the stored feed and the wall. Further research needs to be undertaken along the feed and milk production and distribution chains using other techniques such as HPLC, GC, and multi-mycotoxin assay using LC-MS-MS, taking into account the effects on aflatoxin of different storage conditions such as the use of pallets, ventilation, and duration of feed storage. The effect of mold growth in feeds on nutrient composition needs to be investigated, and there is also a need for further investigation of heavy metals in soils and fodder feed samples grown in similar study locations.

Item A Framework for Multi-Agent Interbank Payment and Settlement (Addis Ababa University, 2009-11) Addis, Yilebes; Libsie, Mulugeta (PhD)
Interbank payment and settlement systems automate the transfer of funds from one bank to another on the order of a customer. The communication between banks involved in interbank payment and settlement is automated, and a few agent-based payment systems have tried to simulate the trend of incoming and outgoing payments so as to manage liquidity requirements. However, the interbank payment and settlement systems developed so far live with critical problems such as gridlock, intraday liquidity management, and interfacing with autonomous legacy systems. This thesis therefore proposes a framework for a Multi-Agent Interbank Payment and Settlement (MAIPS) system, which improves interbank payment and settlement and extends its coverage. The proposed framework interfaces autonomous banking systems with the interbank payment and settlement system; in addition, MAIPS provides a solution for the intraday liquidity management and gridlock problems through automated interbank lending. To this end, the thesis develops a Multi-Attribute Utility Theory (MAUT) based interbank lending model. In order to secure liquidity through interbank lending, the system floats a bid to borrow liquidity, evaluates bidders' proposals, selects the best lender, and reaches an agreement with the winner. This interbank lending model is simulated through a prototype called the Multi-Agent Interbank Lending System (MAILS), developed using the Java Agent DEvelopment (JADE) framework and the FIPA English Auction Interaction Protocol. Finally, the prototype was tested with relevant information so as to clearly visualize the interaction of the participating banks and check the correctness of the prototype. The result of this thesis will bring a breakthrough in improving interbank payment and settlement systems; it will also pave the way for multidimensional complex auctions to use decision-aid techniques. Keywords: Interbank Payment and Settlement, Cheque Clearance, Multi-Agent System, Gridlock, Intraday Liquidity Management, Collateralized Credit, Interbank Lending
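A toy rendering of multi-attribute utility scoring for lending bids, in the spirit of the MAUT-based model described above; the attributes, weights, normalization constants, and bids are all invented for illustration.

```python
# Score lending bids with a weighted additive utility over normalized
# attributes and pick the best lender. All numbers are illustrative.
def utility(bid, weights):
    # Normalize each attribute to roughly [0, 1]; lower rate and faster
    # delivery are better, larger amount is better (capped at 1e6).
    return (weights["rate"] * (1 - bid["rate"] / 0.20)        # 20% rate cap assumed
            + weights["amount"] * min(bid["amount"] / 1e6, 1.0)
            + weights["speed"] * (1 - bid["hours"] / 24))

weights = {"rate": 0.5, "amount": 0.3, "speed": 0.2}
bids = {
    "BankA": {"rate": 0.08, "amount": 9e5, "hours": 2},
    "BankB": {"rate": 0.06, "amount": 5e5, "hours": 12},
}
winner = max(bids, key=lambda b: utility(bids[b], weights))
print(winner, round(utility(bids[winner], weights), 3))   # BankA 0.753
```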
Item Amharic Document Categorization Using Itemsets Method (Addis Ababa University, 2013-02) Hailu, Abraham; Assabie, Yaregal (PhD)
Document categorization, or document classification, is the process of assigning a document to one or more classes or categories. Much research has been conducted in the area of Amharic document categorization, mainly examining different categorization techniques and measuring their performance; the itemsets method, however, had not yet been examined. This study extends the Apriori algorithm, which is traditionally used for knowledge mining in the form of association rules, to document categorization. The research focuses on the basic principles of applying the itemsets method to categorize Amharic documents. In addition, all the tools required to carry out automatic Amharic document categorization using the itemsets method were implemented, and the algorithm was examined. Experimental results show that the itemsets method is an efficient method for categorizing Amharic documents; its effectiveness and accuracy are also evaluated and reported. Finally, the factors affecting the performance of the proposed system and the importance of preprocessing the training dataset in finding useful information are discussed.
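A minimal frequent-itemset miner of the Apriori flavor discussed above, treating each document's term set as a transaction. It omits Apriori's candidate-pruning step and the category-scoring stage; thresholds and documents are placeholders.

```python
# Brute-force frequent-itemset mining over documents-as-transactions
# (no Apriori candidate pruning, for brevity).
from itertools import combinations

def frequent_itemsets(transactions, min_support=2, max_size=2):
    frequent = {}
    for size in range(1, max_size + 1):
        counts = {}
        for t in transactions:
            for combo in combinations(sorted(set(t)), size):
                counts[combo] = counts.get(combo, 0) + 1
        frequent.update({c: n for c, n in counts.items() if n >= min_support})
    return frequent

docs = [["sport", "ball", "team"], ["sport", "team"], ["music", "band"]]
print(frequent_itemsets(docs))
# {('sport',): 2, ('team',): 2, ('sport', 'team'): 2}
```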
Item Amharic Document Image Retrieval Using Linguistic Features (Addis Ababa University, 10/21/2011) Yeshambel, Tilahun; Assabie, Yaregal (PhD)
The advent of modern computers plays an important role in processing and managing electronic information found in the form of text, images, audio, video, and so on. With the rapid development of computer technology, digital documents have become popular options for storage, access, and transmission. With today's fast-evolving digital libraries, an increasing number of historical documents, newspapers, books, etc. are being digitized into electronic format for easy archival and dissemination. Optical Character Recognition (OCR) and Document Image Retrieval (DIR), as parts of the information retrieval paradigm, are the two means of accessing document images that have received attention in the IR community. Amharic has been the official language of Ethiopia since the 19th century, and as a result many religious and government documents are written in Amharic; huge collections of machine-printed Amharic documents are found in almost every institution of the country, and accessing those documents has become more and more difficult. To address this problem, only a few research works have been attempted recently using OCR and DIR methods. The aim of this research is to develop a system model that enables users to find relevant Amharic document images from a corpus of digitized documents in an easy, accurate, fast, and efficient manner. This work presents the architecture of an Amharic DIR system that allows users to search scanned Amharic documents without the need for OCR. The proposed model was designed after a detailed analysis of the specific nature of the Amharic language: Amharic belongs to the Semitic languages and is morphologically rich, with surface word formation involving prefixation, suffixation, infixation, circumfixation, and reduplication. In this work, a model for searching Amharic document images is proposed, and word image features are systematically extracted for automatically indexing, retrieving, and ranking document images stored in a database. A new approach that applies an NLP tool, an Amharic word generator, is incorporated into the proposed system model: by providing a given Amharic root word to this Amharic-specific surface word synthesizer, a number of possible surface words are produced, and the descriptions of these surface word images are used for indexing and searching. The system passes through various phases, such as noise removal, binarization, text line and word boundary identification, word segmentation and resizing (to normalize different font types, sizes, and styles), feature extraction, and finally matching the query word image against document word images. The proposed method was tested on real-world Amharic documents from different sources, such as magazines, textbooks, and newspapers, with various font styles, types, and sizes. Precision-recall evaluation was conducted for sample queries on sample document images, and promising results were achieved.
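A sketch of the matching step described above: binarize each word image, resize it to a common shape, and compare pixel-wise. Real systems, including the one described, extract richer word-image features; the shape, threshold, and Pillow-based I/O here are illustrative choices.

```python
# Word-image signatures: grayscale, resize to a common shape, binarize,
# then rank by the fraction of matching pixels.
import numpy as np
from PIL import Image

def word_signature(path, shape=(16, 48)):
    img = Image.open(path).convert("L").resize(shape[::-1])  # PIL takes (w, h)
    return (np.asarray(img) < 128).astype(np.uint8)          # dark pixels -> 1

def similarity(sig_a, sig_b):
    return (sig_a == sig_b).mean()                           # fraction of matching pixels

# Given a dict of {doc_word_id: signature} and a query signature:
# ranked = sorted(index.items(), key=lambda kv: similarity(query_sig, kv[1]), reverse=True)
```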
Item Amharic Information Retrieval Using Semantic Vocabulary (Addis Ababa University, 10/2/2019) Getnet, Berihun; Assabie, Yaregal (PhD)
The increase in large-scale data available from different sources and users' need for access to information make information retrieval an increasingly pressing issue these days. Information retrieval means seeking the documents relevant to a user's query, but the way queries are posed and relevant results returned should be improved for better user satisfaction. This can be enhanced by expanding the original queries using semantic lexical resources constructed either manually or automatically from a text corpus; manual construction, however, is tedious and time-consuming when the dataset is huge, and the way semantic resources are built also affects retrieval performance. In formal semantics, meaning is built in the symbolic tradition and centered on the inferential properties of languages. It is also possible to construct semantic resources automatically from the distribution of words in unstructured data, applying the notion from unsupervised learning that semantics can be built automatically in a high-dimensional vector space; this yields contextual similarity via the angular orientation of word vectors. Attempts have been made to enhance information retrieval by expanding queries from semantic resources for non-Ethiopian languages. In this study, we propose Amharic information retrieval using a semantic vocabulary. It is realized through components including text preprocessing, word-space modeling, semantic word sense clustering, document indexing, and searching. After the Amharic documents are preprocessed, the words are vectorized in a multidimensional space using Word2vec, based on the notion that the words surrounding a word can be contextually similar to it. From the words' angular orientation, the semantic vocabulary is constructed using cosine distance. The preprocessed Amharic documents are indexed for later retrieval; the user then provides queries, and the system expands the original query from the semantic vocabulary. The reformulated queries are searched against the indexed data, returning more relevant documents for the user. A prototype of the system was developed, and we tested its performance using Amharic documents collected from Ethiopian public media. The semantic vocabulary based on word analogy prediction using the cosine metric is promising; compared against a semantic thesaurus constructed with latent semantic analysis, it increases accuracy by 17.2%. Information retrieval using the semantic vocabulary improves recall by 24.3% with ranked retrieval, and a 10.89% recall improvement was obtained with an unranked set of retrieval.
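A sketch of the query-expansion step described above, using gensim's Word2Vec: each query term is augmented with its nearest neighbors in the embedding space. The two-sentence transliterated corpus is a toy stand-in for the Ethiopian public-media text the thesis uses.

```python
# Expand a query with cosine-similar neighbors from a word2vec-style
# semantic vocabulary (toy transliterated corpus).
from gensim.models import Word2Vec

corpus = [["tmhrt", "bet"], ["tmhrt", "dereja"]]   # toy stand-in sentences
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, negative=5)

def expand_query(terms, topn=2):
    expanded = list(terms)
    for t in terms:
        if t in model.wv:
            expanded += [w for w, _ in model.wv.most_similar(t, topn=topn)]
    return expanded

print(expand_query(["tmhrt"]))
```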
Item Amharic Open Information Extraction (Addis Ababa University, 3/3/2020) Girma, Seble; Assabie, Yaregal (PhD)
Open Information Extraction is the process of discovering domain-independent relations by providing ways to extract unrestricted relational information from natural language text. It has recently received increased attention and has been applied extensively to various downstream applications, such as text summarization, question answering, and information retrieval. Although many Open Information Extraction systems have been developed for various natural language texts, no research had yet been conducted on the development of Amharic Open Information Extraction (AOIE). As the literature shows, rule-based approaches operating on deep-parsed sentences yield the most promising results for Open Information Extraction systems. However, to the best of our knowledge, there is no fully implemented deep syntactic parser available for the Amharic language. Therefore, in this thesis, we propose the development of a rule-based AOIE system that uses shallow-parsed sentences. The proposed system has six components: preprocessing, morphological analysis, phrasal chunking, sentence simplification, relation extraction, and post-processing. In preprocessing, each word of the input text is labeled with an appropriate POS tag, and well-formed, informative sentences are filtered out for further processing based on the POS tags of words. The morphological analysis component produces morphological information about each word of the input sentences, and the phrasal chunking component divides each sentence into non-overlapping phrases based on the POS and morphological tags of words. The sentence simplification component segments each sentence into a number of self-contained simple sentences that are easier to process; the relation extraction component then extracts relation instances from the simplified sentences, and finally the post-processing component prints the extracted relations in N-ary format. The proposed method and algorithms were implemented in prototype software and evaluated on a dataset from different domains; in the evaluation, the system achieved an overall precision of 0.88.

Item Amharic Question Answering for Definitional, Biographical and Description Questions (Addis Ababa University, 2013-11) Abedissa, Tilahun; Libsie, Mulugeta (PhD)
There are enormous amounts of Amharic text data on the World Wide Web. Since Question Answering (QA) can go beyond the retrieval of relevant documents, it is an option for efficient access to such text data. The task of QA is to find an accurate and precise answer to a natural language question in a source text. The existing Amharic QA systems handle fact-based questions that usually take named entities as answers. In this thesis, we focus on a different type of Amharic QA, Amharic non-factoid QA (NFQA), to deal with more complex information needs. The goal of this study is to propose approaches that tackle important problems in Amharic non-factoid QA, specifically biography, definition, and description questions. The proposed QA system comprises document preprocessing, question analysis, document analysis, and answer extraction components. Rule-based and machine learning techniques are used for question classification. The document analysis component retrieves relevant documents and filters them using filtering patterns for definition and description questions; for biography questions, a retrieved document is retained only if it contains all the terms of the target in the same order as in the question. The answer extraction component works type by type. The definition-description answer extractor extracts sentences using manually crafted answer extraction patterns; the extracted sentences are scored and ranked, an answer selection algorithm selects the top five non-redundant sentences from the candidate answer set, and the sentences are then ordered to keep them coherent. The biography answer extractor, on the other hand, summarizes the filtered documents by merging them, and the summary is displayed as the answer after it is validated. We evaluated the QA system in a modular fashion, using the n-fold cross-validation technique for the two question classification techniques: the SVM-based classifier classifies about 83.3% of the test questions correctly, and the rule-based classifier about 98.3%. The document retrieval component was tested on two datasets, analyzed by a stemmer and by a morphological analyzer; the F-score on the stemmed documents is 0.729 and on the other dataset 0.764. Moreover, the average F-score of the answer extraction component is 0.592. Keywords: Amharic Definitional, Biographical and Description Question Answering, Rule Based Question Classification, SVM Based Question Classification, Document Analysis, Answer Extraction, Answer Selection
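A toy version of the biography-question document filter described above: keep a document only if all target terms occur in it in the same order as in the question. The tokenized documents and the example name are invented for illustration.

```python
# Ordered-subsequence check: retain a document only if the target terms
# appear in it in the question's order.
def contains_in_order(doc_tokens, target_terms):
    it = iter(doc_tokens)
    return all(term in it for term in target_terms)  # 'in' advances the iterator

docs = {"d1": "dr abebe bikila was a runner".split(),
        "d2": "bikila abebe biography".split()}
target = ["abebe", "bikila"]
print([d for d, toks in docs.items() if contains_in_order(toks, target)])  # ['d1']
```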
Item Amharic Question Classification System Using Deep Learning Approach (Addis Ababa University, 4/14/2021) Habtamu, Saron; Assabie, Yaregal (PhD)
Questions are used in different applications such as Question Answering (QA), Dialog Systems (DS), and Information Retrieval (IR). However, some questions might be too complex to analyze and process, so systems are expected to have good feature extraction and analysis mechanisms to understand questions linguistically. The retrieval of wrong answers, inaccurate IR, and the crowding of the search space with irrelevant candidate answers are some of the problems caused by the inability to properly process and analyze questions. Question Classification (QC) aims to solve this by extracting the relevant features from questions and assigning each question to its correct class category. Even though QC has been studied for various languages, it has hardly been studied for the Amharic language. This research studies Amharic QC, focusing on designing a hierarchical question taxonomy, preparing an Amharic question dataset by labeling sample questions with their respective classes, and implementing an Amharic QC (AQC) model using a Convolutional Neural Network (CNN), a deep learning approach. The AQC uses a multilabel question taxonomy that integrates coarse-grained and fine-grained categories; this multilabel scheme supports more accurate answer retrieval than a flat taxonomy. We constructed the taxonomy by analyzing our Amharic question dataset and adopting the standard taxonomies studied previously. We prepared the Amharic questions in three forms: surface, stemmed, and lemmatized. We trained and tested these datasets using a word vectorizer trained on surface words, noting that most interrogative words remain similar even when stemmed or lemmatized. As a result, we achieved 97% training and 90% validation accuracy for the surface questions, and 40% for the stemmed questions. However, the word2vec model could not represent the lemmatized questions appropriately, so no results were obtained for them during training. We also tried extracting features from the questions using different filters separately, which gave an accuracy of 86% while requiring an increased number of training epochs.
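A minimal Keras CNN text classifier of the kind the abstract describes; the vocabulary size, sequence length, six-class assumption, filter width, and random data are stand-ins for the thesis' Amharic question dataset and configuration.

```python
# Minimal CNN question classifier: embed token ids, convolve n-gram
# filters over the embeddings, max-pool, and classify. All numbers are
# illustrative placeholders.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab, seq_len, n_classes = 2000, 20, 6
x = np.random.randint(1, vocab, size=(128, seq_len))   # token-id sequences
y = np.random.randint(0, n_classes, size=128)

model = keras.Sequential([
    layers.Embedding(vocab, 64),
    layers.Conv1D(128, 5, activation="relu"),   # 5-gram filters over embeddings
    layers.GlobalMaxPooling1D(),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=2, verbose=0)
```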