Computer Science
Recent Submissions
- Item Distributed SDN Controller Architecture for LAN Management (Addis Ababa University, 2023-12) Zenebe Kassie; Minale Ashagrie. The ever-increasing demand for information and communication technology services brings an increase in the size and heterogeneity of networks. As a network grows in size and heterogeneity, its complexity and its susceptibility to faults and errors also increase. In addition, the underlying network may fail due to link and/or core element failure, including catastrophic incidents that devastate the network infrastructure within the administrative perimeter. Network management therefore becomes challenging and places a burden on network administrators. Hence, a proactive network management system is needed to alleviate the burden and address the challenge. This system is primarily used by administrators and operators. Additionally, ensuring sustainability, scalability and quality of service requires a system that considers all contributing factors. Thus, we have chosen software-defined networking, which offers two general choices of controller architecture: centralized and distributed. The distributed architecture in turn has three variants: flat, hierarchical and hybrid, among which we have chosen the hybrid hierarchical. The hybrid hierarchical architecture works with switches in both OpenFlow and legacy non-OpenFlow modes, as the controller may delegate some flow forwarding to the underlying devices. As a solution to this challenge, we have proposed our own architecture. We have also carefully labeled and assigned a role to every element in the architecture to optimize the functionality of the underlying infrastructure. Our architecture appears to contain loops because of the redundant and mesh links we have used, but it has been tested and verified for viability with various node counts and topologies. 
The tests also cover core performance indicators such as TCP throughput and UDP latency (jitter) variation. The test results are promising. However, challenges remain regarding automatic recovery from system failure and the avoidance of false controller-failure alert messages, which we plan to address in future work. In general, our effort focuses on a redundant, mesh-linked controller architecture for a LAN where network service is mandatory and its interruption is critical.
- Item Hybrid Threat Model of STRIDE and Attack tree for Security Analysis of Software Defined Network Controllers (Addis Ababa University, 2023-10-31) Banchiaymolu Adera; Mesfin Kifle. A Software Defined Network (SDN) is a network that employs software-based controllers to interact with physical infrastructure and manage network traffic. It offers numerous advantages over conventional networks, including enhanced programmability, scalability and visibility. These benefits make SDN a crucial technology for addressing the evolving needs of modern networks. However, along with these benefits, SDN also introduces new security challenges due to its architectural changes. One of the main security concerns in SDN is controller security. Controllers serve as the core of the SDN architecture, responsible for managing and controlling the network centrally. This centrality makes them high-value targets for attackers and potential single points of failure. To ensure the security of SDN, it is essential to assess and mitigate vulnerabilities in SDN controllers. In previous studies, the STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege) threat model was used to analyze the security of SDN controllers. While it provides a systematic way of identifying and categorizing threats, it lacks granular and complete threat coverage and has a moderately high false negative rate. In this research, we addressed these limitations by proposing a hybrid threat model that combines STRIDE with attack trees. An attack tree provides a hierarchical and structured representation of attack scenarios and attack paths, enabling a more detailed analysis of threats. By integrating these two models, we aim to enhance the effectiveness and comprehensiveness of the STRIDE-only model. To evaluate our proposed hybrid model, we applied it to the security analysis of the Ryu and POX controllers. 
As a result, we identified a vulnerability to Denial of Service (DoS) attacks in the POX controller, which was not detected by the STRIDE-only model used in previous studies. To further validate the effectiveness of our model, we conducted experimental tests on the Mininet emulator. We exploited the detected vulnerability to launch DoS attacks on the controllers and measured the impact on two performance metrics, bandwidth and delay. The results indicated that both controllers are susceptible to DoS attacks. However, the POX controller exhibited a more significant degradation in bandwidth, a decrease of around 6.98 Gbps. In contrast, the Ryu controller exhibited a decrease of around 0.74 Gbps. The impact on traffic delay (jitter) was relatively small for both controllers, with values of 0.0016 ms and 0.004 ms for Ryu and POX, respectively. These findings show the enhanced efficacy of our proposed hybrid threat model in assessing the security of SDN controllers.
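The idea of combining STRIDE categories with an attack tree, as described in the abstract above, can be illustrated with a small data structure. This is only a sketch under stated assumptions: the node names, the tree shape, and the `Node` and `attack_paths` helpers are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One attack-tree node; leaves carry a STRIDE category label."""
    name: str
    gate: str = "LEAF"          # "AND", "OR", or "LEAF"
    stride: str = ""            # STRIDE category, only meaningful for leaves
    children: list = field(default_factory=list)

def attack_paths(node):
    """Enumerate the sets of leaf actions that achieve the node's goal."""
    if node.gate == "LEAF":
        return [[node.name]]
    child_paths = [attack_paths(child) for child in node.children]
    if node.gate == "OR":
        # Any single child is sufficient.
        return [path for paths in child_paths for path in paths]
    # AND: one path from every child is required, so combine them.
    combined = [[]]
    for paths in child_paths:
        combined = [done + path for done in combined for path in paths]
    return combined

# Hypothetical tree for a controller-disruption goal (illustrative only).
tree = Node("Disrupt controller", "OR", children=[
    Node("Flood packet-in messages", stride="Denial of Service"),
    Node("Install malicious flow rules", "AND", children=[
        Node("Spoof switch identity", stride="Spoofing"),
        Node("Tamper with flow table", stride="Tampering"),
    ]),
])

paths = attack_paths(tree)
for path in paths:
    print(" + ".join(path))
```

Each enumerated path is one concrete attack scenario, and the STRIDE label on each leaf keeps the link back to the threat category it belongs to.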
- Item Ensemble-Based DDoS Attack Detection Model for Software-Defined Networks by Utilizing Flow-Based Features (Addis Ababa University, 2023-10) Ermias Tsegu; Ayalew Belay. In this modern era, networking has advanced swiftly. The need for businesses to incorporate the benefits of an enormous number of dynamic applications, services, physical objects, machines, etc. is skyrocketing. As a result, the networking infrastructure has become more complex in terms of networking devices and resource utilization, and the traditional networking paradigm has become ineffective and inefficient at handling those requirements. Consequently, Software Defined Networking (SDN) is receiving more attention from researchers and practitioners. In contrast to the traditional networking paradigm, SDN offers efficient resource utilization, simple network management, better performance, network virtualization, and network programmability. However, it comes with some serious network security issues such as network tampering, unauthorized access, flow rule conflicts, poor controller deployment, and Distributed Denial of Service (DDoS) attacks. Among these security issues, DDoS is one of the most devastating attacks on SDN. As a result, several studies have been conducted to detect DDoS attacks on SDN networks using statistical approaches, traditional machine learning (ML) approaches, and state-of-the-art techniques such as deep learning (DL). The traditional ML techniques are less efficient than state-of-the-art approaches such as DL. On the other hand, the state-of-the-art techniques are computationally complex. To overcome this problem, we proposed an ensemble-based DDoS detection model for SDN that uses a flow-based dataset. The experiment is conducted using the InSDN dataset. 
The dataset contains a normal group with normal traffic, a Metasploitable-2 group with attacks that target the Metasploitable-2 server, and an Open vSwitch (OVS) group with attacks that target the OVS machine. Because the dataset contains DoS, DDoS, Web, R2L, Malware, Probe, and U2R attacks, we had to separate out the DDoS attack data to prepare it for our purpose. Furthermore, we split the dataset into 70% for training and 30% for testing. The results show that the adaptive boosting ensemble technique has the highest accuracy, with a value of 100%. In terms of latency, however, the gradient boosting algorithm has the lowest latency, with a value of 60.6 ms. In contrast, the KNN_DT-based stacking algorithm has the highest latency, at 119,431.5 ms.
- Item Teff Scarcity Prediction Model for Ethiopian Context Using Multiple Linear Regression (Addis Ababa University, 2023-07) Taye Mohammed; Ayalew Belay. Scarcity prediction is vital in avoiding economic and political disturbances caused by food resource scarcity. In addition, it supports sound resource management for maintaining and improving livelihoods and the population's quality of life, budgeting, reducing costs, boosting productivity, spotting potential resource conflicts at an early stage for more responsive mitigation, reducing wastage of resources, and contributing to sustainable and reasonable growth. Hence, the purpose of this paper is to use multiple linear regression models to predict the scarcity of Teff in the context of Ethiopia. However, predicting Teff scarcity based on factors such as resource consumption, population growth, and productivity is a significant problem to address. Thus, in this thesis work, we present a system that predicts Teff scarcity. We use a multiple linear regression approach to design the system. The Teff scarcity prediction model consists of three components: the preprocessing component, the train-validate-test component, and the prediction component. The preprocessing component receives the raw dataset and performs data cleansing and data transformation. The train-validate-test component partitions the data into training, validation, and test datasets. The prediction component removes insignificant attributes, then trains, validates, and tests the designed models on the partitioned data, predicts Teff scarcity, and evaluates the model with the result. The experimental results report scarcity prediction performance in terms of mean absolute error, root mean squared error, and R squared values. 
Hence, the experiment produced a mean absolute error of 7%, a root mean squared error of 6.43%, and an R squared of 97.07%. The designed model can predict Teff scarcity with an accuracy of 97.07%.
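The three evaluation metrics named in the abstract above have standard definitions, which the following sketch computes in plain Python. The sample values are made up for illustration and have no connection to the thesis data; the function names are assumptions of this sketch.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up observations and predictions, for illustration only.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]

mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(mae, rmse, r2)
```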
- Item Afaan Oromo Wordnet Construction Using Sense Embedding (Addis Ababa University, 2021-10-01) Henok Desalegn; Yaregal Assabie. One of the primary goals of the field of natural language processing is to create a very high-quality WordNet that can be used in many domains. The main area in which WordNet methods typically fall short is the representation of polysemy. A word is polysemous when it has multiple meanings (e.g., the word bank when used in a financial context versus an ecological context). Current methods that use word embedding models for automatic WordNet construction fail to handle this at all, because they train just one embedding for all meanings of a word. Words have different meanings (i.e., senses) depending on the context. Disambiguating the correct sense is an important and challenging task for natural language processing. Contextualized models represent the meanings of words in context. This enables them to capture some of the vast array of linguistic phenomena that occur above the word level. In this study, we propose automatic Afaan Oromo WordNet construction using sense embedding. The proposed model includes different tasks. We perform text pre-processing on Afaan Oromo text documents and train on them using the spaCy-based sense embedding library sense2vec and Facebook's fastText library to generate a sense embedding model. The embedding result provides contextually similar words for every word in the training set. The trained sense vector model captures different patterns. After training, we take the trained model as input and discover different patterns that are used to extract WordNet relations. We use a POS-tagged Afaan Oromo corpus to model the WordNet. The resulting WordNet built with fastText and sense2vec showed that words that are similar or analogous to each other occur together or close to each other in the vector space. Related Afaan Oromo words were found closer to each other in the vector space. Morphological relatedness accounted for the largest share. 
The sense embedding has also learned vector representations such that “moti (king) - dhira (man) + dubara (woman)” results in a vector close to the word “gifti (queen)”. Out-of-vocabulary words were also handled. We obtained a Spearman's correlation score of Rs = 0.74 for each relation type, and multi-class text classification on the model attained a 92.6% F1-score, with results fluctuating depending on parameters.
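The vector arithmetic mentioned in the abstract above can be illustrated with a toy example. The two-dimensional vectors below are invented so that the arithmetic works out; they are not the embeddings learned in the thesis, and `solve_analogy` and the "distractor" entry are hypothetical names used only for this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (word for word in vectors if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(vectors[word], target))

# Invented toy vectors: first axis ~ "royalty", second axis ~ "gender".
TOY_VECTORS = {
    "moti": [1.0, 1.0],        # king
    "dhira": [0.0, 1.0],       # man
    "dubara": [0.0, -1.0],     # woman
    "gifti": [1.0, -1.0],      # queen
    "distractor": [1.0, 1.5],  # placeholder for an unrelated word
}

nearest = solve_analogy(TOY_VECTORS, "moti", "dhira", "dubara")
print(nearest)
```

With real embeddings the target vector rarely coincides with a stored vector, which is why the nearest neighbour by cosine similarity is used rather than an exact match.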
- Item A Model for Recognition and Detection of the Counterfeit of Ethiopian Banknotes using Transfer Learning (Addis Ababa University, 2024-06) Hailemikael Tesfaw; Ayalew Belay. Paper currency recognition systems play a pivotal role in various sectors, including banking, retail, and automated teller machines (ATMs). This paper presents a novel approach to the design and development of a paper currency recognition system using customized deep learning techniques. The proposed system utilizes image-processing algorithms to extract features from currency images, followed by customized convolutional neural network models for classification and counterfeit detection. The system is trained on a diverse dataset of currency images to ensure robustness and accuracy in recognizing various denominations and currencies. We implemented architectures based on feature learning techniques. To obtain the best accuracy and efficiency, as the training strategy for both we used ReLU and Softmax as activation functions, the Adam optimizer, and sparse categorical cross-entropy as the loss function. The data was collected from the National Bank of Ethiopia, Commercial Bank of Ethiopia, NIB International Bank, and Bank of Abyssinia. In the experiments, the alex_customed-design network recorded an accuracy of 99.82%.
- Item An Integrated Automation Software Testing Framework to Support Behavioural-Driven Development Approach (Addis Ababa University, 2024-10) Demiss Mammo; Ayalew Belay. Many software products fail due to poor quality caused by misunderstandings between producer and customer. To avoid these gaps, behaviour-driven development (BDD) is the recommended approach. It encourages collaboration and focuses on delivering software that meets the end user's or customer's expectations and business objectives. Testing, whether manual or automated, is an integral part of any software development approach. This is also true for BDD, and test automation with an appropriately integrated framework is a good choice for testing that is repeated on multiple versions of the software product. It enables the tester to run a large number of tests in a short period of time and deliver high-quality software products. Still, existing frameworks need enhancement in execution speed and integration to support the BDD approach. This study presents an integrated automation software testing framework that supports and is suitable for a BDD approach, to enhance the test execution speed and the effectiveness of test automation tasks. The key components of the proposed framework are framework configuration, page object repository, behaviour-driven development, version control, continuous integration, automated test scripts, application under test, test execution and reporting. The proposed framework is implemented using Cucumber, a Cucumber step definition generator, Visual Studio Code, JavaScript, Jenkins, Cypress and Cucumber test reporting tools. 
The framework can translate human-readable scenarios, behaviours or features of the software into executable test code through automatically generated step definitions. These are used to perform requirement analysis, retesting, regression testing, acceptance testing, and automatic test execution and reporting by aligning automated tests with business requirements and user expectations. The experimental results show that we reduced the time required to execute a single test case compared with the existing framework. Expert evaluation results also show that 83% of the respondents agreed on the suitability, integration, functionality and content of the framework. The proposed framework offers a valuable solution for organizations seeking to adopt BDD methodologies and improve their software testing automation processes. Future work should add artificial intelligence capabilities, API testing and performance testing, and further improve the framework.
- Item DEACT: Hardware Solution to Rowhammer Attacks (Addis Ababa University, 2024-05) Tesfamichael Gebregziabher; Mohammed Ismail. Dynamic Random-Access Memory (DRAM) technology has advanced significantly, resulting in faster access times and increased storage capacities by shrinking the size of memory cells and tightly packing them on a chip. However, as the scaling of DRAM continues, it presents new challenges and considerations that need to be addressed. Smaller memory cells and the proximity between them have led to circuit disturbance errors, such as the Rowhammer problem. These errors can be exploited by attackers to induce bit flips and gain unauthorized access to systems, posing a significant security threat. In this research, we propose DEACT, a counter-based hardware mitigation approach designed to tackle the Rowhammer problem in DRAM. It moves all frequently accessed rows to a safety sub-array. DEACT aims to prevent further row activations and maintain hot rows, effectively eliminating the vulnerability. Furthermore, our counter implementation requires a smaller chip area than existing solutions. Moreover, we introduce DDRSHARP, a cycle-accurate DRAM simulator that simplifies configuration and evaluation of various DRAM standards. DDRSHARP provides over 1.8x simulation time reduction compared to contemporary simulators. Its performance is optimized by avoiding infeasible iterations, minimizing branch instructions, caching repetitive calculations and other optimizations.
- Item A Hybrid Deep Learning-Based ARP Attack Detection and Classification Method (Addis Ababa University, 2023-12) Yeshareg Muluneh; Solomon Gizaw. The Address Resolution Protocol (ARP) is the most crucial protocol for mapping Internet Protocol (IP) addresses to Media Access Control (MAC) addresses and vice versa in local area network communication. ARP, however, is an unauthenticated protocol that lacks security features and is stateless in nature. Therefore, ARP is vulnerable to many attacks, and it can easily be exploited to gain unauthorized access to sensitive data and to transmit bogus ARP messages that poison the ARP caches of hosts within the local area network. These attacks may result in a loss of the integrity, confidentiality, and availability of an organization's information. Many researchers have worked on detecting ARP attacks using different methods. However, some of these approaches are not time-effective, require considerable human effort and involvement, and have high communication overhead. Other works use machine learning and deep learning methods, which offer better solutions for detecting ARP attacks. However, those approaches have a significant false alarm rate of 13%, a low attack detection rate, and a classification accuracy of 87%. This thesis work aims to solve those problems using a hybrid deep learning-based ARP attack detection and classification method. In this work, we used a Sparse Autoencoder for important feature extraction and dimensionality reduction of the input data and a Convolutional Neural Network for attack detection and classification, to achieve the highest attack detection rate and classification accuracy with a minimized false alarm rate. To evaluate the performance of the proposed model, we used the open-source benchmark NSL-KDD dataset for training and testing. 
The results of the implementation and evaluation are compared against a single Convolutional Neural Network model using different evaluation metrics. The proposed approach achieves the best results, with an attack detection rate of 98.97%, a classification accuracy of 99.26%, and a false alarm rate of 0.74%.
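The three figures reported in the abstract above are conventionally derived from a confusion matrix. The sketch below shows the usual binary (attack versus normal) definitions; the counts are made up, and the thesis may have computed its metrics per class or with different conventions.

```python
def detection_rate(tp, fn):
    """Share of actual attacks that were flagged (also called recall)."""
    return tp / (tp + fn)

def false_alarm_rate(fp, tn):
    """Share of normal traffic that was wrongly flagged as an attack."""
    return fp / (fp + tn)

def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Made-up confusion-matrix counts, for illustration only.
tp, fn, fp, tn = 90, 10, 5, 95

dr = detection_rate(tp, fn)
far = false_alarm_rate(fp, tn)
acc = accuracy(tp, tn, fp, fn)
print(dr, far, acc)
```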
- Item Design of Searchable Encryption with Refreshing Keyword Search Using Pairing-Based Cryptography (Addis Ababa University, 2024-10) Kuma Bekele; Minale Ashagrie. To maintain data security and privacy, Public Key Encryption with Keyword Search (PEKS) schemes have been developed. They offer search capabilities over encrypted data. However, because the Key Generating Center (KGC) knows the target users' private keys, existing PEKS schemes are vulnerable to key-escrow issues. The Certificate-Less Public Key Encryption with Keyword Search (CL-PEKS) scheme was created to address the key-escrow problem in PEKS schemes. However, existing CL-PEKS schemes do not consider refreshing keyword searches. As a result, the target server can launch keyword-guessing attacks and store search trapdoors for system keywords. We proposed a certificate-less Searchable Encryption with a Refreshing Keyword Search (SERKS) scheme that appends date information to the encrypted data and keyword. We designed the system model and algorithms for the proposed certificate-less SERKS using pairing-based cryptography. We also developed a prototype for a web-based e-mail system using the Java Pairing-Based Cryptography (JPBC) library. The security of the proposed scheme is based on the hardness assumption of the Bilinear Diffie-Hellman (BDH) problem. We assessed the proposed scheme's performance with respect to time complexity, in terms of both communication and computational costs. The experimental results demonstrate that the proposed SERKS scheme has a lower computational cost than the two related schemes from earlier work during the key generation and testing phases. Additionally, it has lower communication costs.
- Item Development of Text-to-Speech Synthesis Model for Afaan Oromoo Using Transformer Neural Network (Addis Ababa University, 2025-03) Bayisa Bedasa; Yaregal Assabie (PhD). Text-to-speech (TTS) is a process of converting written text into spoken words. It analyzes the incoming text, processes linguistic data, and produces audio output using algorithms. TTS systems are widely utilized in applications such as virtual assistants, accessibility tools for individuals with visual impairments, and language learning software. Afaan Oromoo is a Cushitic language mostly spoken in Ethiopia and other parts of Africa and serves as an essential means of communication for the Oromo people. For Afaan Oromoo, developing a TTS system is essential for enhancing accessibility and promoting the use of the language in digital environments. This study focuses on a transformer-based neural network model technique for Afaan Oromoo TTS. The model architecture comprises an encoder-decoder structure. The encoder processes input text by converting it into a contextualized representation, while the decoder generates speech waveforms from this representation. We enhanced the model with multi-head attention mechanisms to capture long-range dependencies in the input text, improving prosody. Additionally, we employed a HiFi-GAN-based vocoder for converting the model's output into high-fidelity audio waveforms, enhancing the overall quality of the synthesized speech. Utilizing the transformer architecture, the implementation is carried out in Python. We have produced 17 hours of audio dataset and their corresponding text transcription from the Afaan Oromoo speech corpus by a male speaker. The transformer-based text-to-speech synthesis architecture has outperformed the previously done model based on BLSTM-RNN for Afaan Oromoo language TTS, whose results are 3.77 and 3.76 in terms of intelligibility and naturalness, respectively. 
We used the Mean Opinion Score (MOS) to assess naturalness and intelligibility subjectively. Experimental results indicate that our transformer-based TTS system achieved a MOS score of 4.21 for naturalness and 4.23 for intelligibility, reflecting a commendable performance level. Our model also enables prosody modeling with user input parameters to generate deterministic speech, positioning it as a state-of-the-art solution.
- Item Development of Spell Checker for Guragina Language (Addis Ababa University, 2024-03) Mengistu Gebre; Yaregal Assabie. A spell checker is an essential tool in Natural Language Processing (NLP). Its purpose is to identify and correct spelling errors in text, providing suggestions for correct spellings in a specific language. Spelling errors can be categorized into two types: non-word errors and real-word errors. Non-word errors are misspelled words that have no meaning in the particular language, while real-word errors involve words that exist in the language but are used incorrectly in terms of semantics and syntax. The research focused on non-word error detection as a strategic decision, given the complexity and limited resources available for the Gurage language, also known as Guragina. This language consists of over thirteen varieties and different orthographies, but there is a modern standard. Currently, there is no existing spell checker for any of the Guragina language varieties or for the standard. Addressing non-word errors first provides a solid foundation before tackling the more challenging task of real-word error detection and correction. This phased approach allows researchers to make meaningful progress on this under-resourced language, rather than attempting to solve the entire spell checking problem at once. The intention is to use the non-word spell checker as a starting point, then leverage that knowledge to progressively tackle real-word error handling. This work introduces a non-word spelling error checker for the standard Guragina language. The system detects and corrects errors using Ratcliff algorithms for identification and distance calculation techniques for correction. The prototype of the system was developed using Python. In our evaluation, the system achieved an accuracy of 98.27%, precision of 98.07%, recall of 97.75%, and F1 score of 95.45%. 
Future work includes enhancing rule definitions by incorporating word classes, handling exceptions, adding supplementary spell checker functionalities, and expanding the system to encompass real-word errors.
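A non-word spell checker of the kind described in the abstract above can be sketched with Python's standard `difflib` module, whose `SequenceMatcher` is documented as being based on the Ratcliff and Obershelp "gestalt pattern matching" approach. This is an assumption-laden illustration, not the thesis implementation: the lexicon uses English placeholder words rather than Guragina entries, and the `is_nonword` and `suggest` helpers and the 0.6 cutoff are choices made for this sketch.

```python
from difflib import SequenceMatcher

# Placeholder lexicon (English words stand in for a real Guragina word list).
LEXICON = ["network", "language", "system"]

def is_nonword(word, lexicon=LEXICON):
    """A token is a non-word error if it does not appear in the lexicon."""
    return word.lower() not in lexicon

def suggest(word, lexicon=LEXICON, cutoff=0.6, limit=3):
    """Rank lexicon entries by similarity ratio and keep those above the cutoff."""
    scored = [
        (SequenceMatcher(None, word.lower(), entry).ratio(), entry)
        for entry in lexicon
    ]
    ranked = sorted((pair for pair in scored if pair[0] >= cutoff), reverse=True)
    return [entry for _, entry in ranked[:limit]]

token = "netwrk"
if is_nonword(token):
    print(token, "->", suggest(token))
```

The standard library also offers `difflib.get_close_matches`, which wraps the same ranking in a single call.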
- Item Jamming Attack Detection and Classification Using Exponentially Weighted Moving Average and Random Forest in Wireless Sensor Networks (Addis Ababa University, 2024-10) Alemayehu Ebissa; Mulugeta Libsie. Wireless sensor networks are widely used in environmental monitoring, industrial automation, healthcare, and smart cities for data collection, real-time monitoring, and automated decision-making. These networks, consisting of randomly distributed autonomous nodes, are vulnerable to jamming attacks where malicious entities disrupt network transmissions by emitting interfering signals. Existing detection methods typically rely on either statistical or machine learning-based approaches, each with significant limitations: statistical-based methods are prone to high false alarm rates, while machine learning-based methods impose computational overhead on resource-constrained nodes. To address these limitations, this thesis presents a two-level jamming attack detection and classification method that combines the strengths of both approaches. The method integrates an Exponentially Weighted Moving Average (EWMA) for lightweight detection with a Random Forest classifier for accurate jamming attack classification. The approach begins with feature selection, utilizing key features such as the Received Signal Strength Indicator (RSSI) and Packet Error Rate (PER), which can be easily obtained without adding significant overhead on sensor nodes. The method consists of a training phase and a testing phase. In the training phase, the dataset is processed through the EWMA computation to smooth the time-series data, followed by threshold calculation. The EWMA-smoothed data is then used to train the Random Forest classifier. In the testing phase, the testing dataset also passes through the EWMA computation, and the EWMA-based jamming detection determines if a jamming attack is occurring by comparing against a predefined threshold. 
Once potential jamming is detected, the system transitions into the classification of the three jamming types: constant, periodic, or reactive jamming. Experimental evaluation demonstrates that our method achieves a 99.91% detection rate and 99.26% accuracy in jamming classification. These results show significant improvements over existing methods, particularly in reducing false positives while maintaining high detection accuracy.
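The first, lightweight stage described in the abstract above (EWMA smoothing followed by a threshold comparison) can be sketched as follows. The smoothing factor, the rule "mean plus k standard deviations" for the threshold, and the packet error rate samples are all assumptions made for illustration; the abstract does not state how the thesis derives its threshold, and the Random Forest classification stage is not shown.

```python
import statistics

def ewma(values, alpha=0.3):
    """Exponentially weighted moving average: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = []
    state = values[0]  # initialise with the first observation
    for x in values:
        state = alpha * x + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

def fit_threshold(train_values, alpha=0.3, k=3.0):
    """Assumed rule: mean of the smoothed training data plus k standard deviations."""
    smoothed = ewma(train_values, alpha)
    return statistics.mean(smoothed) + k * statistics.pstdev(smoothed)

def detect(values, threshold, alpha=0.3):
    """Flag each time step whose smoothed value exceeds the threshold."""
    return [s > threshold for s in ewma(values, alpha)]

# Made-up packet error rate samples: jamming-free training data, then a test trace.
train_per = [0.02, 0.03, 0.02, 0.03, 0.02]
test_per = [0.02, 0.03, 0.60, 0.70, 0.65]

threshold = fit_threshold(train_per)
flags = detect(test_per, threshold)
print(threshold, flags)
```

Only the time steps flagged here would need to be passed on to the heavier classifier, which is the motivation the abstract gives for the two-level design.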
- Item A Framework for Detecting Multiple Cyberattacks in IoT Environment (Addis Ababa University, 2025-02-25) Yonas Mekonnen; Mesfin Kifle (PhD). The Internet of Things refers to the growing trend of embedding ubiquitous and pervasive computing capabilities through sensor networks and internet connectivity. The growth of newly evolved cyberattacks, changing network patterns and the heterogeneous nature of cyberattack trends have become a form of warfare across the globe and make it challenging to apply single-layer cyberattack detection techniques to the Internet of Things. This research work identified the lack of a cyberattack detection framework as the major gap in detecting multiple cyberattacks, such as denial of service, distributed denial of service, and Mirai attacks, while taking multiple parameters into account at the same time. The proposed framework contains three modules: a data acquisition and preprocessing module that captures and pre-processes data and prepares it for model construction; an attack detection module, the core engine that orchestrates the detection of cyberattacks; and a third module that reports and displays the results in a dashboard. This research study used multiple parameters, including multiple attack classes, network packet patterns, and three scaler settings, namely no scaler, MinMax, and Standard. Regardless of the other parameters used, the MinMax scaler, followed by the Standard scaler, gives better detection performance than models trained with no scaler. The proposed framework was trained and evaluated with different models, namely CNN, Hybrid, FFNN, and LSTM, which achieved detection accuracies of 91.42%, 82.75%, 78.38%, and 74.83%, respectively. The CNN model gives the best results, followed by the hybrid and FFNN models.
- Item Transformer-Based Machine Translation System Model from Geez to Tigrigna (Addis Ababa University, 2024-06) Aberash Berhe; Yaregal Asabie. This thesis presents the first attempt at building a transformer-based neural machine translation system for the language pair of Geez and Tigrigna. Geez and Tigrigna are closely related Semitic languages, with Geez being the liturgical language of the Eritrean and Ethiopian Orthodox churches, and Tigrigna being a widely spoken language in Eritrea and parts of Ethiopia. Due to the lack of publicly available parallel corpora for this language pair, the thesis describes the manual collection and curation of a new Geez-Tigrigna parallel dataset, which consists of 10,362 sentence pairs. This process is detailed as it proved to be a laborious and time-consuming task given the limited availability of translated text between the two languages. The architecture of the proposed neural machine translation system is based on the transformer model, which has shown state-of-the-art performance on many language pairs. To address the challenges of translating between low-resource languages like Geez and Tigrigna, an alignment-based approach is integrated into the standard transformer architecture. This alignment mechanism aims to better capture the relationships between source and target language elements during the translation process. The word-level alignments between the parallel sentences are done manually. Experiments are conducted to compare the performance of an attention-based recurrent neural network model, a standard transformer model, and the proposed alignment-augmented transformer model. The results show that the standard transformer model achieved a BLEU score of 54%, outperforming the RNN model, which had a BLEU score of 46%. Further improvements were made by integrating the alignment mechanism into the transformer architecture, resulting in an alignment-augmented transformer model that achieved a BLEU score of 63%. 
These findings demonstrate the feasibility of building neural machine translation systems for low-resource language pairs like Geez and Tigrigna, and that the proposed alignment-based modifications to the transformer architecture can lead to significant improvements in translation quality compared to the standard transformer model.
- Item Automatic Amharic Text Catigorization (Addis Ababa University, 2007-03) Yohannes Afework; Mulugeta Libsie (PhD) Rapid developments in Information and Communication Technology are making available huge amounts of data and information. Much of this data is in electronic form (like the more than a billion documents on the World Wide Web). Usually this data does not have a standard structure like that of a relational database. Much of the data is unstructured or semi-structured and can generally be considered a text database. Text databases are showing accelerated growth throughout the world. As a result, there is an active field of study in text mining to facilitate the extraction of useful and relevant information from text databases. Text data in local languages is also increasing fast, requiring text-processing tools to be available for documents in local languages. This is true for Amharic also, as can be surmised from the recent boom of online newspapers, magazines, data in electronic storage, etc. To facilitate the retrieval of useful and relevant information from Amharic documents, a number of studies on automatic processing of Amharic text have recently been conducted. This research work in Automatic Amharic Text Categorization is an effort to contribute in this direction. Automatic classification of text data requires that documents are represented by feature words. Representing a document by relevant feature words is an important pre-processing step for automatic classification; it often determines the efficiency and accuracy of the classification. Standard pre-processing tools and methods are therefore very important for automatic classification. Because of the lack of a standard in the Amharic writing system and the unavailability of Amharic text processing tools, the focus of the research was on developing a document pre-processing scheme that facilitates efficient automatic classification of Amharic documents.
To this end, much attention was given to the processing of the source data by developing and enhancing the following tools. The tools are specific to the source data: Amharic news documents from ENA.
• A tool to correct word spelling variations, focusing on spelling variation due to pronunciation differences.
• An enhancement to the suffix and prefix removal tool developed in a previous study, so that it can perform semantic analysis before stripping affixes from words.
• A tool to correct word variations due to gender marker suffixes.
• A tool to correct word variations due to number marker suffixes.
• A tool to merge compound words that are written as separate words (when separating them may result in semantic loss).
The use of these tools (which enabled a 10 to 30% feature reduction), in addition to other tools and data reduction methods, helped to analyze the huge source data (69,684 news items after data cleaning) and measure classifier performance. Because of the high dimensionality of the source data, classifier algorithms suitable for high-dimensional data, namely Decision Tree and Support Vector Machine (SVM) classifiers, were selected for the research experiment. The open source Weka package was used for the automatic classification of the preprocessed data. Out of the many classifier algorithms available in Weka, the Logistic Model Tree (LMT) and the Library for SVM (LibSVM) classifiers were used for performance testing. Both the LMT and LibSVM classifiers showed good classification accuracy, correctly classifying 79.72% and 81.15% of the test instances into the 15 news categories, respectively. However, the computational cost of the automatic classification was very high, taking several hours on high-capacity computers (computers with 512 MB RAM and 3.7 GHz speed). The classification performance measures indicate the need for additional work in developing tools and methods for mining Amharic data.
- Item Term Re-Weighting Based Query Expansion Approach for Amharic Information Retrieval (Addis Ababa University, 2014-02) Zelalem Addis; Dereje Teferi (PhD) This research has been conducted aiming at augmenting the precision while lessening the original recall of an Amharic IR system. The main reason for performing query expansion is to provide relevant documents, as per the user's query, that can satisfy their information need. Users mostly formulate weak queries to retrieve documents and thus end up frustrated with the results returned by an IR system. Some of the causes of this type of problem are polysemous and synonymous terms, which require integrating a query reformulation strategy into the IR system. The present study has explored term re-weighting based query expansion approaches that integrate term re-weighting with statistical co-occurrence analysis, bi-gram analysis and bi-gram thesaurus methods. In this approach, the users' relevance feedback items are represented as vectors, and the similarity between them is obtained by calculating the vector similarity. The terms are then re-weighted over a single document and over the entire document set, respectively, using Rocchio's re-weighting scheme; terms are selected as query expansion terms according to their final weight and then fed to the three query expansion techniques, regardless of their position. Term re-weighting and the three proposed query expansion techniques were integrated into an information retrieval system. The test results showed that the bi-gram method outperformed the other two, scoring a 2% improvement in total F-measure. The performance of the system can be further improved by designing ontology-based query expansion in order to control expansion terms that are themselves polysemous.
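The Rocchio re-weighting step mentioned in the abstract above can be sketched as follows, using the textbook form of the formula (original query weight plus the mean of relevant document vectors minus the mean of non-relevant ones). This is a minimal sketch, not the thesis's implementation: the `alpha`, `beta`, and `gamma` values, the function names, the clipping of negative weights, and the English toy terms are all assumptions made for illustration.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Re-weight terms with the classic Rocchio formula.

    query, and each document, is a dict mapping term -> weight.
    Terms whose final weight is not positive are dropped (an assumed,
    commonly used simplification).
    """
    weights = defaultdict(float)
    for term, w in query.items():
        weights[term] += alpha * w
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, w in doc.items():
                weights[term] += coeff * w / len(docs)
    return {t: w for t, w in weights.items() if w > 0}


def expansion_terms(query, reweighted, k=2):
    """Pick the k highest-weighted terms that are not already in the query."""
    candidates = [(t, w) for t, w in reweighted.items() if t not in query]
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [t for t, _ in candidates[:k]]


# Toy vectors (hypothetical): "bank" is polysemous, feedback favours the finance sense.
query = {"bank": 1.0}
relevant = [{"bank": 0.5, "loan": 0.8}, {"bank": 0.4, "interest": 0.6, "loan": 0.4}]
non_relevant = [{"bank": 0.3, "river": 0.9}]

reweighted = rocchio(query, relevant, non_relevant)
new_terms = expansion_terms(query, reweighted)  # terms to append to the query
```

In this toy case the feedback pushes the query toward the finance sense of the ambiguous term, which is the kind of polysemy problem the abstract describes.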
- Item Coreference Resolution for Amharic Text Using Bidirectional Encoder Representation from Transformer (Addis Ababa University, 3/4/2022) Bantie, Lingerew; Assabie, Yaregal (PhD) Coreference resolution is the process of finding the expressions in a text that refer to the same entity. In coreference resolution, such expressions are called mentions. The task of coreference resolution is to cluster all similar mentions in a text based on the index of a word. Coreference resolution is used in several Natural Language Processing (NLP) applications, such as machine translation, information extraction, named entity recognition, question answering and others, to increase their effectiveness. In this work, we propose coreference resolution for Amharic text using bidirectional encoder representation from transformer (BERT). BERT is a contextual language model that generates semantic vectors dynamically according to the context of the words. The proposed system model has a training phase and a testing phase. The training phase includes preprocessing (cleaning, tokenization and sentence segmentation), word embedding, feature extraction, Amharic vocabulary, entity and mention-pair, and the coreference model. Like the training phase, the testing phase has its own steps, such as preprocessing (cleaning, tokenization and sentence segmentation), coreference resolution, and Amharic predicted mention. Word embedding is used in the proposed model to represent each word as a low-dimensional vector. It is a feature learning technique for obtaining new features across domains for coreference resolution in Amharic text. The necessary information is extracted from the word embedding and processed data as well as from Amharic characters. After extracting the important features from the training data, we build a coreference model.
Moreover, in the model, bidirectional encoder representation from transformer is used to obtain basic features from the embedding layer by extracting information from both the left and right context of a given word. To evaluate the proposed model, we conducted experiments on an Amharic dataset prepared from various reliable sources for this study. The commonly used evaluation metrics for the coreference resolution task are MUC, B3, CEAF-m, CEAF-e and BLANC. Experimental results demonstrate that the proposed model outperformed the state-of-the-art Amharic model, achieving F-measure values of 80%, 85.71%, 90.9%, 88.86% and 81.7%, respectively, on the Amharic dataset.
- Item Deep Learning Based Emotion Detection Model for Amharic Text (Addis Ababa University, 8/26/2021) Tesfu, Eyob; Belay, Ayalew (PhD) Emotions are so important that whenever we need to make a decision, we want to feel others' emotions. This is true not only for individuals but also for organizations. Due to the rapid growth of the internet, people express their emotions through social media networks, reviews, blogs, and other online channels. The need to find relevant sources, extract related sentences with emotion, summarize them, and organize them into a useful form is becoming very high. Emotion detection can play an important role in satisfying these needs. The process of emotion detection involves categorizing emotional sentences into predefined categories such as sadness, anger, disgust, happiness, and so on, based on the emotional terms that appear within the comment. Manually identifying the emotions of millions of users and aggregating them towards a rapid and efficient decision is quite a challenging task, given the rapid growth of Amharic language usage on social media. In this research work, an emotion detection model is proposed for determining the emotion expressed in Amharic texts or comments. We propose a deep learning based emotion detection model for Amharic text using a CNN with word embedding. The proposed model includes several tasks. The first task is text pre-processing, which consists of the pre-processing steps commonly used in many natural language processing applications. We pre-process the Amharic text and train a word embedding on the documents in order to generate a word embedding model. The embedding provides contextually similar words for every word in the training set; we then implement our CNN model for emotion classification. Common evaluation metrics, namely accuracy, recall, F1 score and precision, were used to measure the performance of the proposed model.
A prototype of the deep learning based emotion detection model for Amharic text was developed and used to test the system's performance on the collected Amharic text comments. With four classification categories (sadness, anger, disgust, and happiness), the study shows an accuracy of 71.11%. The model did better when the number of classes is two (positive and negative), with an accuracy of 87.46%. We also evaluated an RNN model for comparison with our CNN model.
- Item Semantic Role Labeling for Amharic Text Using Deep Learning (Addis Ababa University, 8/17/2021) Meresa, Bemnet; Assabie, Yaregal (PhD) Semantic Role Labeling (SRL), the task of automatically finding the semantic roles of each argument corresponding to each predicate in a sentence, is one of the essential problems in the research field of Natural Language Processing (NLP). SRL is a shallow semantic analysis task and an important intermediate step for many NLP applications, such as Question Answering, Machine Translation, Information Extraction and Text Summarization. Feature-based approaches to SRL are based on parsing output, often use lexical resources, and require heavy feature engineering. Errors in the parsing output can also propagate to the SRL output. Neural SRL systems, in contrast, can learn the intermediate representations from raw text, bypassing the manual feature extraction task. Recent SRL studies using Deep Learning have shown improved performance over feature-based systems for English, Chinese and other languages. Amharic exhibits typical Semitic behaviors that pose challenges to the SRL task, such as rich morphology and multiple subject-verb-object word orders. In this work, we approach the problem of SRL for the language using deep learning. The input is a raw sentence with words represented using a concatenation of word, character, and fastText-level neural word embeddings to capture the morphological, syntactic and semantic information of the words in sentences; it requires no intermediate feature extraction tasks. We have used a bi-directional Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) to capture the bi-directional (for argument identification) and long-range (for argument boundary identification) dependencies, and a conditional random field with Viterbi decoding to implement the SRL system for the language.
The system was trained on 8000 instances and tested on 2000 instances, and achieved an accuracy of 94.96% and an F-score of 81.2%. We manually annotated the sentences with their corresponding semantic roles; future work can consider improving the quality of the data and experimenting with feature representations based on contextual embeddings for improved performance.
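The Viterbi decoding step that the abstract above pairs with a conditional random field can be sketched as below. This is only an illustrative sketch of the decoding step in isolation: in a BiLSTM-CRF system the emission scores would come from the network and the transition scores from the learned CRF layer, whereas here the tag set, all scores, and the function name are hypothetical.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence and its score.

    emissions[t][j]   : score of tag j at token position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for arriving at tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    best_last = max(range(n_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


# Hypothetical tag set and scores for a three-token sentence.
tags = ["O", "B-ARG", "I-ARG"]
transitions = [
    [0.0, 0.0, -100.0],  # O -> I-ARG is heavily penalised
    [0.0, -1.0, 1.0],    # B-ARG -> I-ARG is encouraged
    [0.0, 0.0, 0.5],
]
emissions = [
    [0.2, 1.0, 0.1],
    [0.6, 0.1, 0.5],
    [0.9, 0.1, 0.2],
]
best_path, best_score = viterbi_decode(emissions, transitions)
best_tags = [tags[i] for i in best_path]
```

Scoring whole tag sequences rather than each token independently lets the transition scores discourage invalid label orders, such as an argument continuation that follows a token outside any argument.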