Computer Engineering

Recent Submissions

Now showing 1 - 20 of 224
  • Item
    Testing the Invariance of Skills and Strategies Developed by Artificial Agents under Different Sensory Modalities
    (Addis Ababa University, 2024-06) Meseret Gebremichael; Menore Tekeba (PhD)
    Artificial intelligence (AI) is the intelligence exhibited by machines or software, as opposed to the intelligence of humans or animals. An AI agent must perceive its external environment in order to understand and execute tasks. The interaction between an agent and its environment is mediated by an input sensory modality (image, sound, lidar, or another input), and learning is measured by the rewards the agent collects during reinforcement learning. We used two algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), for the experiments. PPO is a reinforcement learning algorithm used to train agents to perform tasks in an environment through trial and error. It is designed to optimize policies, the strategies or behaviors that agents use to make decisions, and it seeks the optimal policy by iteratively updating the policy parameters based on experience collected from interactions with the environment. SAC is a reinforcement learning algorithm for training agents in environments with continuous action spaces; it extends the actor-critic framework and combines the advantages of policy optimization and value estimation methods. During the experiments, we developed the agents and the environment model, selected appropriate RL methods, and designed and implemented metrics of the agents' learning performance according to their nature. We demonstrate that agents' training performance varies under different sensory modalities, with the best outcomes observed when multiple modalities are combined. Despite these differences, the study underscores agents' adaptability across sensory inputs, advancing our understanding of cross-modal learning in AI. We also show that the skills and strategies an agent learns are largely invariant to its sensory modality, and we developed a simple goal-reaching game with different input sensory information to test the agent's skills and strategies under different perceptual modalities.
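    As a rough illustration of how the two algorithms could be trained and compared on a common task, the sketch below uses Stable-Baselines3; the environment name, time-step budget, and evaluation protocol are placeholders, not the thesis's actual goal-reaching game or metrics.

```python
# Minimal sketch (not the thesis's actual setup): train PPO and SAC agents on
# the same continuous-control task and compare mean episode rewards.
# Assumes gymnasium and stable-baselines3; "Pendulum-v1" is only a placeholder.
import gymnasium as gym
from stable_baselines3 import PPO, SAC
from stable_baselines3.common.evaluation import evaluate_policy

def train_and_evaluate(algo_cls, env_id="Pendulum-v1", steps=50_000):
    env = gym.make(env_id)
    model = algo_cls("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=steps)            # trial-and-error interaction
    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=20)
    env.close()
    return mean_reward, std_reward

if __name__ == "__main__":
    for algo in (PPO, SAC):
        mean_r, std_r = train_and_evaluate(algo)
        print(f"{algo.__name__}: mean reward {mean_r:.1f} +/- {std_r:.1f}")
```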
  • Item
    Vision to Auditory Substitution for an Artificial Agent
    (Addis Ababa University, 2025-01) Semira Mohammed; Menore Tekeba (PhD)
    Sensory substitution technology converts raw visual input into auditory soundscapes, allowing individuals to “see” with sound. However, mastering this skill requires significant cognitive adaptation, extensive training, and practical application in realistic, everyday scenarios. Experiments with humans have shown the potential for auditory substitution of vision, but these efforts are limited by high costs, ethical concerns, and the risk of unintended side effects, such as impaired auditory skills. To address these challenges, this study develops a Vision-to-Auditory Sensory Substitution system for artificial agents. By simulating sensory substitution in a controlled reinforcement learning (RL) framework, this approach eliminates the need for human experimentation while retaining the ability to explore learning dynamics and decision-making behaviors. Using the Proximal Policy Optimization (PPO) algorithm, agents were trained in two OpenAI Gym environments—CarRacing-v2 and LunarLander-v2—to compare the performance of vision-based and auditory-based agents. The results demonstrate that auditory agents, despite inherent challenges in interpreting sound-encoded visual inputs, achieved mean rewards of 427.91 in the CarRacing-v2 environment and 259.85 in the LunarLander-v2 environment over 100 episodes. These findings highlight the potential of sensory substitution systems in enabling artificial agents to act effectively using auditory cues. This research contributes to advancing assistive technologies while addressing the limitations and risks of human-based sensory substitution experiments.
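    One way such a substitution could be wired into an RL pipeline is an observation wrapper that replaces the image observation with a 1-D "soundscape" vector before the PPO agent sees it. The sketch below is only illustrative; the column-energy encoding and bin count are assumptions, not the soundscape scheme used in the thesis.

```python
# Illustrative sketch only: a Gymnasium observation wrapper that replaces an
# image observation with a coarse 1-D "soundscape" vector (one amplitude per
# frequency bin / image column).  The real encoding in the thesis may differ;
# this just shows where substitution would sit in the PPO pipeline.
import numpy as np
import gymnasium as gym

class VisionToAudioWrapper(gym.ObservationWrapper):
    def __init__(self, env, n_bins=64):
        super().__init__(env)
        self.n_bins = n_bins
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=(n_bins,), dtype=np.float32)

    def observation(self, obs):
        gray = obs.mean(axis=-1) / 255.0                        # H x W grayscale
        cols = np.array_split(gray.mean(axis=0), self.n_bins)   # per-column energy
        return np.array([c.mean() for c in cols], dtype=np.float32)

# Usage sketch: env = VisionToAudioWrapper(gym.make("CarRacing-v2"))
```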
  • Item
    A Top-Down Chart Parser for Ge’ez Sentences
    (Addis Ababa University, 2024-12) Nega Meareg; Surafel Lemma (PhD)
    Parsing is the process of breaking down a sentence into its parts of speech, such as verbs, nouns, prepositions, and adjectives. Parsing plays an important role in enhancing the performance of numerous natural language processing (NLP) tasks. This work designs a top-down chart parser for Ge’ez sentences using context-free grammar (CFG) rules. We reviewed various parsing approaches for different languages to achieve our objective. However, Ge’ez parsing remains challenging due to the absence of an annotated dataset. To address this gap, we collected a dataset for sentence parsing from the well-known book Mezumere Dawit. Given the lack of pre-existing labeled data, we collaborated with a language expert (Amanuel), an instructor in the Ge’ez Department at Aksum University, to ensure linguistic accuracy. The expert prepared the dataset in a suitable format, established grammatical rules for sentence construction based on verb and noun phrase structures, and manually parsed the sentences. This study presents a top-down parsing approach for Ge’ez sentences, addressing the challenges posed by the language’s unique morphological and syntactic structures. Using a dataset of 500 sentences sourced from the book Mezumere Dawit, the parser correctly parsed 470 sentences, achieving a parsing accuracy of 94%. The results were validated through manual parsing, in which 490 sentences were parsed manually, with 470 sentences matching the parser’s output. The performance of the parser was evaluated using standard metrics, including precision, recall, and F1 score. Experimental results show the effectiveness of the proposed method in parsing Ge’ez sentences. This study contributes a foundational step towards computational processing of the Ge’ez language, with potential applications in machine translation and historical analysis.
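    The sketch below illustrates the top-down expansion idea on a toy grammar; the rules and transliterated tokens are placeholders and do not reflect the Ge’ez CFG built for this work.

```python
# Toy top-down parser sketch (not the thesis's grammar): expand the start
# symbol S depth-first against CFG rules until the token sequence is covered.
GRAMMAR = {                      # placeholder rules; the real work uses a Ge'ez CFG
    "S":  [["NP", "VP"]],
    "NP": [["N"]],
    "VP": [["V", "NP"], ["V"]],
    "N":  [["dawit"], ["mezmur"]],
    "V":  [["zemere"]],
}

def parse(symbol, tokens):
    """Yield (tree, remaining_tokens) for every top-down expansion of symbol."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if tokens and tokens[0] == symbol:
            yield symbol, tokens[1:]
        return
    for production in GRAMMAR[symbol]:             # try each rule in turn
        def expand(syms, rest):
            if not syms:
                yield [], rest
                return
            for sub, rest2 in parse(syms[0], rest):
                for tail, rest3 in expand(syms[1:], rest2):
                    yield [sub] + tail, rest3
        for children, rest in expand(production, tokens):
            yield (symbol, children), rest

sentence = ["dawit", "zemere", "mezmur"]            # placeholder transliterated tokens
trees = [t for t, rest in parse("S", sentence) if not rest]
print(trees[0] if trees else "no parse")
```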
  • Item
    A Deterministic Approach to Tri-Radical Amharic Verb Derivatives Generation
    (Addis Ababa University, 2022-02) Samrawit Kassaye; Yalemzewd Negash (PhD)
    Morphological synthesis or generation is the process of returning one or more surface forms from a sequence of underlying (lexical) forms. Today, synthesizers of different kinds have been developed for languages that have relatively wide international use. Amharic is the second most widely spoken Semitic language after Arabic, but it is under-exploited in the digital world. In this research, a rule-based approach is elaborated to morphologically derive or generate Amharic words from tri-radical verbs and finally build a rich Amharic lexicon. This work utilizes two data sources, namely an Amharic word list and tri-radical verbs. The Amharic word list file contains more than 450,000 unique Amharic words. The tri-radical verb data source contains more than 350 unique tri-radical verbs. The proposed method identifies rules from existing Amharic words after analyzing them against the tri-radical verbs. The new feature identified is applying index changing to the letters of tri-radical verbs. Index changing (adding a vowel to a consonant letter) is one of the approaches used in the morphological derivation of Amharic words from root (stem) words. After index changing of the tri-radical verbs, the index-changed words are searched for in the Amharic word list file. If an index-changed word is found directly, or as part of a word with a prefix and/or suffix, the pattern of the word with respect to the root verb and the index-changed word is captured. From the captured patterns, morphemes are extracted and rules are identified; 85,115 unique rules were identified. While identifying rules, the frequency of every rule is recorded in order to evaluate its efficiency, and a memory-based machine learning approach is applied to evaluate the frequency of the rules. Of the 85,115 rules, the prefixes of 29,776 rules and the suffixes of 32,401 rules are wrong, and 11,390 rules are discarded by the wrong index-changing process. The identified rules showed an accuracy of 0.99, an average precision of 0.88, and an average recall of 0.85. Based on these rules, a comprehensive set of derivatives for tri-radical Amharic verbs was generated, resulting in a rich Amharic lexicon.
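    A minimal sketch of the index-changing idea is shown below: the radicals of a tri-radical root are interleaved with a vowel template and combined with affixes. The root, templates, and affixes are illustrative placeholders, not rules extracted by this work.

```python
# Illustrative sketch (placeholder root/templates, not the extracted rules):
# derive surface forms from a tri-radical root by interleaving a vowel
# template with the radicals ("index changing") and attaching affixes.
def apply_template(radicals, vowels):
    """Interleave consonant radicals with a vowel pattern, e.g. s-b-r + e_e -> seber."""
    padded = vowels + [""] * (len(radicals) - len(vowels))
    return "".join(cons + vow for cons, vow in zip(radicals, padded))

def derive(radicals, rules):
    """Apply (prefix, vowel_template, suffix) rules to one tri-radical root."""
    return [prefix + apply_template(radicals, template) + suffix
            for prefix, template, suffix in rules]

root = ["s", "b", "r"]                      # placeholder tri-radical root
rules = [("", ["e", "e"], "e"),             # -> sebere
         ("te", ["e", "e"], "e"),           # -> tesebere
         ("", ["e", "a"], "i")]             # -> sebari
print(derive(root, rules))
```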
  • Item
    Automated Construction of a New Dataset for Histopathological Breast Cancer Images
    (Addis Ababa University, 2024-01) Kalkidan Kebede; Fitsum Assamnew (PhD)
    Cancer is a medical condition where cells grow uncontrollably and can spread to other parts of the body, posing a significant global health challenge. Among women worldwide, breast cancer is the most frequently diagnosed cancer and the leading cause of cancer-related deaths. Automated classification of breast cancer has been extensively studied, particularly in differentiating types, subtypes, and stages. However, simultaneous classification of subtypes with stages, such as Lobular Carcinoma In Situ (LCIS) and Invasive Lobular Carcinoma (ILC), remains challenging due to limited data availability. This research aims to address this gap by generating a new dataset that includes these unclassified subtypes with staging, utilizing existing datasets as primary sources. Labels for ductal and lobular carcinoma from the BreakHis dataset and invasive and in situ carcinoma labels from the Yan et al. dataset are used to train models for generating the new dataset. To achieve this, two separate ensemble models are trained using distinct datasets. The first ensemble model classifies ductal and lobular carcinoma using the BreakHis dataset. The second ensemble model classifies invasive and in situ carcinoma using the Yan et al. dataset. Both models are then used to extract a new dataset through soft voting techniques. The extracted labels include Ductal Carcinoma In Situ (DCIS), Invasive Ductal Carcinoma (IDC), LCIS, and ILC. This approach aims to provide a more comprehensive classification system by leveraging labels from both datasets. To validate the newly extracted labels, three pathologists were given randomly extracted images from the Yan et al. dataset test set. The pathologists agreed with the model outputs on 87.5% of the samples. Subsequently, the newly generated dataset was used to classify DCIS, IDC, LCIS, and ILC with an accuracy of 76.06%.
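    The sketch below illustrates the soft-voting and label-combination step on hypothetical probabilities; the arrays, class names, and the way the two label axes are merged are placeholders rather than the trained ensembles.

```python
# Sketch (placeholder arrays, not the trained ensembles): each ensemble votes
# softly within its own label axis, and the two axes are combined into one of
# DCIS, IDC, LCIS, ILC.
import numpy as np

def soft_vote(probs):
    """Average (n_models, n_samples, n_classes) probabilities across models."""
    return np.mean(probs, axis=0)

def combine_labels(p_type, p_stage):
    """p_type: P(ductal, lobular); p_stage: P(in situ, invasive) per sample."""
    type_names, stage_names = ["ductal", "lobular"], ["in situ", "invasive"]
    return [f"{stage_names[s]} {type_names[t]} carcinoma"
            for t, s in zip(p_type.argmax(axis=1), p_stage.argmax(axis=1))]

# Hypothetical soft-voted probabilities for 2 images:
p_type  = soft_vote(np.array([[[0.8, 0.2], [0.3, 0.7]],
                              [[0.7, 0.3], [0.2, 0.8]]]))
p_stage = soft_vote(np.array([[[0.9, 0.1], [0.2, 0.8]],
                              [[0.8, 0.2], [0.3, 0.7]]]))
print(combine_labels(p_type, p_stage))
# -> ['in situ ductal carcinoma' (DCIS-like), 'invasive lobular carcinoma' (ILC-like)]
```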
  • Item
    Spectrum Occupancy Prediction Using Deep Learning Algorithms
    (Addis Ababa University, 2024-07) Addisu Melkie; Getachew Alemu (PhD)
    The fixed spectrum allocation (FSA) policy causes a waste of valuable and limited natural resources because a significant portion of the spectrum allocated to users is unused. With the exponential growth of wireless devices and the continuous development of new technologies demanding more bandwidth, there is a significant spectrum shortage under current policies. Dynamic spectrum access (DSA), implemented in a cognitive radio network (CRN), is an emerging solution to meet the growing demand for spectrum; it promises to improve spectrum utilization by enabling secondary users (SUs) to utilize spectrum that is allocated to primary users (PUs) but left unused. CRNs provide capabilities for spectrum sensing, decision-making, sharing, and mobility. Spectrum sharing relies on spectrum usage patterns obtained from spectrum occupancy prediction to determine the channel state as “idle” or “busy”. This study addresses the limitations of previous studies by implementing a comprehensive approach that encompasses reliable spectrum sensing, identification of candidate spectrum bands, long-term adaptive prediction modeling, and quantification of the improvements achieved by the prediction model. A Long Short-Term Memory (LSTM) deep learning (DL) model is proposed to address the challenge of capturing temporal dynamics in sequential inputs. The LSTM model leverages a gating mechanism to regulate information flow within the network, allowing it to learn and model long-term temporal dependencies effectively. The dataset used for this study was obtained from a real-world spectrum measurement employing the Cyclostationary Feature Detection (CFD) approach in the GSM900 mobile network uplink band, spanning a frequency range of 902.5 to 915 MHz over five consecutive days. The dataset comprises a total of 225,000 data points. Analysis of the five-day spectrum measurement data yields an average spectrum utilization of 20.47%. The proposed model predicted the spectrum occupancy state 5 hours ahead with an accuracy of 99.45%, improved spectrum utilization from 20.47% to 98.28%, and reduced sensing energy to 29.39% compared to real-time sensing.
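    A minimal sketch of an LSTM occupancy predictor is shown below; the window length, layer sizes, and randomly generated idle/busy sequence are placeholders for the real GSM900 measurements and the thesis's configuration.

```python
# Sketch (placeholder data/hyperparameters): an LSTM that maps a window of past
# idle(0)/busy(1) sensing decisions to the probability the next slot is busy.
import numpy as np
import tensorflow as tf

WINDOW = 32
# Hypothetical occupancy sequence; the thesis uses real GSM900 uplink measurements.
seq = (np.random.rand(5000) < 0.2).astype("float32")
X = np.stack([seq[i:i + WINDOW] for i in range(len(seq) - WINDOW)])[..., None]
y = seq[WINDOW:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, 1)),
    tf.keras.layers.LSTM(64),                       # gated memory of past slots
    tf.keras.layers.Dense(1, activation="sigmoid")  # P(next slot is busy)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=128, validation_split=0.2, verbose=0)
print("next-slot busy probability:", float(model.predict(X[-1:], verbose=0)[0, 0]))
```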
  • Item
    DEACT Hardware Solution to Rowhammer Attacks
    (Addis Ababa University, 2024-05) Tesfamichael Gebregziabher; Mohammed Ismail (Prof.); Fitsum Assamnew (PhD)
    Dynamic Random-Access Memory (DRAM) technology has advanced significantly, resulting in faster access times and increased storage capacities by shrinking memory cells and tightly packing them on a chip. However, as DRAM scaling continues, it presents new challenges and considerations that need to be addressed. Smaller memory cells and the proximity between them have led to circuit disturbance errors, such as the Rowhammer problem. These errors can be exploited by attackers to induce bit flips and gain unauthorized access to systems, posing a significant security threat. In this research, we propose DEACT, a counter-based hardware mitigation approach designed to tackle the Rowhammer problem in DRAM. It moves all frequently accessed rows to a safety sub-array, where hot rows are maintained so that further activations no longer disturb their original neighbors, effectively eliminating the vulnerability. Furthermore, our counter implementation requires a smaller chip area compared to existing solutions. We also introduce DDRSHARP, a cycle-accurate DRAM simulator that simplifies the configuration and evaluation of various DRAM standards. DDRSHARP provides over 1.8x reduction in simulation time compared to contemporary simulators. Its performance is optimized by avoiding infeasible iterations, minimizing branch instructions, caching repetitive calculations, and other optimizations.
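    The sketch below is a toy software model of the counter-based idea, not the hardware design: per-row activation counters with migration of hot rows to a safe sub-array once a placeholder threshold is crossed.

```python
# Toy software model (not the hardware design): per-row activation counters
# with migration of "hot" rows to a safe sub-array once a threshold is crossed.
from collections import defaultdict

THRESHOLD = 5            # placeholder; a real design derives this from DRAM specs

class DeactModel:
    def __init__(self):
        self.counters = defaultdict(int)
        self.safe_subarray = set()

    def activate(self, row):
        if row in self.safe_subarray:
            return "safe"                      # hot row already isolated
        self.counters[row] += 1
        if self.counters[row] >= THRESHOLD:
            self.safe_subarray.add(row)        # migrate before neighbors flip
            del self.counters[row]
            return "migrated"
        return "normal"

dram = DeactModel()
for _ in range(6):
    status = dram.activate(0x1A2B)
print(status, sorted(hex(r) for r in dram.safe_subarray))
# -> after repeated activations the row ends up in the safe sub-array
```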
  • Item
    Addressing User Cold Start Problem in Amharic YouTube Advertisement Recommendation Using BERT
    (Addis Ababa University, 2024-06) Firehiwot Kebede; Fitsum Assamnew (PhD)
    With the rapid growth of the internet and smart mobile devices, online advertising has become widely accepted across various social media platforms. These platforms employ recommendation systems to personalize advertisements for individual users. However, a significant challenge for these systems is the user cold-start problem, where recommending items to new users is difficult due to the lack of historical user preferences in a content-based recommendation system. To address this issue, we propose an Amharic YouTube advertisement recommendation system for unsigned YouTube users, for whom there is no user information such as past preferences or personal details. The proposed system uses content-based filtering techniques and leverages Sentence Bidirectional Encoder Representations from Transformers (SBERT) to establish sentence semantic similarity between YouTube video titles, descriptions, and advertisement titles. For this research, 4,500 data items were collected and preprocessed from YouTube via the YouTube API, along with 500 advertisement titles from advertising and promotional companies. Random samples from these datasets were annotated for evaluation purposes. Our proposed approach achieved 70% accuracy in recommending semantically related Amharic advertisements (Ads) for the corresponding YouTube videos with respect to the annotated data. At a 95% confidence interval, our system demonstrated an accuracy of 58% to 76% in recommending Ads that are relevant to new users who have no prior interaction history with the Ads on the platform. This approach also significantly enhances privacy by reducing the need for extensive data sharing.
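    The sketch below shows the SBERT matching step in outline: embed the video text and the ad titles, then rank ads by cosine similarity. The checkpoint name and Amharic strings are assumptions for illustration, not the model or data used in this work.

```python
# Sketch (placeholder model/strings): embed video text and ad titles with a
# multilingual SBERT checkpoint and rank ads by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # assumed checkpoint

video_text = "የእግር ኳስ ጨዋታ ትንታኔ"          # placeholder YouTube title + description
ad_titles = ["የስፖርት ትጥቅ ቅናሽ", "አዲስ የባንክ አገልግሎት", "የእግር ኳስ ማሊያ ሽያጭ"]

video_emb = model.encode(video_text, convert_to_tensor=True)
ad_embs = model.encode(ad_titles, convert_to_tensor=True)

scores = util.cos_sim(video_emb, ad_embs)[0]          # similarity to each ad
ranked = sorted(zip(ad_titles, scores.tolist()), key=lambda x: -x[1])
for title, score in ranked:
    print(f"{score:.3f}  {title}")
```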
  • Item
    Multimodal Amharic Fake News Detection using CNN-BiLSTM
    (Addis Ababa University, 2024-06) Mekdim Tessema; Fitsum Assamnew
    With the growth of internet accessibility, the number of social media users in Ethiopia has increased rapidly. This created an easy ground for the transmission of information between people. On the flip side, it became a hub for the fabrication and propagation of fake news. Fake news that is available online has the potential to cause significant issues for both individuals and society as a whole. We propose a multimodal fake news detection approach for Amharic on social media that combines textual and visual features. Genuine and fake news data were collected from social media to create a multimodal Amharic news dataset. The collected data were preprocessed to retrieve textual and visual features using a Bidirectional Long Short-Term Memory (BiLSTM) network and a Convolutional Neural Network (CNN), respectively. The two sets of features were then concatenated and used to train our multimodal fake news detection model. Our proposed method achieved 90% accuracy and 94% precision. Compared to the state-of-the-art unimodal fake news detection for Amharic, our proposed model achieved a 4% increase in accuracy and a 7% increase in precision in fake news detection performance.
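    A minimal PyTorch sketch of the fusion idea is given below: a BiLSTM branch for token embeddings, a small CNN branch for the image, concatenation of the two feature vectors, and a two-class classifier on top. All layer sizes are placeholders, not the thesis's architecture.

```python
# Sketch (placeholder dimensions, not the thesis's architecture): BiLSTM text
# branch + CNN image branch, concatenated and classified as fake/genuine.
import torch
import torch.nn as nn

class MultimodalFakeNews(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=128, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.cnn = nn.Sequential(                        # tiny image encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())       # -> (batch, 32)
        self.classifier = nn.Linear(2 * hidden + 32, 2)  # fake / genuine

    def forward(self, tokens, image):
        _, (h, _) = self.bilstm(self.embed(tokens))      # h: (2, batch, hidden)
        text_feat = torch.cat([h[0], h[1]], dim=1)       # (batch, 2*hidden)
        img_feat = self.cnn(image)                       # (batch, 32)
        return self.classifier(torch.cat([text_feat, img_feat], dim=1))

model = MultimodalFakeNews()
logits = model(torch.randint(0, 20000, (4, 50)), torch.randn(4, 3, 128, 128))
print(logits.shape)   # torch.Size([4, 2])
```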
  • Item
    Anomaly-Augmented Deep Learning for Adaptive Fraud Detection in Mobile Money Transactions
    (2024-06) Melat Kebede; Bisrat Derebssa (PhD)
    Mobile Money, a revolutionary technology, enables individuals to manage their bank accounts entirely via their mobile devices, allowing for transactions like bill payments with unmatched ease and efficiency. This innovation has significantly reshaped financial landscapes, particularly in developing countries with limited access to traditional banking, by promoting financial inclusion and driving economic opportunity. However, the rapid growth of mobile money services has introduced significant challenges, such as fraud, where unauthorized individuals manipulate the system through various scams, creating serious risks that lead to financial losses and undermine trust in the system. We propose a fraud detection model that integrates deep learning techniques to identify fraudulent transactions and adapt to the dynamic behaviors of fraudsters in mobile money transactions. Given the private nature of financial data, we utilized a synthetic dataset generated using the PaySim simulator, which is based on a company in Africa. We evaluated three deep learning architectures, namely the Restricted Boltzmann Machine (RBM), Probabilistic Neural Network (PNN), and Multi-Layer Perceptron (MLP), for fraud detection, emphasizing feature engineering and class distribution. The MLP achieved 95.70% accuracy, outperforming the RBM (89.91%) and PNN (73.36%) across various class ratios and on both the original and feature-engineered datasets. Among various techniques for anomaly detection, the Auto-Encoder consistently outperformed others, such as the Isolation Forest and Local Outlier Factor, achieving an accuracy of 82.85%. Our hybrid model employed a feature augmentation approach, integrating prediction scores from an Autoencoder model as additional features. These scores were then fed into the Multi-Layer Perceptron (MLP) model along with the original dataset. This hybrid approach achieved 96.56% accuracy, 97.62% precision, 84.16% recall, and a 90.39% F1-score, outperforming the standalone MLP. The hybrid model achieved an accuracy of 73.33% on an unseen dataset, a 3.9% increase over the MLP model's 69.41% accuracy, demonstrating its enhanced ability to capture and adapt to evolving fraud patterns. This study finds that the hybrid model's enhanced performance highlights the significance of anomaly detection and feature engineering in improving fraud detection.
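    The sketch below illustrates the feature-augmentation step with scikit-learn stand-ins: fit a reconstruction model on the transaction features, append the per-sample reconstruction error as an anomaly-score column, and train an MLP on the augmented matrix. The data and model sizes are placeholders.

```python
# Sketch (synthetic data, scikit-learn stand-ins for the thesis's models):
# 1) fit an autoencoder-style regressor that reconstructs its input,
# 2) append per-transaction reconstruction error as an anomaly-score feature,
# 3) train an MLP classifier on the augmented feature matrix.
import numpy as np
from sklearn.neural_network import MLPRegressor, MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                      # placeholder transaction features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=2000) > 2.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

ae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=500, random_state=0)
ae.fit(X_tr, X_tr)                                  # bottleneck reconstruction

def augment(X_part):
    err = ((ae.predict(X_part) - X_part) ** 2).mean(axis=1, keepdims=True)
    return np.hstack([X_part, err])                 # anomaly score as extra column

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(augment(X_tr), y_tr)
print("accuracy with anomaly-score feature:", clf.score(augment(X_te), y_te))
```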
  • Item
    Training Stability of Multi-modal Unsupervised Image-to-Image Translation for Low Image Resolution Quality
    (Addis Ababa University, 2023-05) Yonas Desta; Bisrat Derebssa (PhD)
    The ultimate objective of unsupervised image-to-image translation is to find the relationship between two distinct visual domains. The major difficulty of this task is that a single input image can map to several alternative outputs. In a multimodal unsupervised image-to-image translation model, there exists a common latent space representation shared across images from the different domains. The model exhibits one-to-many mapping and is able to produce several outputs from a single source image. One of the challenges with the multimodal unsupervised image-to-image translation model is training instability, which occurs when the model is trained on a dataset of low-quality images, such as 128x128. During this instability, the generator loss decreases slowly because the generator struggles to find a new equilibrium. To address this limitation, we propose spectral normalization as a weight normalization method that limits the fitting capacity of the network in order to stabilize the training of the discriminator. The Lipschitz constant is the single hyperparameter that was adjusted. Our experiments used two different datasets. The first dataset contains 5,000 images, and we conducted two separate experiments with 5 and 10 epochs. In 5 epochs, our proposed method reduced generator losses by 5.049% on average and discriminator losses by 2.882% on average. In addition, in 10 epochs, generator losses decreased by 5.032% and discriminator losses by 2.864% on average. The second dataset contains 20,000 images, and we again ran experiments with 5 and 10 epochs. Over 5 epochs, our proposed method reduced generator losses by 4.745% on average and discriminator losses by 2.787% on average. Furthermore, in 10 epochs, the average total training loss was reduced, with generator losses down by 3.092% and discriminator losses by 2.497%. In addition, during translation, our approach produces output images that are more realistic than those of the baseline multimodal unsupervised image-to-image translation model.
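    The sketch below shows where spectral normalization would be applied in PyTorch: each discriminator layer is wrapped with spectral_norm so that the largest singular value of its weight matrix is constrained during training. The discriminator itself is a placeholder, not the translation model's discriminator.

```python
# Sketch (placeholder discriminator, not the MUNIT-style architecture): spectral
# normalization constrains each layer's largest singular value, stabilizing
# discriminator training on low-resolution (e.g. 128x128) inputs.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(in_ch, out_ch):
    return spectral_norm(nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1))

discriminator = nn.Sequential(
    sn_conv(3, 64), nn.LeakyReLU(0.2),
    sn_conv(64, 128), nn.LeakyReLU(0.2),
    sn_conv(128, 256), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    spectral_norm(nn.Linear(256, 1)),          # real/fake score
)

scores = discriminator(torch.randn(2, 3, 128, 128))
print(scores.shape)        # torch.Size([2, 1])
```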
  • Item
    Amharic Hateful Memes Detection on Social Media
    (Addis Ababa University, 2024-02) Abebe Goshime; Yalemzewd Negash (PhD)
    A hateful meme is defined as any expression that disparages an individual or a group on the basis of characteristics like race, ethnicity, gender, sexual orientation, nationality, religion, or other characteristics. It has grown to be a significant issue for all social media platforms. Ethiopia’s government has increasingly relied on the temporary closure of social media sites, but such measures cannot be a permanent solution, so an automatic detection system should be designed. These days, there are plenty of ways to communicate and hold conversations in chat spaces and on social media, such as text, image, audio, text with image, and image with audio. Memes are a new and exponentially growing form of content on social media that blends words and images to convey ideas; the message can become unclear to the audience if either component is absent. Previous research on the identification of hate speech in Amharic has focused primarily on textual content. We design a deep learning model that automatically filters hateful memes in order to reduce hateful content on social media. Our model consists of two fundamental components: one for textual features and the other for visual features. For textual features, we extract text from memes using optical character recognition (OCR). The OCR extraction works pixel-wise, and the morphologically complex nature of the Amharic language affects the performance of the system, producing incomplete or misspelled words; this can limit the detection of hateful memes. To work effectively with OCR-extracted text, we employed a word embedding method that captures the syntactic and semantic meaning of a word. An LSTM is used to learn long-distance dependencies between word sequences in short texts. The visual data was encoded using an ImageNet-trained VGG-16 convolutional neural network. In the experiments, the input to the Amharic hateful meme detection classifier combines textual and visual data. The maximum precision was 80.01 percent. When compared to state-of-the-art approaches using memes as a feature with CNN-LSTM, an average F-score improvement of 2.9% was attained.
  • Item
    Impact of Normalization and Informal Opinionated Features on Amharic Sentiment Analysis
    (Addis Ababa University, 2024-01) Abebaw Zewudu; Getachew Alemu (PhD)
    Sentiment analysis is the computational study of people’s ideas, attitudes, and feelings concerning an object, expressed via social media networks. To analyze the sentiment of such textual content, previous studies relied on formal lexicons and emoji with semantic and syntactic information as features. However, informal language is now used to express opinions the majority of the time. It is challenging to create embedding features from unlabeled Amharic text files due to morphological difficulties and the informal, unstructured nature of Amharic informal texts. Although normalization algorithms have been developed to convert informal language into its standard form, their impact on tasks such as sentiment analysis remains unknown. To address the challenge of Amharic sentiment analysis, we apply state-of-the-art solutions, such as normalization and the embedding of opinionated Amharic informal text with lowered word-frequency parameters as automatic features, in CNN-Bi-LSTM approaches. Using a combination of word and character n-gram embeddings, potential information is generated as word vectors from unlabeled Amharic informal text files. In the experiments, the maximum recall was 91.67 percent. When compared to state-of-the-art approaches using a formal lexicon and emoji as features on Bi-LSTM, an average recall improvement of 2.8 was attained. According to the results, labeling with a mix of informal lexicons, formal lexicons, and emoji achieves 1.9 better accuracy than labeling with just formal lexicons and emoji.
  • Item
    Machine Learning Approach for Morphological Analysis of Tigrigna Verbs
    (Addis Ababa University, 2018-10) Gebrearegay Kalayu; Getachew Alemu (PhD)
    Morphology, in linguistics, is the study of the forms of words; it deals with the internal structure of words and word formation. Morphological analysis is a basic task of natural language processing, defined as the process of segmenting words into morphemes and analyzing word formation. It is often an initial step for various types of text analysis in any language. Rule-based and machine learning approaches are the basic mechanisms for morphological analysis. The rule-based method is popular for this analysis but has limitations in terms of the effort and time needed, because the languages have many rules for a single word, especially in the case of verbs. It is also difficult to include all words that need independent rules, which limits the ability of rule-based approaches to accommodate words that are not in the systems' databases and can affect their efficiency. In this work, a system for morphological analysis of Tigrigna verbs is designed and implemented using a machine learning approach. It is intended to automatically segment a given input verb into morphemes and give their categories based on prefix-stem-suffix segmentation. It gives the inflectional categories based on the subject and object markers of verbs, which include gender, number, and person, by detecting the correct boundaries of the morphemes. The negative, causative, and passive prefixes are also considered. The data needed for training and testing was collected from scratch and annotated manually, as the language is under-resourced. After the annotation process, an automatic method was implemented in Java to preprocess the annotated verbs and produce a list of instances for training and testing. The instance-based algorithm was used with the overlap metric, with information gain weighting (IB1-IG) and without weighting (IB1) of the features. Experiments were performed by varying the number of nearest neighbors from one up to seventeen, where the accuracies were almost saturated for both IB1 and IB1-IG. The majority class voting and the inverse distance weighted decision methods were also compared in the experiments. The best performance was obtained with IB1 using both decision methods when the number-of-nearest-neighbors parameter was small. The performance decreased as the number of nearest neighbors increased for both decision methods, but showed higher variation in the case of majority class voting. Similarly, the performance with IB1-IG was also better for smaller numbers of nearest neighbors for both decision methods and decreased when the number of nearest neighbors increased, with a larger decrease in the case of majority voting. IB1 achieved better performance than IB1-IG. The highest accuracies of 91.56% and 89.15% were achieved using IB1 and IB1-IG, respectively, with a nearest-neighbors parameter of 1 for IB1 and 2 for IB1-IG. This encouraging result reveals that the instance-based algorithm is able to automate the morphological analysis of Tigrigna verbs.
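    A small scikit-learn analogue of the IB1 / IB1-IG comparison is sketched below: plain 1-nearest-neighbor classification versus the same classifier on features rescaled by mutual information with the class (an information-gain-style weighting). The data and parameters are placeholders.

```python
# Sketch (synthetic data): compare plain k-NN (IB1-like) with k-NN on features
# weighted by mutual information with the class (IB1-IG-like).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=4,
                           random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)                      # IB1-like
plain = cross_val_score(knn, X, y, cv=5).mean()

weights = mutual_info_classif(X, y, random_state=0)            # information-gain proxy
weighted = cross_val_score(knn, X * weights, y, cv=5).mean()   # IB1-IG-like

print(f"plain k-NN: {plain:.3f}   IG-weighted k-NN: {weighted:.3f}")
```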
  • Item
    A Video Coding Scheme Based on Bit Depth Enhancement With CNN
    (Addis Ababa University, 2023-06) Daniel Getachew; Bisrat Derebssa (PhD)
    Raw or uncompressed videos consume a lot of resources in terms of storage and bandwidth. Video compression algorithms are used to reduce the size of a video, and many of them have been proposed over the years. Video coding schemes have also been proposed that work on top of existing video compression algorithms by applying downsampling prior to encoding and restoring the video to its original form after decoding, for further bitrate reduction. Downsampling can be done in spatial resolution or in bit depth. This paper presents a new video coding scheme that is based on bit-depth downsampling before encoding and uses a CNN to restore the video at the decoder. However, unlike previous approaches, the proposed approach exploits the temporal correlation that exists between consecutive frames of a video sequence by dividing the frames into key frames and non-key frames and applying bit-depth downsampling only to the non-key frames. The non-key frames are reconstructed using a CNN that takes the key frames and non-key frames as input at the decoder. Experimental results showed that the proposed bit-depth enhancement CNN model improved the quality of the restored non-key frames by an average of 1.6 dB PSNR over the previous approach before being integrated into the video coding scheme. When integrated into the video coding scheme, the proposed approach achieved better coding gain, with an average of -18.7454% in Bjøntegaard Delta measurements.
  • Item
    Amharic Speech Recognition System Using Joint Transformer and Connectionist Temporal Classification with External Language Model Integration
    (Addis Ababa University, 2023-06) Alemayehu Yilma; Bisrat Derebssa (PhD)
    Sequence-to-sequence (S2S) attention-based models are deep neural network models that have demonstrated remarkable results in automatic speech recognition (ASR) research. Among these models, the cutting-edge Transformer architecture has been extensively employed to solve a variety of S2S transformation problems, such as machine translation and ASR. This architecture does not use sequential computation, which distinguishes it from recurrent neural networks (RNNs) and gives it the benefit of a rapid iteration rate during the training phase. However, according to the literature, the overall training speed (convergence) of the Transformer is relatively slower than that of RNN-based ASR. Thus, to accelerate the convergence of the Transformer model, this research proposes a joint Transformer and connectionist temporal classification (CTC) model for an Amharic speech recognition system. The research also investigates appropriate recognition units: characters, subwords, and syllables for Amharic end-to-end speech recognition systems. In this study, the accuracy of character- and subword-based end-to-end speech recognition systems is compared and contrasted for the target language. For the character-based model with a character-level language model (LM), a best character error rate of 8.84% is reported, and for the subword-based model with a subword-level LM, a best word error rate of 24.61% is reported. Furthermore, the syllable-based end-to-end model achieves a 7.05% phoneme error rate and a 13.3% syllable error rate without integrating any language models (LMs).
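    The joint objective can be written as a weighted sum, L = λ·L_CTC + (1 − λ)·L_attention, of the CTC loss on encoder outputs and the cross-entropy loss on decoder outputs. The PyTorch sketch below uses placeholder tensor shapes and an assumed λ, not the thesis's model.

```python
# Sketch (placeholder shapes, not the thesis's model): joint loss
#   L = lambda * L_ctc + (1 - lambda) * L_attention
# combining CTC on encoder frames with cross-entropy on decoder predictions.
import torch
import torch.nn.functional as F

LAMBDA = 0.3                     # placeholder CTC weight
BATCH, T_ENC, T_DEC, VOCAB = 4, 120, 20, 300

enc_log_probs = F.log_softmax(torch.randn(T_ENC, BATCH, VOCAB), dim=-1)  # (T, N, C)
dec_logits = torch.randn(BATCH, T_DEC, VOCAB)
targets = torch.randint(1, VOCAB, (BATCH, T_DEC))     # label 0 reserved as CTC blank

ctc_loss = F.ctc_loss(enc_log_probs, targets,
                      input_lengths=torch.full((BATCH,), T_ENC),
                      target_lengths=torch.full((BATCH,), T_DEC),
                      blank=0)
att_loss = F.cross_entropy(dec_logits.reshape(-1, VOCAB), targets.reshape(-1))

joint_loss = LAMBDA * ctc_loss + (1 - LAMBDA) * att_loss
print(float(joint_loss))
```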
  • Item
    Explainable Rhythm-Based Heart Disease Detection from ECG Signals
    (Addis Ababa University, 2023-06) Dereje Degeffa; Fitsum Assamnew (PhD)
    Healthcare decision support systems must function with confidence, trust, and a functional understanding. Much research has been done to automate the identification and classification of cardiovascular conditions from electrocardiogram (ECG) signals. One such area of research is the use of deep learning (DL) for the classification of ECG signals. However, DL models do not provide information on why they reached their final decision, which makes it difficult to trust their output in a medical environment. In order to resolve this trust issue, research is being done to explain the decisions a DL model arrives at. Some approaches have improved the interpretability of DL models using the Shapley value (SHAP) technique; however, SHAP explanations are computationally expensive. In this research, we develop a deep learning model that detects five rhythm-based heart diseases and incorporates explainability. We employ the visual explainers Grad-CAM and Grad-CAM++ as the explainability framework; these explainers are relatively lightweight and can be executed quickly on a standard CPU or GPU. Our model was trained using 12-lead ECG signals from the large PTB-XL dataset. We used 3,229 ECG records to train the model, 404 ECG records to validate it, and 403 ECG records to test it. Our model was effective, with a classification accuracy of 0.96 and an F1 score of 0.88. To evaluate the explainability, we gave ten randomly selected outputs to two domain experts. The two experts agreed with at least 80% of the explanations given to them, and in the explanations that were not completely accepted by the experts, many of the 12 leads were still correctly explained. This shows that the use of visual explainability such as Grad-CAM++ could be useful in the diagnosis of heart diseases. The outcome of this evaluation suggests that our model's output is, on average over the ten sample cases, 80% correct and consistent with the evaluation of the two experts.
  • Item
    Comparative Study of Machine Learning Algorithms for Smishing SMS Detection Model from CDR Dataset
    (Addis Ababa University, 2023-05) Samson Akele; Yalemzewd Negash (PhD)
    Phishing is becoming a significant threat to online security; it spreads through a variety of channels, like email, SMS, or even phone calls, to gather crucial profile data about victims. Although numerous anti-phishing measures have been created to halt the spread of phishing, it remains an unresolved issue. Smishing is a phishing attack that uses a mobile device's Short Messaging Service (SMS) to obtain the victim's credentials. Employing an automated detection system helps improve identification and stop smishing before it affects targeted companies and third parties. A smishing SMS detection framework based on Call Detail Record (CDR) data is important for early monitoring by experts and service providers in screening this kind of phishing attack; it provides more accuracy, automates detection, and keeps individuals safe. Many mobile phone users are victimized every year after mistakenly interpreting the lures, so an accurate smishing detection system is helpful for organizations and related third parties that are highly affected by smishing. This thesis compares machine learning algorithms for a smishing SMS detection model. Six supervised machine learning classifiers, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Decision Tree (DT), Naive Bayes (NB), Random Forest (RF), and Logistic Regression (LR), which are widely recommended by scholars, are compared on their performance in detecting smishing SMS, and the results obtained show that these algorithms are efficient at detecting smishing. Ten-fold cross-validation based on correlation algorithms is used for classification and implementation. The research collected CDR data from which 33 distinct features were extracted initially; relevant features were selected, unnecessary and irrelevant information was eliminated, and different preprocessing methods, such as feature selection and reshaping of the data, were performed for the purpose of this study. As a result, the RF algorithm with cross-validation (CV), which scored 90.1% accuracy, is determined to be the best classifier, followed by KNN and DT, which scored 89.6% and 88.8%, respectively. Using cross-validation, the SVM algorithm performs inaccurately and exceeds the desired detection delay by more than an hour during training time. This outcome reflects the RF algorithm's superior capacity to accurately handle vast amounts of data, form decision trees at random, and prevent overfitting by employing random subsets of features to create smaller trees.
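    The comparison protocol can be sketched with scikit-learn as below: each of the six classifiers is evaluated with 10-fold cross-validation on the same feature matrix. The synthetic data stands in for the CDR-derived features, and the default hyperparameters are assumptions.

```python
# Sketch (synthetic data standing in for CDR features): 10-fold cross-validated
# accuracy for the six classifiers compared in the thesis.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=33, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)

models = {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "DT":  DecisionTreeClassifier(random_state=0),
    "NB":  GaussianNB(),
    "RF":  RandomForestClassifier(random_state=0),
    "LR":  LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```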
  • Item
    Performance Analysis of POCO Framework Under Failure Scenario in SDN-Enabled Controller Placement
    (Addis Ababa University, 2023-02) Sefinew Getnet; Yalemzewd Negash (PhD)
    The explosive expansion of internet service providers and of present-day data traffic has been one of the main causes of the decline of deeply rooted traditional networks. One of the most recent advancements in networking is the introduction of software-defined networking (SDN), which emerged to make networks rebuildable and modifiable. The core idea of SDN is the separation of the control and data planes. When controlled by an SDN controller, SDN offers advantages in terms of flexibility, manageability, and efficiency in contrast to traditional networks. This thesis offers a performance analysis of the POCO framework under failure scenarios to address challenges such as rigidity, uncontrollability, and inadequately used network resources in the adoption of SDN networks. The network dataset of AAU was encoded and examined, and the selected platform for determining the optimal number of controllers and their ideal locations in the Metropolitan Area Network (MAN) was applied with the goal of minimizing latency and controller cost. The optimization is done in two scenarios, (I) controller placement in a failure-free setting and (II) controller placement when either nodes or controllers fail, using the Pareto Optimal Controller Placement (POCO) framework. First, the controller placements in the aforementioned scenarios for different numbers of K controllers were found and investigated. Following this, results for metrics such as controller-to-controller (C2C) and node-to-controller (N2C) latencies, fault tolerance (for either node or controller failures), and controller imbalance were obtained for both scenarios. From the results, it is recommended that the minimum number of controllers required to adopt SDN in the AAU MAN is two, considering the resiliency of controllers placed at Sidist Kilo and Arat Kilo. We also conclude that the POCO performance analysis under the failure-free scenario showed lower N2C latency compared to the worst-case scenario (node failure and up to K-1 controller failures), but the node-failure case showed lower C2C latency compared to the failure-free scenario.
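    The sketch below illustrates, on a toy topology with networkx, the kind of metric such a placement optimizes: for a candidate controller pair, the worst-case node-to-controller latency is the largest shortest-path distance from any node to its nearest controller. The topology, weights, and selection rule are placeholders, not the AAU MAN dataset or the POCO algorithm itself.

```python
# Sketch (toy topology, not the AAU MAN): evaluate a candidate controller pair
# by worst-case node-to-controller latency and controller-to-controller latency.
import itertools
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([          # placeholder links with latency weights
    ("SidistKilo", "AratKilo", 1), ("AratKilo", "Lideta", 2),
    ("Lideta", "Tulu", 3), ("SidistKilo", "Tulu", 4), ("AratKilo", "Tulu", 2)])

dist = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))

def evaluate(controllers):
    n2c = max(min(dist[v][c] for c in controllers) for v in G.nodes)
    c2c = max(dist[a][b] for a, b in itertools.combinations(controllers, 2))
    return n2c, c2c

best = min(itertools.combinations(G.nodes, 2), key=lambda cs: evaluate(cs))
print("best pair:", best, "-> (max N2C, max C2C) =", evaluate(best))
```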
  • Item
    Radiation Tolerant Power Converter Design for Space Applications
    (Addis Ababa University, 2022-07) Solomon Mamo; Leroux Paul (Prof.); Getachew Bekele (PhD); Valentijn De Smedt (Prof.)
    Radiation and extreme temperature are the main inhibitors to the use of electronic devices in space applications. Radiation challenges the normal and stable operation of power converters used as power supplies for onboard systems in satellites and spacecraft. Under these circumstances, special design approaches known as radiation hardening or radiation-tolerant design are employed. FPGAs are beneficial for developing low-cost, high-speed embedded digital controllers for power converters, but their components are highly susceptible to radiation-induced faults. In safety- and mission-critical systems, such as space systems, radiation-induced faults are a major concern. The majority of commercial off-the-shelf (COTS) FPGAs are not developed to function in high-radiation environments, with the exception of a handful of circuits that are radiation-hardened at the manufacturing process level at a very high cost overhead, making them less appealing from a performance and economic standpoint. Design-based techniques are another option for reaching the necessary level of reliability in a system design. This work investigates and designs a novel FPGA-based radiation-tolerant digital controller for DC-DC converters, with applications in space. The controller's radiation-induced failure modes were analyzed in order to develop a mitigation strategy, which included identifying the error modes and determining how existing mitigation approaches could be improved. A model-based design approach is presented for FPGA implementation and optimization of the radiation-tolerant digital controller. To validate the recommended solution strategies, fault injection campaigns are employed.