Computer Engineering

Recent Submissions

  • Item
    Training Stability of Multi-modal Unsupervised Image-to-Image Translation for Low Image Resolution Quality
    (Addis Ababa University, 2023-05) Yonas Desta; Bisrat Derbesa (PhD)
    The ultimate objective of unsupervised image-to-image translation is to find the relationship between two distinct visual domains. The major difficulty of this task is that a single input image admits several alternative outputs. A multi-modal unsupervised image-to-image translation model assumes that images from many domains share a common latent space representation. Such a model learns a one-to-many mapping and can produce several outputs from a single source image. One of the challenges with the multi-modal unsupervised image-to-image translation model is training instability, which occurs when the model is trained on a dataset of low-resolution images, such as 128x128. During unstable training, the generator loss decreases slowly because the generator struggles to find a new equilibrium. To address this limitation, we propose spectral normalization as a weight normalization method that limits the fitting ability of the network and thereby stabilizes the training of the discriminator. The Lipschitz constant was the only hyperparameter that was adjusted. Our experiments used two different datasets. The first dataset contains 5000 images, and we conducted two separate experiments on it with 5 and 10 epochs. With 5 epochs, our proposed method reduced the overall training generator losses by 5.049% on average and the discriminator losses by 2.882% on average. With 10 epochs, generator losses decreased by 5.032% and discriminator losses by 2.864% on average. The second dataset contains 20000 images, and we again ran two experiments with 5 and 10 epochs. Over 5 epochs, our proposed method reduced the overall training generator losses by 4.745% on average and the discriminator losses by 2.787% on average. Over 10 epochs, generator losses were reduced by 3.092% and discriminator losses by 2.497% on average.
In addition, during translation, our approach produces output images that are more realistic than those of the baseline multi-modal unsupervised image-to-image translation model.
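As a rough illustration of the spectral normalization idea mentioned in this abstract, the sketch below estimates the largest singular value of a weight matrix by power iteration and rescales the matrix so that this value equals a chosen Lipschitz constant. It is a minimal standard-library sketch, not the thesis implementation; the function names, the iteration count, and the `lipschitz` parameter are assumptions made for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def normalize(x):
    """Scale x to unit Euclidean length."""
    n = math.sqrt(sum(xi * xi for xi in x))
    return [xi / n for xi in x]


def spectral_norm(w, n_iters=100, seed=0):
    """Estimate the largest singular value of w by power iteration."""
    rng = random.Random(seed)
    wt = [list(col) for col in zip(*w)]  # transpose of w
    u = normalize([rng.gauss(0.0, 1.0) for _ in w])
    for _ in range(n_iters):
        v = normalize(matvec(wt, u))
        u = normalize(matvec(w, v))
    # sigma is approximately u^T W v once u and v have converged
    return sum(ui * si for ui, si in zip(u, matvec(w, v)))


def spectrally_normalize(w, lipschitz=1.0):
    """Rescale w so its largest singular value equals `lipschitz` (assumed knob)."""
    sigma = spectral_norm(w)
    return [[lipschitz * wij / sigma for wij in row] for row in w]
```

In deep learning frameworks this rescaling is normally applied to each discriminator layer at every training step, usually with a single power-iteration step per update rather than the many iterations used here.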
  • Item
    Amharic Hateful Memes Detection on Social Media
    (Addis Ababa University, 2024-02) Abebe Goshime; Yalemzewud Negash (PhD)
    A hateful meme is any expression that disparages an individual or a group on the basis of characteristics such as race, ethnicity, gender, sexual orientation, country, religion, or other characteristics. It has grown to be a significant issue for all social media platforms. Ethiopia’s government has increasingly relied on temporarily shutting down social media sites, but this cannot be a permanent solution, so an automatic detection system is needed. These days, people communicate in chat spaces and on social media through text, image, audio, text with image, and image with audio. Memes, which blend words and images to convey ideas, are a new and rapidly growing kind of social media data. The audience can be left unsure of the meaning if either the words or the image is absent. Previous research on the identification of hate speech in Amharic has focused primarily on textual content. We therefore design a deep learning model that automatically filters hateful memes in order to reduce hateful content on social media. Our model consists of two fundamental components: one for textual features and the other for visual features. For textual features, we extract text from memes using optical character recognition (OCR). Because OCR extraction is pixel-wise and the Amharic language is morphologically complex, the system may extract incomplete or misspelled words, which could limit the detection of hateful memes. To work effectively with OCR-extracted text, we employ a word embedding method that can capture the syntactic and semantic meaning of a word. An LSTM is used to learn long-distance dependencies between words in short texts. The visual data is encoded using an ImageNet-trained VGG-16 convolutional neural network.
In the experiments, the input to the Amharic hateful meme detection classifier combines textual and visual data. The maximum precision was 80.01 percent. Compared to state-of-the-art approaches that use memes as a feature on CNN-LSTM, an average F-score improvement of 2.9% was attained.
  • Item
    Impact of Normalization and Informal Opinionated Features on Amharic Sentiment Analysis
    (Addis Ababa University, 2024-01) Abebaw Zewudu; Getachew Alemu (PhD)
    Sentiment analysis is the computational study of people’s ideas, attitudes, and feelings concerning an object, as expressed through social media networks. To analyze the sentiment of such textual content, previous studies relied on formal lexicons and emoji with semantic and syntactic information as features. However, opinions are now expressed in informal language most of the time. It is challenging to create embedding features from unlabeled Amharic text files because of the morphological complexity and the informal, unstructured nature of Amharic informal texts. Although normalization algorithms have been developed to convert informal language into its standard form, their impact on tasks such as sentiment analysis remains unknown. To address the challenge of Amharic sentiment analysis, we apply state-of-the-art solutions, such as normalization and embedding of opinionated Amharic informal text with lowered word frequency parameters, as automatic features for a CNN-Bi-LSTM approach. Using a combination of word and character n-gram embeddings, potential information is generated as word vectors from unlabeled Amharic informal text files. In the experiments, the maximum recall was 91.67 percent. Compared to state-of-the-art approaches that use formal lexicons and emoji as features on Bi-LSTM, an average recall improvement of 2.8 was attained. According to the results, labeling with a mix of informal lexicons, formal lexicons, and emoji achieves 1.9 better accuracy than labeling with just formal lexicons and emoji.
  • Item
    Machine Learning Approach for Morphological Analysis of Tigrigna Verbs
    (Addis Ababa University, 2018-10) Gebrearegay Kalayu; Getachew Alemu (PhD)
    Morphology, in linguistics, is the study of the forms of words; it deals with the internal structure of words and with word formation. Morphological analysis is a basic task of natural language processing, defined as the process of segmenting words into morphemes and analyzing word formation. It is often an initial step for various types of text analysis in any language. The rule-based approach and the machine learning approach are the basic mechanisms for morphological analysis. The rule-based method is popular but is limited by the effort and time it requires, because languages have many rules for a single word, especially in the case of verbs. It is also difficult to include all words that need independent rules, so a rule-based system cannot accommodate words that are not in its database, which can also affect its efficiency. In this work, a system for morphological analysis of Tigrigna verbs is designed and implemented using a machine learning approach. It is intended to automatically segment a given input verb into morphemes and give their categories based on prefix-stem-suffix segmentation. By detecting the correct morpheme boundaries, it gives the inflectional categories based on the subject and object markers of verbs, which include gender, number, and person. The negative, causative, and passive prefixes are also considered. Because the language is under-resourced, the data needed for training and testing was collected from scratch and annotated manually. After the annotation process, an automatic method was implemented in Java to preprocess the annotated verbs and produce the list of instances for training and testing. The instance-based algorithm was used with the overlap metric, both with information gain weighting of the features (IB1-IG) and without weighting (IB1).
Experiments were performed by varying the number of nearest neighbors from one up to seventeen, at which point the accuracies had almost saturated for both IB1 and IB1-IG. The majority class voting and inverse distance weighted decision methods were also compared in the experiments. The best performance was obtained with IB1 using both decision methods when the number of nearest neighbors was small. The performance decreased as the number of nearest neighbors increased for both decision methods, but it showed higher variation in the case of majority class voting. Similarly, the performance of IB1-IG was better for smaller numbers of nearest neighbors for both decision methods and decreased as the number of nearest neighbors increased, with a larger decrease in the case of majority voting. IB1 achieved better performance than IB1-IG. The highest accuracies of 91.56% and 89.15% were achieved using IB1 and IB1-IG, respectively, with the number of nearest neighbors set to 1 for IB1 and 2 for IB1-IG. This encouraging result shows that the instance-based algorithm is able to automate the morphological analysis of Tigrigna verbs.
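The instance-based classification described above can be sketched as a nearest-neighbor search under the overlap metric, where the distance between two instances is the (optionally weighted) count of mismatching features. This is a simplified illustration, not the thesis code: the feature tuples, labels, and weights below are hypothetical, the weights merely stand in for information gain values, and only majority voting over the k nearest instances is shown.

```python
from collections import Counter


def overlap_distance(a, b, weights=None):
    """Sum the weights of the features on which instances a and b differ."""
    weights = weights or [1.0] * len(a)
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def ib1_classify(train, query, k=1, weights=None):
    """Return the majority label among the k training instances nearest to query.

    train is a list of (features, label) pairs. Passing per-feature weights
    mimics IB1-IG; leaving them out gives plain IB1.
    """
    ranked = sorted(train, key=lambda inst: overlap_distance(inst[0], query, weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy instances: three symbolic features and a class label.
train = [
    (("y", "s", "b"), "A"),
    (("t", "s", "r"), "B"),
    (("t", "k", "r"), "B"),
]
query = ("y", "k", "r")
unweighted = ib1_classify(train, query, k=1)
weighted = ib1_classify(train, query, k=1, weights=[5.0, 1.0, 1.0])
```

With equal weights the third instance is nearest (one mismatch), while a large weight on the first feature makes the first instance nearest, which shows how feature weighting can change the decision.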
  • Item
    A Video Coding Scheme Based on Bit Depth Enhancement With CNN
    (Addis Ababa University, 2023-06) Daniel Getachew; Bisrat Derebssa (PhD)
    Raw or uncompressed videos take a lot of resources in terms of storage and bandwidth. Video compression algorithms are used to reduce the size of a video, and many of them have been proposed over the years. Researchers have also proposed video coding schemes that work on top of existing video compression algorithms by applying down sampling prior to encoding and restoring the video to its original form after decoding, for further bitrate reduction. Down sampling can be done in spatial resolution or in bit depth. This paper presents a new video coding scheme that applies bit depth down sampling before encoding and uses a CNN to restore the bit depth at the decoder. Unlike previous approaches, the proposed approach exploits the temporal correlation that exists between consecutive frames of a video sequence by dividing the frames into key frames and non-key frames and applying bit depth down sampling only to the non-key frames. These non-key frames are reconstructed at the decoder using a CNN that takes the key frames and non-key frames as input. Experimental results showed that, before integration into the video coding scheme, the proposed bit depth enhancement CNN model improved the quality of the restored non-key frames by an average of 1.6 dB PSNR over the previous approach. When integrated into the video coding scheme, the proposed approach achieved better coding gain, with an average of -18.7454% in Bjøntegaard Delta measurements.
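The encoder-side step described in this abstract can be sketched as follows: key frames keep their full bit depth, while non-key frames lose their least significant bits. This is only an illustrative sketch. The 8-to-4-bit reduction, the fixed `key_interval`, and the zero-padding restoration are assumptions; in the thesis the restoration is performed by a CNN, which is not reproduced here.

```python
def reduce_bit_depth(frame, src_bits=8, dst_bits=4):
    """Drop the least significant bits of every pixel in a 2-D frame."""
    shift = src_bits - dst_bits
    return [[p >> shift for p in row] for row in frame]


def naive_restore(frame, src_bits=8, dst_bits=4):
    """Zero-padding baseline: shift pixels back up (the lost bits stay zero)."""
    shift = src_bits - dst_bits
    return [[p << shift for p in row] for row in frame]


def prepare_sequence(frames, key_interval=4):
    """Tag each frame as key or non-key and down sample only the non-key ones.

    key_interval is a hypothetical choice: every key_interval-th frame is a key frame.
    Returns a list of (is_key, frame) pairs ready to be passed to a video encoder.
    """
    prepared = []
    for i, frame in enumerate(frames):
        if i % key_interval == 0:
            prepared.append((True, frame))
        else:
            prepared.append((False, reduce_bit_depth(frame)))
    return prepared
```

The gap between `naive_restore` output and the original pixels is the information a learned bit depth enhancement model would try to recover, using the neighboring full-depth key frames as a reference.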
  • Item
    Amharic Speech Recognition System Using Joint Transformer and Connectionist Temporal Classification with External Language Model Integration
    (Addis Ababa University, 2023-06) Alemayehu Yilma; Bisrat Derebssa (PhD)
    Sequence-to-sequence (S2S) attention-based models are deep neural network models that have demonstrated remarkable outcomes in automatic speech recognition (ASR) research. Among these models, the cutting-edge Transformer architecture has been extensively employed to solve a variety of S2S transformation problems, such as machine translation and ASR. This architecture does not use sequential computation, which distinguishes it from recurrent neural networks (RNNs) and gives it the benefit of a rapid iteration rate during the training phase. However, according to the literature, the overall training speed (convergence) of the Transformer is relatively slower than that of RNN-based ASR. Thus, to accelerate the convergence of the Transformer model, this research proposes a joint Transformer and connectionist temporal classification (CTC) model for an Amharic speech recognition system. The research also investigates appropriate recognition units for Amharic end-to-end speech recognition systems: characters, subwords, and syllables. In this study, the accuracy of character- and subword-based end-to-end speech recognition systems is compared for the target language. For the character-based model with a character-level language model (LM), a best character error rate of 8.84% is reported, and for the subword-based model with a subword-level LM, a best word error rate of 24.61% is reported. Furthermore, the syllable-based end-to-end model achieves a 7.05% phoneme error rate and a 13.3% syllable error rate without integrating any language model.
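The character, word, and syllable error rates reported above are conventionally computed as the edit distance between the reference and the recognized sequence, divided by the reference length. The sketch below shows that standard definition; it is not taken from the thesis, and the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # delete r from the reference
                cur[j - 1] + 1,          # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalized by the reference length.

    Pass strings for a character error rate, or lists of words or
    syllables for a word or syllable error rate.
    """
    return edit_distance(ref, hyp) / len(ref)


cer = error_rate("abcd", "abxd")              # one substitution in four characters
wer = error_rate("a b c".split(), "a c".split())  # one deleted word out of three
```

Because the same function works on any sequence of units, switching between character, subword, and syllable scoring only changes how the transcripts are tokenized.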
  • Item
    Explainable Rhythm-Based Heart Disease Detection from ECG Signals
    (Addis Ababa University, 2023-06) Dereje Degeffa; Fitsum Assamnew (PhD)
    Healthcare decision support systems must function with confidence, trust, and a functional understanding. Much research has been done to automate the identification and classification of cardiovascular conditions from Electrocardiogram (ECG) signals. One such area of research is the use of Deep Learning (DL) for the classification of ECG signals. However, DL models do not provide information on why they reached their final decision, which makes it difficult to trust their output in a medical environment. To resolve this trust issue, research is being done to explain the decisions that DL models arrive at. Some approaches have improved the interpretability of DL models using the Shapley-value-based SHAP technique. However, SHAP explanations are computationally expensive. In this research, we develop a deep learning model that detects five rhythm-based heart diseases and incorporates explainability. We employ the visual explainers Grad-CAM and Grad-CAM++ as the explainability framework. These explainers are relatively lightweight and can be executed quickly on a standard CPU or GPU. Our model was trained using 12-lead ECG signals from the large PTB-XL dataset. We used 3,229 ECG records to train the model, 404 ECG records to validate it, and 403 ECG records to test it. Our model was effective, with a classification accuracy of 0.96 and an F1 score of 0.88. To evaluate the explainability, we gave ten randomly selected outputs to two domain experts. The two experts agreed with at least 80% of the explanations given to them. In the explanations that were not completely accepted by the experts, many of the 12 leads were still correctly explained, which shows that visual explainability methods such as Grad-CAM++ could be useful in the diagnosis of heart diseases.
The outcomes of this evaluation suggest that, on the ten sample cases, our model output is on average 80% correct and consistent with the evaluation of the two experts.
  • Item
    Comparative Study of Machine Learning Algorithms for Smishing SMS Detection Model from CDR Dataset
    (Addis Ababa University, 2023-05) Samson Akele; Yalemzewd Negash (PhD)
    Phishing is becoming a significant threat to online security, and it spreads through a variety of channels, such as email, SMS, or even phone calls, to gather crucial profile data about the victims. Although numerous anti-phishing measures have been created to halt the spread of phishing, it remains an unresolved issue. Smishing is a phishing attack that uses a mobile device's Short Messaging Service (SMS) to obtain the victim’s credentials. To alleviate this critical problem, an automated detection system can improve identification and stop an attack before it affects targeted companies and third parties. A smishing SMS detection framework based on CDR data helps early-monitoring experts and service providers screen for this kind of phishing attack, provides more accuracy, reduces detection time through automation, and keeps individuals safe. Many mobile phone users are victimized every year because they misinterpret the lures. An accurate smishing detection system is helpful for organizations and related third parties that are highly affected by smishing. This thesis compares machine learning algorithms for the smishing SMS detection model. Six supervised machine learning classifiers widely recommended by scholars, K-nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Naive Bayes (NB), Random Forest (RF), and Logistic Regression (LR), are compared on their performance in detecting smishing SMS, and the results show that these algorithms are efficient at detecting smishing. 10-fold cross-validation based on correlation algorithms is used for classification and implementation.
For the purpose of this study, Call Detail Record (CDR) data was collected and 33 distinct features were extracted initially; relevant features were then selected, unnecessary and irrelevant information was eliminated, and different preprocessing methods, such as feature selection and data shaping, were applied. As a result, the RF algorithm with Cross Validation (CV), which scored 90.1% accuracy, is determined to be the best classifier. KNN and DT come next, scoring 89.6% and 88.8%, respectively. Using cross-validation, the SVM algorithm performs inaccurately and exceeds the desired detection delay by more than an hour during training. This outcome is the result of the RF algorithm's superior capacity to accurately handle vast amounts of data, form decision trees at random, and prevent overfitting by employing random subsets of features to create smaller trees.
  • Item
    Performance Analysis of POCO Framework Under Failure Scenario in SDN-Enabled Controller Placement
    (Addis Ababa University, 2023-02) Sefinew Getnet; Yalemzewd Negash (PhD)
    One of the main causes of the demise of deep-rooted networks has been the explosive expansion of internet service providers and current data traffic. One of the most recent advancements in networking is the introduction of software-defined networking (SDN), which emerged to offer reconfigurable and modifiable features. The core idea of SDN is the separation of the control and data planes. When controlled by an SDN controller, SDN offers advantages in terms of flexibility, manageability, and efficiency in contrast to traditional networks. This thesis presents a performance analysis of the POCO framework under failure scenarios to address challenges such as rigidity, uncontrollability, and inadequately used network resources in the adoption of SDN networks. Furthermore, the network dataset of AAU was encoded and examined, and the selected platform was applied to determine the optimal number of controllers and their ideal locations in the Metropolitan Area Network (MAN), with the goal of minimizing latency and the cost of controllers. The optimization is done in two scenarios using the Pareto Optimal Controller Optimization Placement (POCO) framework: (I) failure-free controller placement and (II) placement when either nodes or controllers fail. First, the controller placements in these scenarios were found and investigated for different numbers K of controllers. Following this, different metrics, such as controller-to-controller (C2C) and node-to-controller (N2C) latencies, fault tolerance (for either node or controller failures), and controller imbalance, were evaluated for both scenarios. From the results, it is recommended that the minimum number of controllers required to adopt SDN in AAU-MAN be two, considering the resiliency of controllers placed at Sidist kilo and Arat kilo.
We also conclude that the failure-free scenario showed lower N2C latency than the worst-case scenario (node failure and up to K-1 controller failures). However, the node-failure scenario showed lower C2C latency than the failure-free scenario.
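The failure-free placement objective described above can be illustrated with a small exhaustive search: for every choice of K controller locations, compute the worst node-to-controller (N2C) latency and keep the placement that minimizes it. This is a simplified sketch under stated assumptions, not the POCO toolset itself: it optimizes a single metric rather than a Pareto front, and the four-node latency matrix is hypothetical.

```python
from itertools import combinations


def max_n2c_latency(dist, controllers):
    """Worst-case latency from any node to its nearest controller."""
    return max(min(dist[node][c] for c in controllers) for node in range(len(dist)))


def best_placement(dist, k):
    """Evaluate every k-subset of nodes and return the one with the lowest worst-case N2C latency."""
    return min(combinations(range(len(dist)), k),
               key=lambda cs: max_n2c_latency(dist, cs))


# Hypothetical latency matrix for four nodes on a line (latency = hop count).
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
one_controller = best_placement(dist, 1)
two_controllers = best_placement(dist, 2)
```

A failure scenario could be approximated in the same style by removing a controller from a candidate placement before computing the latency, then taking the worst result over all single removals.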
  • Item
    Radiation Tolerant Power Converter Design for Space Applications
    (Addis Ababa University, 2022-07) Solomon Mamo; Leroux Paul (Prof); Getachew Bekele (PhD); Valentijn De Smedt (Prof.)
    Radiation and extreme temperature are the main inhibitors for the use of electronic devices in space applications. Radiation challenges the normal and stable operation of power converters used as power supplies for onboard systems in satellites and spacecraft. In these circumstances, special design approaches known as radiation hardening or radiation-tolerant design are employed. FPGAs are beneficial for developing low-cost, high-speed embedded digital controllers for power converters, but their components are highly susceptible to radiation-induced faults. In safety- and mission-critical systems, such as space systems, radiation-induced faults are a major concern. The majority of commercial off-the-shelf (COTS) FPGAs are not developed to function in high-radiation environments, with the exception of a handful of circuits that are radiation-hardened at the manufacturing process level at a very high cost overhead, making them less appealing from a performance and economic standpoint. Design-based techniques are another option for reaching the necessary level of reliability in a system design. This work investigates and designs a novel FPGA-based radiation-tolerant digital controller for DC-DC converters, with applications in space. The controller's radiation-induced failure modes were analyzed in order to develop a mitigation strategy, which included identifying the error modes and determining how existing mitigation approaches could be improved. For FPGA implementation and optimization of the radiation-tolerant digital controller, a model-based design approach is presented. To validate the recommended solution strategies, fault injection campaigns are employed.
  • Item
    Mitigation of Memory Errors on Commodity Workstations
    (Addis Ababa University, 2023-06) Yafet Philipos; Fitsum Assamnew (PhD); Salessawi Ferede (PhD)
    Bits stored in Dynamic Random Access Memory (DRAM) can flip at random instants for various reasons, such as cosmic ray incidence, electrical noise, and temperature fluctuations. To handle these bit flips, Error Correcting Code (ECC) is integrated into many DRAM modules; such DRAMs are referred to as ECC-DRAM. One algorithm commonly used in ECC to detect double bit flips and correct single bit flips is the Single Error Correction Double Error Detection (SECDED) algorithm. However, SECDED is only available on ECC-DRAMs, so we implemented an optimized software version of SECDED suitable for non-ECC devices. In addition, to increase the number of bit flips that can be detected, we propose a novel approach called hash-based software ECC, which uses hash functions. Hash functions provide a robust means of ensuring the integrity of data because of their deterministic nature and avalanche effect. After a bit flip is detected through our method, a brute-force approach is used to correct the flipped bit or bits. Our implementation of SECDED is up to 6x faster than a direct implementation of SECDED for 1KB of data. The proposed hash-based software ECC is able to detect any number of bit flips, with an adjustable number of bit flip corrections. In this work, the hash-based software ECC is set to correct up to 3 bit flips, though it can be tuned to correct any number of flips at the cost of performance overhead. We integrated our approach into an in-memory database, and the overhead introduced was found to be less than 3% for bit-flip detection.
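The detect-then-brute-force idea behind the hash-based software ECC can be sketched as follows: a digest stored alongside the data reveals corruption, and correction tries every combination of up to `max_flips` bit positions until the digest matches again. This is a minimal sketch, not the thesis implementation. The choice of SHA-256, the function names, and the block size used in the example are assumptions; the abstract does not say which hash function or data layout was used.

```python
import hashlib
from itertools import combinations


def digest(data: bytes) -> bytes:
    """Integrity tag for a data block (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).digest()


def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of data with the bits at the given bit positions inverted."""
    buf = bytearray(data)
    for p in positions:
        buf[p // 8] ^= 1 << (p % 8)
    return bytes(buf)


def check_and_correct(data: bytes, stored_digest: bytes, max_flips: int = 3):
    """Return the verified or corrected data, or None if correction fails.

    Detection is a single hash comparison; correction searches all
    combinations of 1..max_flips bit positions by brute force.
    """
    if digest(data) == stored_digest:
        return data
    n_bits = len(data) * 8
    for n in range(1, max_flips + 1):
        for positions in combinations(range(n_bits), n):
            candidate = flip_bits(data, positions)
            if digest(candidate) == stored_digest:
                return candidate
    return None


original = b"abcd"
tag = digest(original)
corrupted = flip_bits(original, [3, 17])      # simulate two bit flips
recovered = check_and_correct(corrupted, tag)
```

The search space grows combinatorially with both block size and `max_flips`, which is consistent with the abstract's remark that correcting more flips comes at a performance cost; small protected blocks keep the brute-force step tractable.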
  • Item
    Improve HMC Based Graph Processing By Adding Compress/Decompress Unit
    (Addis Ababa University, 2022-05) Betelhem, Mengesha; Fitsum, Asmamaw (PhD)
    Graphs play an important role in various practical application areas, from social science to machine learning. However, the irregular data access pattern of graph computation poses a major challenge for graph processing. The emergence of the Hybrid Memory Cube (HMC) technology has helped graph processing accelerators overcome this issue. This hardware provides efficient bandwidth for graph computation; however, the communication traffic between memory cubes limits performance. To overcome this issue, we propose a new approach for HMC-based accelerators that adds a packet compression/decompression unit. We used Message Fusion and Tesseract as our baseline systems. In our approach, the data sent between the memory cubes is compressed before being sent into the network. In the experimental results, the proposed approach showed a 1.7x performance improvement on average over the baseline systems. In addition, the energy consumed by network transmission is reduced by 47.28% over the baseline system, and the compressor/decompressor unit takes 25% of the total area.
  • Item
    Vertically Segmented Target Zone based Audio Fingerprinting
    (Addis Ababa University, 2022-04) Mekonnen Jifar; Surafel, Lemma (PhD)
    An audio fingerprint is a set of perceptual features that uniquely identify an audio file. Audio fingerprinting has applications in broadcast monitoring, meta-data collection, royalty tracking, etc. Audio fingerprinting systems suffer considerably from noise, compression, and modifications present in the audio. Pitch shifting is one such audio modification. Common real-world scenarios where pitch-shifting occurs include radio broadcasts, DJ sets, and deliberate alterations. Since pitch-shifting scales the spectral content of the original audio, matching pitch-shifted query audio to its original unmodified version is challenging. This thesis proposes a Shazam-based audio fingerprinting system resistant to pitch-shifting. The proposed approach uses the constant-Q transform (CQT) to turn the scaling effect of pitch-shifting into a vertical translation. From the spectrogram generated by the CQT, the proposed approach picks triples of spectral peaks to encode pitch-shifting resistant fingerprint hashes. Vertically segmented target zones are employed to organize spectral peaks into triplets. By increasing the locality of the generated fingerprint hashes, vertically segmenting the spectrogram minimizes the effect of pitch-shifting. A fingerprint hashing scheme that leverages vertically segmented target zones is proposed. A total of 42,000 query audio clips and a reference database of 3000 freely available songs were used to evaluate the proposed approach as well as the chosen baseline works, Panako and Quad. The results show that the proposed approach handles pitch-shift modifications from -11% to +12%, except for modification values of -8, -3, +3, and +9 percent. Panako identified queries with -6% to +6% pitch shifts, except for modification values of -3 and +3 percent. Quad, on the other hand, can handle -12% to +7% pitch shifts with no such drops.
The proposed approach is also robust to linear speed modification from -6% to +12%, which is a significant improvement over Panako, which can only handle -4% to +8% modifications. Quad showed better robustness to linear speed modification by handling rates ranging from -16% to +12%. However, Quad took, on average, 3 times longer to query a single audio clip than the proposed approach. Moreover, the proposed approach is robust to common audio effects such as echo, tremolo, flanger, band-pass, and chorus, while Quad suffered a significant accuracy drop for chorus, flanger, and tremolo.
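The reason a constant-Q transform turns pitch-shifting into a vertical translation is that its frequency bins are spaced logarithmically, so multiplying every frequency by the same factor adds the same offset to every bin index. The sketch below checks this numerically. The minimum frequency, the 12 bins per octave, and the reading of a "+6%" shift as a frequency factor of 1.06 are assumptions for illustration, not parameters from the thesis.

```python
import math


def cqt_bin(freq, f_min=32.70, bins_per_octave=12):
    """Fractional CQT bin index of a frequency on a log-spaced axis."""
    return bins_per_octave * math.log2(freq / f_min)


def bin_shift(pitch_factor, bins_per_octave=12):
    """Vertical offset, in bins, caused by scaling all frequencies by pitch_factor."""
    return bins_per_octave * math.log2(pitch_factor)


# Three spectral peaks and the same peaks after an assumed +6% pitch shift.
peaks = [220.0, 440.0, 880.0]
factor = 1.06
offsets = [cqt_bin(f * factor) - cqt_bin(f) for f in peaks]
```

Every entry of `offsets` equals `bin_shift(factor)`, so the relative vertical distances between peaks are unchanged, which is the property that peak-triplet hashes can exploit.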
  • Item
    Machine Learning-Based Contamination Detection in Water Distribution System
    (Addis Ababa University, 2020-06) Akalewold, Fikre; Getachew, Alemu (PhD)
    Water is a necessary component of all human activities. According to the United Nations World Water Assessment Program, every day, 2 million tons of sewage, manufacturing, and agricultural waste are discharged into the world's water. Due to population demands, dwindling clean water supplies, and the limits of available water pollution management mechanisms, there is an urgent need to use computational methods to intelligently manage available water. To ensure the protection of drinking water, accurate detection of natural or deliberate pollution events in water delivery pipes is essential. Companies that supply water must ensure that it is safe to drink. To address the global issue of rising water contamination, this paper presents the design of water contamination detection models that monitor the safety of water in pipelines and flag events when concentrations of water quality variables in the pipes surpass their maximum thresholds. This paper proposes artificial neural networks, specifically Convolutional Neural Networks, for automated detection of water impurity; pictures of turbid water in the pipe are used to detect events. The deep learning algorithm achieved 96.3 percent accuracy after extensive training with a dataset of 4220 images reflecting various levels of contamination. In addition, the machine learning algorithm analyzes water turbidity and transparency levels to estimate the level of pollution in a specific sample of water. When the established model is combined with the current framework, it will provide a cost-effective way for the water company to obtain an estimate of water quality, alerting local and national governments to take action and potentially saving millions of people throughout the world.
  • Item
    Correction of Distortion in Scene Text Recognition
    (Addis Ababa University, 2022-04) Temesgen, Mekonnen; Surafel, Lemma (PhD)
    Scene texts which are found in scene images contain valuable meanings. Scene text recognition is the process of converting text regions on the image into machine readable and editable symbols. Naturally, scene texts can appear in regular or irregular layout. In scene texts irregular text is widely used. Scene text with an irregular layout is difficult to recognize because of different forms of distortions. Correcting these distortions without losing any desired information is one of the major challenges in computer vision. Different approaches are proposed to solve the problem of distortion in scene text recognition. Based on their proposed techniques, these approaches can be categorized into four categories: Character Level Strong Supervision, Rectification, Multi Direction Encoding and Attention based approaches. The state of the art is the attention-based approach which predicts characters from scene text image features by using encoder and decoder with attention methods. The performance of attention-based approaches, however, is not good mainly because they are unable to extract detail image features. The approach underperforms particularly with long scene text sequences due to their inconsistent and decreasing encoder output utilization during each decoding time step. Also, it faces the problem of attention mismatch for severely distorted texts. To tackle the problem in attention-based encoder decoder approach, we proposed global attention based mechanism with Bi-LSTM decoder which can handle any type of text distortions implicitly. The proposed approach is trained with 6,000 regular and irregular scene text images randomly taken from publicly available SYN90K synthetic datasets. The dataset is widely used to train scene text recognizers. Preprocessing tasks which are image rescaling and noise removal are performed only for training purpose. 
The proposed approach is evaluated using four classes of regular scene text image datasets and three classes of irregular scene text image datasets. It outperforms the state-of-the-art approach by an average of 1.58% on regular scene text image datasets and by an average of 1.85% on irregular scene text image datasets. In addition, incorporating the Bi-LSTM decoder increases recognition performance by an average of 5.24% for regular scene texts and by an average of 3.05% for irregular scene texts.
  • Item
    Forecasting Ethiopian Agricultural Commodity Price Using Time Series Features and Technical Indicators
    (Addis Ababa University, 2022-05) Sisay, Gebremedhin; Surafel, Lemma (PhD)
    Agricultural commodity price prediction helps the government, investors, and farmers make informed decisions. Recognizing this benefit, several researchers have proposed prediction models that use different features. However, most prediction models are affected by factors such as data type (e.g., linear and nonlinear), seasonality of commodity items, weather conditions, commodity volatility, and country economic factors. Among these factors, the most significant impediments to accurate commodity price prediction are seasonality and trend pattern. To fill this gap, we propose a model that predicts commodity prices by combining time series features and technical indicators. The prediction model is built using four machine learning algorithms: Artificial Neural Network, Extreme Learning Machine, Support Vector Machine, and Random Forest. To assess the impact of the proposed approach, we conducted two experiments using coffee and sesame datasets. The performance of the prediction models is assessed using the root mean square error (RMSE) and mean absolute error (MAE). The results show that the proposed approach improves agricultural commodity price prediction performance in all cases except the MAE for sesame when using Extreme Learning Machine. Using Artificial Neural Network, Extreme Learning Machine, Support Vector Machine, and Random Forest, the RMSE of price prediction is reduced by an average of 4.37, 4.42, 2.74, and 5.15, respectively. Among the four machine learning algorithms used in the study, Artificial Neural Network is found to be the best for enhancing the performance of agricultural commodity price prediction. We also conclude from our experimental results that considering commodity properties such as periodicity, volatility, linearity, momentum, volume, and trend improves the performance of agricultural commodity price prediction. 
To see which features contributed most to the improvement, we computed feature importance using the Random Forest algorithm. The result shows that close, high, low, open, exponential moving average (EMA), double exponential moving average (DEMA), simple moving average (SMA), truehigh, truelow, trend, seasonality, and relative strength index (RSI) are the most important features in sesame and coffee price prediction.
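
The moving-average indicators and the two error measures named in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions: the function names, the window and span parameters, and the choice to seed the EMA with the first price are not taken from the thesis.

```python
import math

def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out

def ema(prices, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).
    Seeding with the first price is an assumption of this sketch."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def dema(prices, span):
    """Double exponential moving average: 2 * EMA - EMA(EMA)."""
    first = ema(prices, span)
    second = ema(first, span)
    return [2 * a - b for a, b in zip(first, second)]

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean of the absolute differences between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

Indicator columns computed this way would sit next to the raw open, high, low, and close prices as model inputs, and RMSE and MAE would then be computed on held-out predictions.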
  • Item
    Lexicon-Stance Based Amharic Fake News Detection
    (Addis Ababa University, 2022-05) Ibrahim, Neji; Sosina, Mengistu
    Due to the noisy nature of social media content and the rapid propagation of false information, identifying and detecting fake news has become a challenging problem. In recent years, several studies have proposed using text representation techniques from content-based approaches to automatically detect fake news on social media. However, fake news has a distinct writing pattern, and capturing its distinguishing features may improve detection more than focusing solely on text representation. In this study, we propose combining stance-based features (page score, headline-to-article similarity, and headline-to-headline similarity) with lexicon-based features from text representation techniques to enhance detection performance. To build the detection model, we used three machine learning algorithms: Logistic Regression, Passive Aggressive, and Decision Tree. The proposed approach is evaluated using a newly collected Amharic fake news dataset from Facebook. Our experimental results show that the hybrid (lexicon-stance) features improve on the previous lexicon-based detection results by 4.1% in accuracy, 3% in precision, 4% in recall, and 4% in F1-score. In addition, the hybrid features improve the area under the curve from 0.982 to 0.995 by reducing the false positive rate by 4% and improving the true positive rate by 4.4%. Furthermore, we found that, among the proposed stance features, page score contributed the most to the improvement over lexicon-based fake news detection.
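
One way to picture the hybrid feature vector is shown below. The abstract does not say how the similarities are computed, so the cosine similarity over whitespace-token counts, the function names, and the feature ordering are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the token-count vectors of two texts.
    Whitespace tokenization is an assumption of this sketch."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def hybrid_features(lexicon_vector, page_score, headline, article, other_headline):
    """Append the three stance features to a lexicon-based vector.
    `page_score` is a hypothetical precomputed score for the posting page."""
    return list(lexicon_vector) + [
        page_score,
        cosine_similarity(headline, article),
        cosine_similarity(headline, other_headline),
    ]
```

The resulting vector could be passed to any of the three classifiers named in the abstract.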
  • Item
    Automated Lung Tuberculosis Detection Using Chest Radiograph Images CNN–RNN Approach
    (Addis Ababa University, 2022-03) Addis, Meshesha; Nune, Sreenivas (PhD)
    Automated identification of diseases has become vital to medical technology because of the rapid increase of the human population in different parts of the globe. A framework for automated disease detection based on image processing helps radiologists and doctors diagnose and screen for disease; it provides more accurate results, shortens diagnosis time, and decreases the mortality rate. Lung tuberculosis is a severe threat at present and is spreading globally. To mitigate this serious problem, an automated detection, identification, and diagnosis system can speed up diagnosis of the disease and impede its global spread. Many lung tuberculosis patients in low- and middle-income countries die each year because of misinterpretation during diagnosis. An accurate computer-aided diagnosis system helps doctors and radiologists interpret the chest radiographs of lung tuberculosis patients. The chest radiograph is the most widely used tool in medical diagnosis for identifying lung tuberculosis. However, the interpretation of a chest radiograph might vary from one individual to another. With correct and early diagnostic imaging, the survival rate of patients with lung tuberculosis rises significantly. The proposed method has four major components: preprocessing, lung segmentation, feature extraction, and classification. In preprocessing, image quality is enhanced using a Gaussian filter and adaptive histogram equalization. The Gaussian filter reduces noise, and adaptive histogram equalization improves image contrast. The preprocessed image is then taken as input to lung segmentation, which applies thresholding, morphological operations, and an active contour model to isolate the lung region or regions. 
The output of lung segmentation is passed to feature extraction and classification, which use the Xception and LSTM architectures. The Xception deep convolutional neural network extracts features from the whole input image. Finally, the LSTM outputs the decision on whether the image is TB positive or TB negative. The proposed computer-assisted diagnostic system for lung TB detection achieves an accuracy of 86%, precision of 90.35%, recall of 85.10%, and F1-score of 87.65%.
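
As a quick consistency check, the reported F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The helper name below is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract, expressed as fractions.
reported_f1 = f1_score(0.9035, 0.8510)
```

Rounded to two decimal places as a percentage, this agrees with the 87.65% stated in the abstract.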
  • Item
    Word Level Amharic Sign Language Recognition using Deep Learning Algorithms
    (Addis Ababa University, 2022-02) Biruk, Mengiste; Getachew, Alemu (PhD)
    In Ethiopia, the number of Deaf people is increasing rapidly. Sign language is a natural language mostly used by Deaf persons to communicate with each other. However, communication between Deaf and hearing persons is a big challenge: Deaf persons use signs, whereas hearing persons use speech or text. An efficient system is needed to convert sign to speech/text and speech/text to sign. This thesis focuses on word-level Amharic sign language recognition, translating Amharic word signs into their corresponding Amharic text using a deep learning approach. The input to the system is video frames of Amharic sign words, and the final output is Amharic text. The proposed system has three major components: preprocessing, feature extraction, and classification. Two preprocessing steps were used: cropping and RGB-to-grayscale conversion. Feature extraction was done using a deep residual network (ResNet-34), and the features were stored in .csv format. Finally, classification was done by the same deep learning algorithm, ResNet-34. The system is trained and tested using a dataset prepared solely for this thesis, covering all of the Amharic sign words. The performance of the model was measured by four metrics (precision, recall, F1 score, and accuracy). The system classifies 60 sign words and scores an overall accuracy of 95%, indicating that the classification performance of ResNet-34 is very good.
  • Item
    Acceleration of Convolutional Neural Network Training using Field Programmable Gate Arrays
    (Addis Ababa University, 2022-01) Guta, Tesema; Fitsum, Assamnew (PhD)
    Convolutional neural network (CNN) training often requires a considerable amount of computational resources. In recent years, several studies have proposed CNN inference and training accelerators, among which FPGAs have demonstrated good performance and energy efficiency. Speeding up CNN processing places additional demands on memory bandwidth, FPGA platform resources, time, and power. In addition, CNN training needs large datasets and computational power, and it is constrained by the need for improved hardware acceleration to scale beyond existing data and model sizes. In this study, we propose a procedure for energy-efficient CNN training in collaboration with an FPGA-based accelerator. We employed optimizations such as quantization, a common model compression technique, to speed up the CNN training process. Additionally, a gradient accumulation buffer is used to ensure maximum operating efficiency while maintaining the gradient descent of the learning algorithm. To validate our design, we implemented the AlexNet and VGG16 models on an FPGA board and on a laptop CPU and GPU. Our designs achieve 203.75 GOPS with the AlexNet model and 196.50 GOPS with the VGG16 model on the Terasic DE1-SoC. This, as far as we know, outperforms existing FPGA-based accelerators. Compared to the CPU and GPU, our design is 22.613X and 3.709X more energy efficient, respectively.
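
The two optimizations named in this abstract can be illustrated in software. The abstract does not specify the quantization scheme or the buffer design, so the symmetric 8-bit uniform quantizer, the averaging of accumulated gradients, and all names below are assumptions of this sketch, not the thesis implementation.

```python
def quantize(values, num_bits=8):
    """Symmetric uniform quantization to signed integers (assumed scheme).
    Returns the integer codes and the scale needed to recover the values."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]

def accumulate_and_step(weights, micro_batch_grads, lr):
    """Sum gradients from several micro-batches in a buffer, then apply
    one gradient-descent update using their average."""
    buffer = [0.0] * len(weights)
    for grads in micro_batch_grads:
        for i, g in enumerate(grads):
            buffer[i] += g
    n = len(micro_batch_grads)
    return [w - lr * b / n for w, b in zip(weights, buffer)]
```

Keeping the accumulation buffer in higher precision than the quantized values is one common reason such a buffer is used: small gradient contributions are summed before an update is applied, so they are less likely to be lost to rounding.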