A Model for Amharic Idiom Identification Using Deep Learning
Date
2025-07-02
Publisher
Addis Ababa University
Abstract
This work explores the detection of Amharic idioms using a hybrid machine learning and deep learning approach. Amharic, a Semitic language spoken in Ethiopia, has a large idiomatic vocabulary, which makes natural language processing tasks challenging.
We investigate the effectiveness of traditional machine learning algorithms, including Support Vector Machines (SVM) and Gradient Boosting, for idiom detection. Furthermore, we explore the potential of recurrent neural networks (RNNs), specifically Long Short-Term Memory (LSTM) and Bidirectional LSTM (Bi-LSTM), for capturing the sequential nature of language and enhancing idiom identification accuracy. Our research aims to advance Amharic natural language processing by developing robust and efficient idiom identification models. We present experimental results on a curated Amharic idiom dataset, compare the performance of the different algorithms, and analyze their strengths and weaknesses. To measure model performance, we use accuracy, precision, recall, and F-score. The experimental results indicate that the combination of SVM with Bi-LSTM, gradient boosting with Bi-LSTM, and Bi-LSTM alone achieved accuracies of 98.27%, 98%, and 98.18%, respectively. This study provides insights into the suitability of various machine learning approaches for Amharic idiom identification and lays the groundwork for future research in this domain.
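The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling in which 1 marks an idiomatic expression and 0 a literal one, and it takes the F-score to be the balanced F1. The function name binary_metrics and the toy labels are hypothetical and are not taken from the study or its dataset.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: label 1 means "idiom" and label 0 means "not an idiom".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # idioms found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # literals kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed idioms

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (not results from the study).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(toy_true, toy_pred))
```

Reporting precision and recall alongside accuracy matters when idioms are much rarer than literal expressions, because a model that never predicts an idiom can still score high accuracy on such data.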