School of Information Technology and Engineering
Browsing School of Information Technology and Engineering by Author "Fantahun Bogale (PhD)"
Item: Advancing Amharic Text Summarization with a Tailored Parameter-Efficient Fine-Tuning Technique (Addis Ababa University, 2025-08)
Dagim Melkie; Fantahun Bogale (PhD)

While recent progress in Large Language Models (LLMs) has revolutionized the field of Natural Language Processing (NLP), applying these models to low-resource languages such as Amharic presents considerable difficulties. Key obstacles include the scarcity of available data and the intensive computational cost associated with conventional fine-tuning methods. To overcome these issues, this thesis introduces a specialized parameter-efficient fine-tuning (PEFT) framework developed specifically for Amharic text summarization. This new framework combines a dynamic low-rank adaptation component (DyLoRA-Amharic) with an adaptive activation method (AdaptAmharic), which work together to improve the model’s flexibility and optimize its resource allocation during training. The methodology involves injecting these custom modules into the mT5-small encoder–decoder architecture, allowing dynamic adjustment of DyLoRA-Amharic ranks and AdaptAmharic activation levels based on gradient signals. A joint optimization objective incorporating regularization terms for both rank and activation was employed to manage model complexity and ensure training stability. Comparative experiments were conducted against standard PEFT LoRA and Houlsby Adapter baselines on a curated Amharic summarization dataset. Experimental results demonstrate that the proposed DyLoRA-Amharic and AdaptAmharic framework significantly outperforms the baselines across ROUGE, BLEU, and BERTScore metrics, achieving the lowest evaluation loss. Specifically, it improved ROUGE-L by 30.5% and BLEU by 52.4% over the strongest baseline. This superior performance validates the efficacy of a densely injected, dynamic, and regularized architecture, challenging the conventional emphasis on maximal sparsity in PEFT. While the framework utilizes a higher proportion of trainable parameters (13.42%) compared to the baselines, this trade-off is justified by the substantial performance gains. This research contributes to advancing PEFT methodologies for low-resource NLP, providing a robust and adaptable solution for Amharic text summarization. The findings lay a foundation for developing more efficient and effective LLMs for diverse and linguistically underrepresented communities.
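To make the described setup concrete, the sketch below shows how low-rank adapters can be injected into mT5-small with the Hugging Face peft library, together with a joint objective that adds rank and activation penalties to the task loss. It is a minimal illustration only: DyLoRA-Amharic and AdaptAmharic are not publicly released, so fixed-rank LoRA stands in for them, and the rank, target modules, and lambda weights are assumptions rather than the thesis configuration.

```python
# Minimal sketch of the baseline setup the thesis builds on: LoRA adapters
# injected into mT5-small for Amharic summarization. Fixed-rank LoRA is used
# here as a stand-in for DyLoRA-Amharic / AdaptAmharic; all hyperparameters
# below are illustrative assumptions, not the thesis configuration.
from transformers import MT5ForConditionalGeneration, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# Fixed-rank LoRA on the attention projections of every encoder/decoder block.
# DyLoRA-Amharic would instead adjust the effective rank during training from
# gradient signals; r=8 is only a placeholder.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q", "v"],   # attention projection names in mT5/T5
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the trainable-parameter fraction

# Joint objective sketch: task loss plus regularization terms for rank and
# activation, as described in the abstract. The lambda weights are assumed.
def joint_loss(task_loss, rank_penalty, activation_penalty,
               lambda_rank=0.01, lambda_act=0.01):
    return task_loss + lambda_rank * rank_penalty + lambda_act * activation_penalty
```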
Item: Enhancing Neural Machine Translation Through Incorporation of Unsupervised Language Understanding and Generation Techniques: The Case of English-Afaan Oromo Translation (2024-05)
Chala Bekabil; Fantahun Bogale (PhD)

Breaking down language barriers is a paramount pursuit in the realm of Artificial Intelligence. Machine Translation (MT), a domain within Natural Language Processing (NLP), holds the potential to bridge linguistic gaps and foster global communication. Enhancing cross-cultural communication through MT will be realized only if we succeed in developing accurate and adaptable techniques, which in turn demands adequate availability of linguistic resources. Unfortunately, under-resourced languages face challenges due to limited linguistic resources and sparse parallel data. Previous studies tried to solve this problem by using monolingual pre-training techniques. However, such studies rely solely on either Language Understanding (LU) or Language Generation (LG) techniques, resulting in skewed translation. This study aims to enhance translation outcomes beyond the capabilities of previous studies by marrying the concepts of LU and LG, thereby boosting the quality of MT in both directions. Our proposed model, the BERT-GPT incorporated Transformer, integrates the state-of-the-art (SOTA) language models BERT and GPT, trained on monolingual data, into the original Transformer model and demonstrates substantial improvements. Experimental results show that translation quality improves substantially, with the BLEU score rising from a baseline of 35.75 to 42.09 for English to Afaan Oromo translation and from 40.35 to 44.51 for Afaan Oromo to English translation on the test dataset. Notably, our model unveils a deep understanding of Afaan Oromo’s linguistic nuances, resulting in translations that are precise, contextually appropriate, and faithful to the original intent. By leveraging the power of unsupervised pre-training and incorporating unsupervised LU and LG techniques into the Transformer model, we pave the way for enhanced cross-cultural communication, deeper understanding, and inclusivity in our interconnected world.
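As a rough illustration of the underlying idea, the sketch below pairs a BERT-style encoder (language understanding) with a GPT-style decoder (language generation) in a single sequence-to-sequence model using the Hugging Face EncoderDecoderModel class. The thesis instead fuses the pretrained components into the original Transformer and pre-trains them on monolingual English and Afaan Oromo text, so the checkpoints, tokenizers, and toy sentence pair here are illustrative assumptions, not the author's configuration.

```python
# Minimal sketch: a BERT encoder paired with a GPT-2 decoder as one seq2seq
# model. This approximates the BERT-GPT idea with off-the-shelf checkpoints;
# it is NOT the thesis model, which fuses the pretrained components into the
# original Transformer and pre-trains on monolingual English/Afaan Oromo data.
from transformers import AutoTokenizer, EncoderDecoderModel

# Multilingual BERT for the source side, GPT-2 for the target side. Randomly
# initialized cross-attention is added to the decoder and must be fine-tuned
# on English-Afaan Oromo parallel data.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-cased", "gpt2"
)

src_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
tgt_tok = AutoTokenizer.from_pretrained("gpt2")
tgt_tok.pad_token = tgt_tok.eos_token

model.config.decoder_start_token_id = tgt_tok.bos_token_id
model.config.pad_token_id = tgt_tok.pad_token_id

# One forward pass on a toy English -> Afaan Oromo greeting pair.
src = src_tok("How are you?", return_tensors="pt")
tgt = tgt_tok("Akkam jirta?", return_tensors="pt")
out = model(input_ids=src.input_ids,
            attention_mask=src.attention_mask,
            labels=tgt.input_ids)
print(float(out.loss))  # cross-entropy loss minimized during fine-tuning
```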