Deep Learning-Based Amharic Keyword Extraction for Open-Source Intelligence Analysis

Date

2025-06

Publisher

Addis Ababa University

Abstract

In today's digital age, information overload has become a pressing concern, especially in Open-Source Intelligence (OSINT), which involves gathering intelligence from publicly available sources. With vast amounts of data available on the internet, it is challenging to separate relevant and credible information from noise, and the increasing volume and diversity of online content makes it difficult to extract actionable intelligence. Deep learning can help identify patterns in large amounts of data and automate decision-making processes, yet the problem of information overload persists. One approach to addressing it is to develop an effective deep learning model that extracts the relevant information. Combining machine learning and deep learning algorithms with natural language processing (NLP) can help classify and categorize information automatically. The purpose of this study is to design a deep learning model for keyword extraction that extracts intelligence from a large Amharic dataset. Keyword extraction is the process of identifying the important words or phrases that capture the essence of a given piece of text. This task is critical for many NLP applications, including document summarization, information retrieval, and search engine optimization. In recent years, deep learning algorithms have shown great promise in this field, largely due to their ability to learn from vast amounts of data and extract complex patterns. In this paper, we propose a novel keyword extraction approach based on deep learning methods. We explore different algorithms, such as recurrent neural networks (RNNs) and transformer models, to learn the relevant features from the input text and predict the most salient keywords.
We evaluate our proposed method on datasets containing Amharic content and show that it outperforms state-of-the-art methods. Our results suggest that deep learning-based approaches have the potential to significantly improve keyword extraction accuracy and scalability in real-world applications.

Keywords

Amharic, Keyword Extraction, Deep Learning, OSINT, Bi-LSTM, BART, Natural Language Processing, Amharic Text Analysis, Information Retrieval
