Deep Learning-Based Amharic Keyword Extraction for Open-Source Intelligence Analysis
Date
2025-06
Publisher
Addis Ababa University
Abstract
In today's digital age, the problem of information overload has become a pressing concern,
especially in the field of OSINT (Open-Source Intelligence). With vast amounts of data available
on the internet, it is challenging to separate relevant and credible information from the noise. An
OSINT approach involves gathering intelligence from publicly available sources. However, with
the increasing volume and diversity of online content, it has become difficult to extract actionable
intelligence from enormous amounts of data. Deep learning can help identify patterns in large
amounts of data and automate decision-making processes. Despite these advances, the problem of
information overload persists.
One approach to addressing this problem is to develop an effective deep learning model that extracts
the relevant information. Combining machine learning and deep learning algorithms with natural
language processing (NLP) can help automatically classify and categorize information. The
purpose of this study is to design a deep learning model for keyword extraction that draws
intelligence from large volumes of Amharic text.
Keyword extraction is the process of identifying important words or phrases that capture the
essence of a given piece of text. This task is critical for many natural language processing
applications, including document summarization, information retrieval, and search engine
optimization. In recent years, deep learning algorithms have shown great promise in this field,
largely due to their ability to learn from vast amounts of data and extract complex patterns.
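As a minimal illustration of the task itself, and not of the deep learning method proposed in this study, the sketch below ranks the words of one document by TF-IDF, a classical statistical baseline. The function names `tokenize` and `tfidf_keywords`, the smoothed IDF formula, and the toy English documents are assumptions made for illustration; real Amharic text would need proper tokenization and morphological normalization beyond the simple punctuation stripping shown here.

```python
import math
from collections import Counter

# Latin punctuation plus the common Ethiopic marks (wordspace, full stop, comma, etc.).
PUNCT = ".,;:!?()\"'፡።፣፤፥፦፧"


def tokenize(text):
    """Split on whitespace and strip punctuation from both ends of each token."""
    tokens = []
    for raw in text.split():
        tok = raw.strip(PUNCT)
        if tok:
            tokens.append(tok)
    return tokens


def tfidf_keywords(docs, doc_index, top_k=3):
    """Return the top_k terms of docs[doc_index], ranked by TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    tf = Counter(tokenized[doc_index])
    total = sum(tf.values())

    scores = {}
    for term, count in tf.items():
        # Smoothed inverse document frequency (an illustrative choice).
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
        scores[term] = (count / total) * idf

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


docs = [
    "the border report mentions border security",
    "the weather report is sunny",
    "the market report is stable",
]
print(tfidf_keywords(docs, 0, top_k=2))
```

In this toy corpus, "border" ranks first for the first document because it occurs twice there and in no other document, while "the" and "report" are penalized for appearing everywhere. A learned model replaces this hand-built score with features learned from data.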
In this paper, we propose a novel keyword extraction approach based on deep learning methods.
We explore different algorithms, such as recurrent neural networks (RNNs) and transformer
models, to learn the relevant features from the input text and predict the most salient keywords.
We evaluate our proposed method on datasets containing Amharic content, and show that it
outperforms state-of-the-art methods. Our results suggest that deep learning-based approaches
have the potential to significantly improve keyword extraction accuracy and scalability in real-world
applications.
Keywords
Amharic, Keyword Extraction, Deep Learning, OSINT, Bi-LSTM, BART, Natural Language Processing, Amharic Text Analysis, Information Retrieval