Collaborative News Filtering for Amharic: An Experiment Using Neural Networks
Date
2005-07
Publisher
Addis Ababa University
Abstract
Information Filtering (IF) is an area of research concerned with selecting only a few
relevant documents from a large collection in a dynamic flow of information. In particular,
filtering news items from a collection is of great importance so that a news reader can
easily find what he or she would like to read. Several research projects are underway to
implement such systems.
Collaborative filtering aims at learning predictive models of user preferences, interests or
behavior from community data, e.g., a database of available user preferences. It is
complementary to content-based filtering and retrieval that is mainly built on the
fundamental assumption that users are able to formulate queries that express their
interests or information needs in terms of intrinsic features of the items sought.
Many newspapers are published in Amharic, yet almost none of them use automated
systems for filtering news. As the number of news items grows, a large amount of
textual information accumulates, which makes it difficult for a reader to find the few
desired items in a collection. An experiment was therefore needed to
extract such news on the basis of collaborative interest.
The purpose of this research was, therefore, to explore the potential application of
Artificial Neural Networks (ANN) for filtering Amharic news based on preferences of
readers. The Backpropagation (BPN) and Self-Organizing Map (SOM) algorithms were
used to develop models for Amharic news filtering, with news items selected
from two popular Amharic newspapers. Reading preferences for these items,
collected from active readers of the newspapers, were used to develop the first model,
whereas the weighted term-by-document matrix of the news items in the sample was
used to develop the second model, which classifies the news items.
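As an illustration of what a weighted term-by-document matrix looks like, the sketch below builds one in Python. The abstract does not say which weighting scheme was used, so TF-IDF is an assumption here, and the English placeholder tokens and the function name term_document_matrix are hypothetical stand-ins for tokenized Amharic text.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Build a TF-IDF weighted term-by-document matrix.

    docs: list of token lists (tokenization is assumed to be done elsewhere).
    Returns (terms, matrix), where matrix[i][j] is the weight of
    terms[i] in docs[j].
    """
    terms = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    counts = [Counter(doc) for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = {t: sum(1 for c in counts if t in c) for t in terms}
    matrix = []
    for t in terms:
        idf = math.log(n_docs / df[t])
        # Term frequency is normalized by document length.
        row = [(c[t] / len(doc)) * idf if doc else 0.0
               for c, doc in zip(counts, docs)]
        matrix.append(row)
    return terms, matrix


# Placeholder tokens stand in for Amharic words (assumption).
docs = [
    ["sport", "football", "sport"],
    ["economy", "market"],
    ["sport", "economy"],
]
terms, matrix = term_document_matrix(docs)
```

Each column of the matrix is the feature vector of one news item, which is the kind of input a classifier such as a SOM can be trained on.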
The experiment was carried out in two parts: developing a model for predicting readers'
preferences for news items, and classifying the news items in the sample into
predefined categories. The results showed that ANNs can be used to model user
preferences for news items written in Amharic. The MATLAB Neural Network Toolbox was
used to develop both models.
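The preference model can be pictured as a small feed-forward network trained with backpropagation. The Python sketch below is not the MATLAB code used in the study; the network size, learning rate, feature encoding and toy data are all assumptions chosen only to show the forward pass, the error deltas and the weight updates.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network with plain backpropagation.

    samples: list of (inputs, target) pairs with target 0 or 1.
    Returns (w_hidden, w_out, losses); each weight row ends with a bias.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            # Forward pass.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
            total += 0.5 * (target - y) ** 2
            # Backward pass: deltas for squared error with sigmoid units.
            d_out = (y - target) * y * (1.0 - y)
            d_hid = [d_out * w_out[j] * h[j] * (1.0 - h[j])
                     for j in range(n_hidden)]
            # Gradient-descent weight updates.
            for j in range(n_hidden + 1):
                w_out[j] -= lr * d_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] -= lr * d_hid[j] * xb[i]
        losses.append(total)
    return w_hidden, w_out, losses


def predict(w_hidden, w_out, x):
    """Return the network output in (0, 1) for input vector x."""
    xb = list(x) + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden] + [1.0]
    return sigmoid(sum(w * v for w, v in zip(w_out, hb)))


# Toy data (assumption): two binary item features; the reader likes an
# item (target 1) exactly when the first feature is present.
samples = [([1, 0], 1), ([1, 1], 1), ([0, 1], 0), ([0, 0], 0)]
w_hidden, w_out, losses = train(samples)
liked = predict(w_hidden, w_out, [1, 0]) > 0.5
```

An output above 0.5 is read here as "the reader would like this item"; the threshold is likewise an illustrative choice.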
The results indicated that Model 1, built from the preference list, could predict
83.3% of the preferences in the training set and 79.8% of the preferences in the test set.
That is, a news item is likely to satisfy the readers in the test set 79.8% of the time.
Model 2, the Self-Organizing Map (SOM) model, was trained to classify news
items into the news categories. The best model classified the items correctly 76.5% of
the time, and 72.9% of the news items in the test set were correctly classified into their
respective categories. However, because neural networks learn from large numbers of
examples, further, more extensive research is recommended.
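To make the SOM step concrete, here is a minimal Python sketch of a one-dimensional self-organizing map used as a classifier. A SOM is trained without labels, so this sketch assumes each map unit is afterwards labeled by majority vote of the training items it wins; the abstract does not state how units were tied to categories, and the map size, decay schedule, toy vectors and category names are all hypothetical.

```python
import math
import random
from collections import Counter


def bmu(weights, x):
    """Index of the best-matching unit (closest weight vector) for x."""
    dists = [sum((w - v) ** 2 for w, v in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))


def train_som(data, n_units=4, epochs=200, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D self-organizing map and return the unit weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius shrink linearly over time.
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.01)
        for x in data:
            winner = bmu(weights, x)
            for u in range(n_units):
                # Gaussian neighbourhood on the 1-D grid of units.
                influence = math.exp(-((u - winner) ** 2) / (2 * radius ** 2))
                for i in range(dim):
                    weights[u][i] += lr * influence * (x[i] - weights[u][i])
    return weights


def label_units(weights, data, categories):
    """Label each unit with the most common category among items it wins."""
    votes = {}
    for x, cat in zip(data, categories):
        votes.setdefault(bmu(weights, x), Counter())[cat] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in votes.items()}


def classify(weights, labels, x):
    """Return the category of x's best-matching unit, or None if unlabeled."""
    return labels.get(bmu(weights, x))


# Toy two-dimensional item vectors (assumption); real inputs would be
# columns of the weighted term-by-document matrix.
data = [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2],
        [0.1, 0.9], [0.0, 1.0], [0.2, 0.8]]
categories = ["sport"] * 3 + ["economy"] * 3
weights = train_som(data)
labels = label_units(weights, data, categories)
predicted = classify(weights, labels, [0.95, 0.05])
```

Classification accuracy on a test set would then be the share of test items whose predicted category matches their true category.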
Keywords
Information Science