The Application of Machine Learning Technique (Naïve Bayes) For Automatic Text Summarization [The Case of Amharic News Texts]

dc.contributor.advisor: Biru, Tesfaye (PhD)
dc.contributor.author: Andargie, Teferi
dc.date.accessioned: 2020-06-23T07:37:21Z
dc.date.accessioned: 2023-11-18T12:46:18Z
dc.date.available: 2020-06-23T07:37:21Z
dc.date.available: 2023-11-18T12:46:18Z
dc.date.issued: 2005-06
dc.description.abstract: This study presents an approach to automatic summarization of Amharic news texts by extracting sentences from a given document. The objective of this study is to investigate the application of a machine learning technique (the naive Bayes method) to the automatic summarization of Amharic news items. The focus is on how to use the naive Bayes classifier to extract sentences, i.e., on how to train naive Bayes to classify sentences from Amharic news texts. First, each sentence is represented by a set of predefined features (attributes): the location of the sentence in the document, title words occurring in the sentence, and cue words occurring in the sentence, which Edmundson (1969) found to be good indicators for producing an optimum summary of scientific papers, together with the thematic words occurring in the sentence. Then the naive Bayes algorithm is trained to classify sentences as "a summary" or "not a summary" based on the feature vectors. For the purpose of this study, 480 Amharic news articles are used. The results of the experiments are evaluated using 10-fold cross-validation. The experiments show that, among individual features, the location feature gives the best results in the classification of sentences. Combinations of features that include the location feature give better results than combinations that exclude it. Based on the feature values estimated by the training program for the combination of all the features, a prototype summarizer is developed that extracts sentences to a desired compression level. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/12345678/21808
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Information Science [en_US]
dc.title: The Application of Machine Learning Technique (Naïve Bayes) For Automatic Text Summarization [The Case of Amharic News Texts] [en_US]
dc.type: Thesis [en_US]
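
The pipeline outlined in the abstract (binary sentence features, a naive Bayes classifier, and extraction to a compression level) can be illustrated with a minimal sketch. This is not the thesis's implementation. The binary feature encoding, the Laplace smoothing parameter `alpha`, and the function names `train`, `summary_probability`, and `summarize` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical binary encoding of the four features named in the abstract.
FEATURES = ("location", "title", "cue", "thematic")


def train(samples, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of (feature_dict, label); label is True for a summary sentence.
    alpha: Laplace smoothing constant (an assumption, not from the thesis).
    """
    class_counts = {True: 0, False: 0}
    feat_counts = {True: defaultdict(int), False: defaultdict(int)}
    for feats, label in samples:
        class_counts[label] += 1
        for name in FEATURES:
            if feats[name]:
                feat_counts[label][name] += 1
    total = sum(class_counts.values())
    model = {}
    for label in (True, False):
        prior = (class_counts[label] + alpha) / (total + 2 * alpha)
        likelihood = {
            name: (feat_counts[label][name] + alpha)
            / (class_counts[label] + 2 * alpha)
            for name in FEATURES
        }
        model[label] = (prior, likelihood)
    return model


def summary_probability(model, feats):
    """Posterior probability that a sentence belongs to the summary class."""
    log_scores = {}
    for label, (prior, likelihood) in model.items():
        score = math.log(prior)
        for name in FEATURES:
            p = likelihood[name]
            score += math.log(p if feats[name] else 1.0 - p)
        log_scores[label] = score
    # Normalize in log space to avoid underflow.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])


def summarize(model, sentence_feats, compression=0.3):
    """Return indices of the top-scoring sentences, in document order."""
    n = max(1, round(len(sentence_feats) * compression))
    ranked = sorted(
        range(len(sentence_feats)),
        key=lambda i: summary_probability(model, sentence_feats[i]),
        reverse=True,
    )
    return sorted(ranked[:n])
```

Ranking by posterior probability and keeping the top fraction is one simple way to reach a target compression level. The abstract does not say how the prototype selects sentences, so that step is also an assumption.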

Files

Original bundle
Name: Teferi Andargie.pdf
Size: 19.79 MB
Format: Adobe Portable Document Format