Abebe Ermias (Ato)
Kebede Gebeyehu
2020-06-03
2023-11-18
2020-06-03
2023-11-18
2009-09
http://etd.aau.edu.et/handle/12345678/21414
Automatic understanding of natural languages requires a set of language processing tools. A POS tagger, which assigns the proper part of speech (noun, verb, adjective, etc.) to the words in a sentence, is one of these tools. This study investigates the possibility of applying a decision tree based POS tagger to Amharic. The tagger was developed using the J48 decision tree classifier algorithm, which is Weka's implementation of the C4.5 algorithm. In the process, a corpus developed by the ELRC annotation team was used to get the required data for training and testing the models. The dataset comprises 1065 news documents (210,000 words). A sample of some 800 sentences was selected and used for model development and evaluation. The dataset was processed in line with the requirements of the Weka data mining tool. To support the decision tree classification models, a table that contains contextual and orthographic information was constructed semi-automatically and used as the training and testing dataset. The tags of the right and left neighboring words of each word are used as contextual information. Moreover, orthographic information about the word, such as the first and last character, the prefix and suffix, the existence of a numeric digit within the word, and so on, is included in the table to provide useful information about the word to be tagged. Performance tests were conducted at various stages using the 10-fold cross-validation test option. Experimental results show that only the tags of the two successive left and right words provide useful contextual information; contextual information beyond two words does not provide useful information but rather noise. In the end, an overall accuracy of 84.9%, including ambiguous and unknown words, was obtained using the 10-fold cross-validation test option.
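The feature table described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the `NONE` padding tag, the two-character affix length, and the sample tokens and tags are all assumptions made for illustration. Each row holds the tags of up to two left and two right neighbors (taken from an already annotated sentence) plus the orthographic cues the abstract lists.

```python
from typing import Dict, List, Tuple

PAD = "NONE"  # assumed placeholder tag for positions outside the sentence


def token_features(sentence: List[Tuple[str, str]], i: int,
                   window: int = 2, affix_len: int = 2) -> Dict[str, str]:
    """Build contextual and orthographic features for the i-th token.

    `sentence` is a list of (word, tag) pairs from an annotated corpus.
    `window` and `affix_len` are illustrative defaults, not values
    confirmed by the source. Words are assumed to be non-empty.
    """
    word, _ = sentence[i]
    feats: Dict[str, str] = {}
    # Contextual information: tags of the left and right neighbors.
    for k in range(1, window + 1):
        left, right = i - k, i + k
        feats[f"tag-{k}"] = sentence[left][1] if left >= 0 else PAD
        feats[f"tag+{k}"] = sentence[right][1] if right < len(sentence) else PAD
    # Orthographic information about the word itself.
    feats["first_char"] = word[0]
    feats["last_char"] = word[-1]
    feats["prefix"] = word[:affix_len]
    feats["suffix"] = word[-affix_len:]
    feats["has_digit"] = str(any(ch.isdigit() for ch in word))
    return feats


def build_table(sentence: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Return one row per token: its features plus the tag to be learned."""
    return [dict(token_features(sentence, i), tag=sentence[i][1])
            for i in range(len(sentence))]


# Hypothetical tokens and tags, used only to show the row layout.
sample = [("alpha", "N"), ("2009", "NUM"), ("gamma", "V")]
rows = build_table(sample)
```

Rows of this shape could then be exported to whatever format the classifier expects; the export step is left out here because the source does not describe it.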
Even though the accuracy obtained in this study is encouraging, further study to improve the accuracy so as to reach implementation level is recommended.
en
Information Science
The Application of Decision Tree for Part of Speech (POS) Tagging for Amharic
Thesis