Morphology Based Spell Checker for Kafi Noonoo Language

Date

10/3/2018

Publisher

Addis Ababa University

Abstract

A number of NLP tools are used to process text and other forms of human language. Among these tools, a spell checker is one that checks the validity of the words in a document. It is an NLP application needed in every word processor: it analyzes the input text for misspelled words and then provides possible suggestions for correcting each one. Spelling errors fall into two classes: non-word errors and real-word errors. A non-word error is a misspelled word that has no meaning in the language concerned. A real-word error is a word that has meaning in the language but is semantically and syntactically incorrect where it is used. Real-word errors are difficult to detect and to provide suggestions for, because doing so requires syntactic and semantic analysis of the text. Dictionary lookup and N-gram analysis are the most commonly used approaches to spelling error detection. Edit distance, noisy channel model, neural network, rule-based, N-gram, and phonetic-based techniques are applied to generate suggestions for error correction. In the area of spell checking, a lot of work has been done for English, Arabic, and Asian languages. Kafi Noonoo is one of the languages spoken in the south-western part of Ethiopia by the Kaffecho people. It is a morphologically rich language. No spell checker is currently available for analyzing text written in Kafi Noonoo, and this is the gap this work addresses. This thesis aims to design and implement a spell checker system for the Kafi Noonoo language. The proposed architecture contains four main components (tokenization, error detection, word suggestion, and error correction) together with two backend components. Dictionary lookup and morphology-based approaches are used to implement the spell checker. A prototype of the system was developed to test and evaluate the functionality and performance of the spell checker.
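As a rough illustration of the tokenization, error detection, and word suggestion steps, the following minimal Python sketch combines dictionary lookup with edit distance. It is not the thesis implementation: the function names (`tokenize`, `detect_errors`, `suggest`), the `max_distance` and `limit` parameters, and the toy lexicon are all assumptions made for illustration, the lexicon uses English placeholder words rather than Kafi Noonoo data, and the morphology-based analysis is not modelled at all.

```python
import re

def tokenize(text):
    """Split text into lowercase alphabetic tokens (a simplification)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

def detect_errors(tokens, lexicon):
    """Dictionary lookup: any token missing from the lexicon is flagged."""
    return [t for t in tokens if t not in lexicon]

def suggest(word, lexicon, max_distance=2, limit=5):
    """Return lexicon words within max_distance edits, closest first."""
    scored = [(edit_distance(word, w), w) for w in lexicon]
    ranked = sorted((d, w) for d, w in scored if d <= max_distance)
    return [w for _, w in ranked[:limit]]

# Hypothetical toy lexicon (English placeholders, not Kafi Noonoo words).
lexicon = {"spell", "checker", "word", "text"}
flagged = detect_errors(tokenize("Spel checker wrd"), lexicon)
suggestions = {w: suggest(w, lexicon) for w in flagged}
```

Note that plain dictionary lookup of this kind only catches non-word errors; a real-word error would pass the lookup unflagged.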
To test and evaluate the system, we used 2743 unique words collected from different sources. The accuracy of the spell checker was measured with three evaluation metrics: lexical recall, error recall, and precision. On these metrics the system achieved promising results: 95.91% lexical recall, 100% error recall, and 62.76% precision.
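The abstract does not define the three metrics. The sketch below assumes the definitions commonly used in spell checker evaluation: lexical recall is the share of valid words the checker accepts, error recall is the share of invalid words it flags, and precision is the share of flagged words that really are invalid. The function name, argument names, and the counts in the usage lines are hypothetical and are not taken from the thesis data.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def spell_checker_metrics(valid_accepted, valid_flagged,
                          invalid_flagged, invalid_accepted):
    """Compute the three metrics from four counts (assumed definitions).

    valid_accepted   -- correct words the checker accepted
    valid_flagged    -- correct words wrongly flagged as errors
    invalid_flagged  -- misspelled words correctly flagged
    invalid_accepted -- misspelled words the checker missed
    """
    return {
        "lexical_recall": ratio(valid_accepted,
                                valid_accepted + valid_flagged),
        "error_recall": ratio(invalid_flagged,
                              invalid_flagged + invalid_accepted),
        "precision": ratio(invalid_flagged,
                           invalid_flagged + valid_flagged),
    }

# Hypothetical counts chosen only to show the arithmetic.
metrics = spell_checker_metrics(valid_accepted=90, valid_flagged=10,
                                invalid_flagged=20, invalid_accepted=0)
```

Under these assumed definitions, error recall can reach 100% while precision stays much lower, because every valid word that is wrongly flagged counts against precision but not against error recall.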

Keywords

Error Detection, Error Correction, Spell Checker, Morphology, Non Word Error, Real Word Error, Kafi Noonoo, Edit Distance, Dictionary Lookup
