Dependency-based Tigrinya Grammar Checker

Date

2021-03-04

Publisher

Addis Ababa University

Abstract

Grammar checking is the process of verifying that a sentence is grammatically correct by examining its syntax and morphology according to the rules of the language in use. Many efforts have been made to develop grammar checking systems for languages such as English, Arabic, Afaan Oromo, and Amharic. Because natural languages differ in their morphology and grammar, it is difficult to apply a grammar checker built for one language to another. Although an attempt was made to develop a grammar checker for Tigrinya, that checker cannot identify the relationships between words in a sentence, cannot parse complex and compound sentences, and produces possible sentence structures that are syntactically correct but semantically nonsensical. Most of these issues stem from the use of phrase-structure grammar notation in statistical and rule-based methods: the notation has a complicated representation, yet it allows only a limited level of grammar analysis. We propose that dependency-based grammar checking for the Tigrinya language can play a significant role in overcoming the problems of the existing grammar checker. The proposed system is composed of a text preprocessing module, a language dependency model, a dependency extraction module, and a grammar checking module. The text preprocessing module is responsible for cleaning the input text and converting its format; to do so, it includes a tokenizer, a part-of-speech tagger, and a morphological analyzer. The language dependency model is a pre-trained model used by the text preprocessing and dependency extraction modules. The dependency extraction module uses its built-in dependency parser to identify the root of the sentence (the main verb), the head-dependent pairs, and their corresponding relations. Finally, the grammar checking module contains a relation extractor and an agreement checker, which carry out grammatical relation extraction and grammatical agreement checking, respectively.
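As a rough illustration of the last two modules, the sketch below shows how an agreement checker might run over head-dependent pairs produced by a dependency parser. The `Token` class, the `nsubj` and `root` labels, the feature names (`Person`, `Number`, `Gender`), and the rule table are all assumptions made for illustration, loosely following Universal Dependencies conventions; the abstract does not specify the label set or the rules actually used. The token forms are placeholders, not Tigrinya words.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    idx: int      # 1-based position in the sentence
    form: str     # surface form (placeholder strings in this sketch)
    head: int     # index of the head token; 0 marks the root
    deprel: str   # dependency relation to the head
    feats: dict = field(default_factory=dict)  # morphological features


# Hypothetical rule table: for each relation, the features on which
# the dependent must agree with its head.
AGREEMENT_RULES = {"nsubj": ("Person", "Number", "Gender")}


def find_root(tokens):
    """Return the token whose head is 0 (the main verb in this sketch)."""
    return next(t for t in tokens if t.head == 0)


def extract_relations(tokens):
    """Return (head, dependent, relation) triples for all non-root tokens."""
    by_idx = {t.idx: t for t in tokens}
    return [(by_idx[t.head], t, t.deprel) for t in tokens if t.head != 0]


def check_agreement(tokens):
    """Report every feature on which a dependent disagrees with its head."""
    errors = []
    for head, dep, rel in extract_relations(tokens):
        for feat in AGREEMENT_RULES.get(rel, ()):
            head_val = head.feats.get(feat)
            dep_val = dep.feats.get(feat)
            # Only compare features that both tokens actually carry.
            if head_val is not None and dep_val is not None and head_val != dep_val:
                errors.append((dep.form, head.form, rel, feat))
    return errors


# A two-token sentence whose subject and verb disagree in number.
sentence = [
    Token(1, "SUBJ", head=2, deprel="nsubj",
          feats={"Person": "3", "Number": "Sing", "Gender": "Fem"}),
    Token(2, "VERB", head=0, deprel="root",
          feats={"Person": "3", "Number": "Plur", "Gender": "Fem"}),
]
errors = check_agreement(sentence)
print(errors)
```

Keeping the rules in a table keyed by relation means further agreement checks could be added without changing the checking loop, although whether the actual system is organized this way is not stated.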
The grammar checker was tested on a data set of 122 sentences: 74 grammatically correct and 48 grammatically incorrect. The system was evaluated by computing standard evaluation metrics over its predictions. According to the evaluation results, the system achieves a best precision of 92.46%, an accuracy of 92.09%, and a recall of 61.21%. We attribute the low recall score to having filled over half of the test dataset with grammatically incorrect sentences.
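For reference, precision, recall, and accuracy can be computed from gold and predicted labels as in the minimal sketch below. Treating "incorrect" as the positive class is an assumption, since the abstract does not say which class was used, and the toy labels are invented for illustration and are not the thesis data.

```python
def evaluate(gold, predicted, positive="incorrect"):
    """Compute precision, recall, and accuracy for a binary labeling task."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = sum(1 for g, p in pairs if g != positive and p != positive)
    return {
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Toy labels (hypothetical): one of the two incorrect sentences is missed.
gold = ["incorrect", "incorrect", "correct", "correct"]
predicted = ["incorrect", "correct", "correct", "correct"]
metrics = evaluate(gold, predicted)
print(metrics)
```

In this toy case every flagged sentence is truly incorrect, so precision is high, while the missed error halves recall; a gap of this kind between precision and recall is what the reported scores show.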

Keywords

Dependency Grammar, Dependency Parser, Grammar Checker
