Dependency-based Tigrinya Grammar Checker
Date
2021-03-04
Authors
Publisher
Addis Ababa University
Abstract
Grammar checking is the process of verifying the grammatical correctness of a sentence by checking
its syntax and morphology against the rules of the language. For languages such as English,
Arabic, Afaan Oromo, and Amharic, many efforts have been made to develop grammar checking
systems. Because natural languages differ in their morphology and grammar, it is difficult to apply
a grammar checker of one language to another. Although an attempt was made to develop a
grammar checker for Tigrinya, that checker cannot identify the relationships between
words in a sentence, cannot parse complex and compound sentences, and produces candidate sentence
structures that are syntactically correct but semantically nonsensical. Most of these issues stem
from the use of phrase-structure grammar notation in statistical and rule-based methods: the
notation has a complicated representation yet allows only a limited level of grammar analysis.
We propose dependency-based grammar checking for the Tigrinya language to overcome
the problems of the existing grammar checker. The proposed system
is composed of a text preprocessing module, a language dependency model, a dependency
extraction module, and a grammar checking module. The text preprocessing module cleans the
input text and converts its format using a tokenizer, a part-of-speech tagger,
and a morphological analyzer. The dependency model of the language is a pre-trained model
used by the text preprocessing and dependency extraction modules. The dependency
extraction module uses its built-in dependency parser to identify the root of the sentence
(the main verb), the head-dependent pairs, and their corresponding relations. Finally, the grammar
checking module contains relation extractor and agreement checker components, which carry out
grammatical relation extraction and grammatical agreement checking.
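
As a rough illustration of the final two stages, the sketch below checks agreement over head-dependent pairs of a parsed sentence. It is not the system described in this abstract: the `Token` structure, the `nsubj` relation label, the feature names, and the English placeholder glosses are all assumptions made for the example, and a real checker would take its parse from the pre-trained dependency model.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One parsed word (hypothetical structure, not the thesis's format)."""
    index: int   # 1-based position in the sentence
    form: str    # surface form (English placeholder glosses here)
    head: int    # index of the head token; 0 marks the root
    deprel: str  # dependency relation to the head (assumed label set)
    feats: dict = field(default_factory=dict)  # morphological features


# Features assumed to be compared between a subject and its verb.
AGREEMENT_FEATURES = ("person", "number", "gender")


def find_root(tokens):
    """Return the root token (the main verb), or None if no root is marked."""
    for tok in tokens:
        if tok.head == 0:
            return tok
    return None


def check_subject_agreement(tokens):
    """Return (dependent, head, feature) triples for each feature mismatch."""
    by_index = {tok.index: tok for tok in tokens}
    errors = []
    for tok in tokens:
        if tok.deprel != "nsubj":  # only subject relations in this sketch
            continue
        head = by_index.get(tok.head)
        if head is None:
            continue
        for feat in AGREEMENT_FEATURES:
            dep_value = tok.feats.get(feat)
            head_value = head.feats.get(feat)
            # Compare only features that both words actually carry.
            if dep_value is not None and head_value is not None \
                    and dep_value != head_value:
                errors.append((tok.form, head.form, feat))
    return errors


# A two-word placeholder sentence with a deliberate gender mismatch.
sentence = [
    Token(1, "she", 2, "nsubj",
          {"person": 3, "number": "sg", "gender": "f"}),
    Token(2, "came", 0, "root",
          {"person": 3, "number": "sg", "gender": "m"}),
]
errors = check_subject_agreement(sentence)
```

Here `errors` holds one triple reporting the gender mismatch between the subject and the root verb. Working on head-dependent pairs means the check does not depend on word order or on how far apart the two words are.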
A test data set of 122 sentences, comprising 74 grammatically correct and 48 grammatically
incorrect sentences, was used to test the grammar checker system. Based on the prediction
results, the system was evaluated using precision, accuracy, and recall. Its best results are
a precision of 92.46%, an accuracy of 92.09%, and a recall of 61.21%. We attribute the low
recall to the large share of grammatically incorrect sentences (48 of 122) in the test dataset.
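
For reference, the three reported metrics can be computed from confusion-matrix counts as in the minimal sketch below. The counts are invented for illustration and are not the thesis's results; which class is treated as "positive" (here, a sentence flagged as grammatically incorrect) is also an assumption, since the abstract does not say.

```python
def evaluate(tp, fp, tn, fn):
    """Compute precision, recall, and accuracy from confusion counts.

    tp: incorrect sentences flagged as incorrect (assumed positive class)
    fp: correct sentences wrongly flagged as incorrect
    tn: correct sentences accepted as correct
    fn: incorrect sentences that were missed
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Illustrative counts only (20 sentences), not taken from the thesis.
metrics = evaluate(tp=8, fp=2, tn=9, fn=1)
```

Recall depends only on the positive class (`tp` and `fn`), so it can be low even when precision and accuracy are high.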
Keywords
Dependency Grammar, Dependency Parser, Grammar Checker