Designing English-Kambaatissa Bilingual Electronic Dictionary: Using Parallel Corpora
Date
2013-06
Publisher
Addis Ababa University
Abstract
The main aim of this study is to design an English-Kambaatissata bilingual electronic
dictionary using English-Kambaatissata parallel corpora. A literature review was
carried out on Kambaatissata phonology and morphosyntax. Then, based on this
knowledge of Kambaatissata morphology, the study adopted a statistical machine
translation approach. To this end, IBM Model 1 was used: a word alignment model that
is widely used with parallel bilingual corpora and that implements the
expectation-maximization algorithm.
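
The word alignment step described above can be illustrated with a minimal sketch of IBM Model 1 trained by expectation-maximization. This is not the thesis's implementation: the function name `train_ibm_model1`, the iteration count, and the toy corpus are assumptions made for illustration, and the target-side tokens ("T", "H", "B", "A") are placeholders rather than real Kambaatissata words. The NULL source token used in fuller formulations is omitted for brevity.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t(f | e) with EM.

    pairs: list of (source_tokens, target_tokens) sentence pairs.
    Returns a dict-like table keyed by (target_word, source_word).
    """
    tgt_vocab = {f for _, fs in pairs for f in fs}
    # Uniform initialisation: every target word is equally likely.
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected count of (f, e) links
        total = defaultdict(float)   # expected count of e being linked

        # E-step: distribute each target word over the source words
        # of its sentence in proportion to the current t(f | e).
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c

        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float,
            {(f, e): count[(f, e)] / total[e] for (f, e) in count},
        )
    return t


# Toy parallel corpus (hypothetical placeholder tokens on the target side).
toy_pairs = [
    (["the", "house"], ["T", "H"]),
    (["the", "book"], ["T", "B"]),
    (["a", "book"], ["A", "B"]),
]
table = train_ibm_model1(toy_pairs, iterations=10)
```

After a few iterations the probability mass for each source word concentrates on the target token it consistently co-occurs with, which is the property the dictionary extraction relies on.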
In total, 1194 Kambaatissata and English sentences were drawn from parallel
raw text. The raw texts were collected from the English-Kambaatissata
Constitution of the Federal Democratic Republic of Ethiopia (1995), a training material for
primary school mother tongue teachers, and a children's Bible story book. A
database of word alignment probabilities was developed from the aligned
sentences. These probabilities were used to select translations in both the
English-Kambaatissata and the Kambaatissata-English direction. Once the translation
equivalents of English and Kambaatissata terms were obtained, the parts of speech and
glosses of the English terms were extracted from the English WordNet.
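
As a rough illustration of how a table of alignment probabilities can be turned into dictionary entries, the sketch below keeps the highest-probability candidate for each headword. The abstract does not state the exact selection rule, so taking the maximum (with an optional cut-off `min_prob`) is an assumption, as are the function name `select_translations` and the placeholder tokens and probabilities in the example table.

```python
def select_translations(prob, min_prob=0.0):
    """Pick the most probable translation for each headword.

    prob: mapping (candidate_word, headword) -> probability.
    Returns: mapping headword -> (best_candidate, probability).
    """
    best = {}
    for (candidate, headword), p in prob.items():
        if p < min_prob:
            continue  # drop weak candidates (assumed cut-off)
        if headword not in best or p > best[headword][1]:
            best[headword] = (candidate, p)
    return best


# Hypothetical probabilities; "K1".."K3" stand in for target-language words.
example_prob = {
    ("K1", "house"): 0.82,
    ("K2", "house"): 0.18,
    ("K2", "book"): 0.40,
    ("K3", "book"): 0.60,
}
entries = select_translations(example_prob, min_prob=0.2)
```

For the reverse direction, the same function could be applied to a second table estimated with the roles of the two languages swapped.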
The accuracy of the designed prototype was tested using 273, or 20%, of the
retrieved dictionary terms. Based on manual evaluation, the result shows that
61.5% of the translations were correct.
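
The reported figures can be cross-checked with simple arithmetic. The total number of retrieved terms and the number of correct translations are not stated in the abstract; the values below are only what the reported percentages imply, assuming the 20% and 61.5% figures are rounded.

```python
sample_size = 273          # evaluated terms (reported)
sample_fraction = 0.20     # share of all retrieved terms (reported)
reported_accuracy = 0.615  # share judged correct (reported)

# Implied values, not stated in the source.
implied_total_terms = round(sample_size / sample_fraction)
implied_correct = round(sample_size * reported_accuracy)

# Recompute the accuracy from the implied count as a consistency check.
recomputed_accuracy = round(100 * implied_correct / sample_size, 1)
```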
Key Words: Bilingual electronic dictionary, parallel corpora, design,
alignment and enhancement
Keywords
Bilingual electronic dictionary, parallel corpora, design, alignment, enhancement