A Morphosyntactic Tagset for the Annotation of Texts in Tigrinya

Woldemariam, Tsegay

dc.contributor.advisor	Liyew, Zelalem (PhD)
dc.contributor.author	Woldemariam, Tsegay
dc.date.accessioned	2019-02-11T15:58:25Z
dc.date.accessioned	2023-11-08T04:33:42Z
dc.date.available	2019-02-11T15:58:25Z
dc.date.available	2023-11-08T04:33:42Z
dc.date.issued	2013-06
dc.description.abstract	The main purpose of this thesis is to develop a morphosyntactic tagset for the annotation of texts in Tigrinya, an Ethio-Semitic language with about seven to nine million speakers in Ethiopia and Eritrea (CSA, 2007; CIA, 2012; http://en.wikipedia.org/wiki/Tigrinya_language#cite_ref-2). In terms of existing research, there are almost no Natural Language Processing (NLP) resources for Tigrinya. The researcher considers a comprehensive morphosyntactic tagset a fortunate starting point for Tigrinya, because such a tagset is the foundation for many NLP applications. We examined the morphosyntactic features of Tigrinya words and assigned tags applicable to these words in Tigrinya texts. The thesis focuses only on the development of a morphosyntactic tagset based on the morphological and morphosyntactic features of Tigrinya. The resulting tagset has 18 coarse-grained tags at the higher level and 105 fine-grained tags at the lower level; extending it to still finer-grained features yields 139 tags. We recommend that researchers use the 105 tags in their applications unless their purpose calls for the 18 coarse-grained major-category tags or for the very fine-grained 139 tags or beyond. Morphosyntactic tagsets add an important level of linguistic information to a document. Such annotation is useful as a preprocessing step for parsing and, above all, for developing a POS tagger, which is the basis for many higher-level NLP applications. The beneficiaries of this research are students, researchers and professionals, such as computational linguists and computer scientists, who work on NLP applications like speech recognition, text-to-speech, natural language parsing, information retrieval, lexicography and machine translation.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/16351
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Ethio-Semitic language having about	en_US
dc.subject	seven to nine million speakers in Ethiopia	en_US
dc.title	A Morphosyntactic Tagset for the Annotation of Texts in Tigrinya	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Tsegay Woldemariam.pdf
Size: 1.94 MB
Format: Adobe Portable Document Format

Collections

Linguistics