AAU Institutional Repository

Entropy Estimation and Entropy Based Encoding of Written Afaan Oromo for its Efficient Digital Transmission and Storage

dc.contributor.advisor Dereje, Hailemariam (PhD)
dc.contributor.author Kalkidan, Dejene
dc.date.accessioned 2022-01-26T09:26:59Z
dc.date.available 2022-01-26T09:26:59Z
dc.date.issued 2021-02
dc.identifier.uri http://etd.aau.edu.et/handle/123456789/29703
dc.description.abstract According to the Ethiopian population census, the Oromo language is estimated to be spoken by 36.4% of the local population. The language is also spoken outside Ethiopia, for instance in a small part of Kenya. Taking this into account, the language is estimated to be spoken by around fifty million people. In addition to the spoken form, a considerable portion of the language's speakers can understand its written form, known as Qubee. The introduction of Qubee in the mid-nineties opened doors for its use in modern communication systems. Leaving this argument aside, from the viewpoint of information theory and communication channels, both symbol utilization schemes are found to be inefficient. This is because the fixed-length encoding mechanisms used to represent Latin and Amharic symbols, 8-bit ASCII and UTF-16, poorly model written natural language. With the expected increase in demand for the language in telecom services in mind, this thesis mainly aims at estimating the entropy of the Oromo language. The estimate sets the optimum number of bits per symbol needed to efficiently transmit written Oromo in communication systems. To achieve this objective, we modeled the source, i.e., written Oromo, as an Nth-order Markov chain random process. Based on this model, we studied the distribution of symbols in ten literary works written in Oromo. The study reveals that the language can be transmitted using 4.31 bits/symbol when modeled as a first-order Markov chain source. The zero-crossing entropy of the source was estimated to occur on average at N = 19.5, which gave an entropy estimate of 0.85 bits/symbol with a redundancy of 89.36%. Additionally, we applied two entropy-based compression algorithms, namely Huffman and arithmetic coding, to test the validity of our estimation. The Huffman algorithm compressed our sample corpora by 42.17% to 64.88% on average for N = 1 to 5. These compression results confirm the results of our Nth-order estimation of the language's entropy by approaching their theoretical limits. en_US
dc.language.iso en_US en_US
dc.publisher Addis Ababa University en_US
dc.subject Entropy en_US
dc.subject Encoding en_US
dc.subject Oromo Language en_US
dc.subject Language en_US
dc.title Entropy Estimation and Entropy Based Encoding of Written Afaan Oromo for its Efficient Digital Transmission and Storage en_US
dc.type Thesis en_US
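
The abstract describes estimating entropy by modeling written text as an Nth-order Markov source and then reporting redundancy. The following is a minimal sketch of that kind of estimate, not the thesis's actual code. The function names, the `order` parameter (the number of preceding symbols used as context, which may not match the thesis's own indexing of N), and the 8-bit baseline in `redundancy` are assumptions made for illustration. The record does not state the baseline, although 1 - 0.85/8 is roughly 0.894, which is close to the reported 89.36%.

```python
import math
from collections import Counter


def conditional_entropy(text: str, order: int) -> float:
    """Empirical entropy (bits/symbol) of the next symbol given the
    `order` preceding symbols. order=0 is plain symbol-frequency entropy.

    Illustrative helper only; not taken from the thesis.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    n = len(text) - order  # number of (context, symbol) observations
    if n <= 0:
        raise ValueError("text is too short for the requested order")

    # Count each context together with the symbol that follows it,
    # and each context on its own, over the same positions.
    joint = Counter(text[i:i + order + 1] for i in range(n))
    context = Counter(text[i:i + order] for i in range(n))

    h = 0.0
    for gram, count in joint.items():
        p_joint = count / n                     # P(context, symbol)
        p_cond = count / context[gram[:order]]  # P(symbol | context)
        h -= p_joint * math.log2(p_cond)
    return h


def redundancy(entropy_bits: float, bits_per_symbol: float = 8.0) -> float:
    """Fraction of a fixed-length code that is redundant.

    The 8-bit default is an assumed baseline, not stated in the record.
    """
    return 1.0 - entropy_bits / bits_per_symbol


if __name__ == "__main__":
    sample = "afaan oromoo afaan guddaa dha"  # hypothetical sample text
    for k in range(3):
        h = conditional_entropy(sample, k)
        print(f"order={k}: {h:.3f} bits/symbol, redundancy={redundancy(h):.2%}")
```

On a short sample like the one above, higher orders quickly drive the estimate toward zero because most contexts occur only once. Meaningful estimates need a corpus that is large relative to the number of distinct contexts.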

