Enhanced Robustness for Speech Emotion Recognition: Combining Acoustic and Linguistic Information
dc.contributor.advisor | Midekso, Dida (PhD) | |
dc.contributor.author | Sinishaw, Hana | |
dc.date.accessioned | 2019-05-08T06:58:39Z | |
dc.date.accessioned | 2023-11-29T04:06:03Z | |
dc.date.available | 2019-05-08T06:58:39Z | |
dc.date.available | 2023-11-29T04:06:03Z | |
dc.date.issued | 2017-10-05 | |
dc.description.abstract | Affective computing is the area of Artificial Intelligence that focuses on the design and development of intelligent devices which can perceive, process and synthesize human emotions. Humans can interpret emotions in a number of different ways, for example by processing spoken utterances, non-verbal cues, facial expressions and written communication. Changes in our nervous system indirectly alter spoken utterances, which makes it possible for people to perceive how others feel by listening to them speak. Machines can also interpret these changes through the extraction of speech features. The field of speech emotion recognition (SER) takes advantage of this capability and has produced many approaches to recognizing affect in spoken utterances. The majority of state-of-the-art SER systems employ complex statistical algorithms to model the relationship between acoustic parameters extracted from spoken language and the expressed emotion. Studies also show that phrases, word senses and syntactic relations conveying the linguistic attributes of a language play an important role in improving prediction rates. Our research focuses on this problem of recognizing affect in spoken utterances and contributes linguistic knowledge to state-of-the-art systems to enhance their efficiency, rather than relying on speech utterances alone. In this work, a speech emotion recognition system is developed for the Amharic language based on acoustic and linguistic features. The classification performance depends on the extracted features. We used a baseline set of 384 acoustic features; for linguistic analysis of the text we used keyword spotting, negation handling and sentiment analysis with emotion generation rules. Combining these features, we achieved an accuracy of 64.2% in identifying Happiness, Surprise, Anger, Sadness, Fear, Disgust and Neutral emotions. | en_US |
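To make the fusion idea in the abstract concrete, the sketch below shows one plausible reading: a fixed-length acoustic feature vector (the abstract's 384-feature baseline, which resembles the INTERSPEECH 2009 Emotion Challenge set, though the thesis does not name it) is concatenated with linguistic scores derived from keyword spotting with negation handling, then passed to a classifier. The lexicon, negation rule, opposite-emotion mapping and SVM are illustrative assumptions, not the thesis's actual resources or emotion generation rules.

```python
# Hypothetical sketch of early fusion of acoustic and linguistic features
# for seven-class speech emotion recognition. All lexical resources and the
# classifier choice below are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVC

EMOTIONS = ["happiness", "surprise", "anger", "sadness", "fear", "disgust", "neutral"]

# Hypothetical emotion keyword lexicon and negation word list.
LEXICON = {"happy": "happiness", "angry": "anger", "sad": "sadness", "afraid": "fear"}
NEGATIONS = {"not", "never", "no"}
# Hypothetical negation rule: a negated emotion word maps to an opposite class.
OPPOSITE = {"happiness": "sadness", "sadness": "happiness",
            "anger": "neutral", "fear": "neutral"}

def linguistic_features(transcript: str) -> np.ndarray:
    """Count lexicon hits per emotion, flipping the class after a negation word."""
    counts = dict.fromkeys(EMOTIONS, 0.0)
    negated = False
    for token in transcript.lower().split():
        if token in NEGATIONS:
            negated = True
            continue
        emotion = LEXICON.get(token)
        if emotion:
            counts[OPPOSITE.get(emotion, "neutral") if negated else emotion] += 1.0
        negated = False
    return np.array([counts[e] for e in EMOTIONS])

def fused_vector(acoustic: np.ndarray, transcript: str) -> np.ndarray:
    """Early fusion: concatenate the acoustic vector with linguistic scores."""
    return np.concatenate([acoustic, linguistic_features(transcript)])

# Toy run with random vectors standing in for the 384-dimensional acoustic set.
rng = np.random.default_rng(0)
X = np.stack([fused_vector(rng.normal(size=384), t)
              for t in ["I am happy", "I am not happy", "I am so angry"]])
y = ["happiness", "sadness", "anger"]
clf = SVC().fit(X, y)
print(clf.predict(X))
```

In practice the acoustic vector would come from a feature extractor such as openSMILE rather than random noise, and the lexicon and rules would be Amharic resources; this toy example only shows the concatenate-then-classify structure.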
dc.identifier.uri | http://etd.aau.edu.et/handle/123456789/18207 | |
dc.language.iso | en | en_US |
dc.publisher | Addis Ababa University | en_US |
dc.subject | Speech Emotion Recognition | en_US |
dc.subject | Acoustic Features | en_US |
dc.subject | Linguistic Features | en_US |
dc.subject | Feature Extraction | en_US |
dc.subject | Feature Selection | en_US |
dc.subject | Classification | en_US |
dc.title | Enhanced Robustness for Speech Emotion Recognition: Combining Acoustic and Linguistic Information | en_US |
dc.type | Thesis | en_US |