Enhanced Robustness for Speech Emotion Recognition: Combining Acoustic and Linguistic Information

dc.contributor.advisor: Midekso, Dida (PhD)
dc.contributor.author: Sinishaw, Hana
dc.date.accessioned: 2019-05-08T06:58:39Z
dc.date.accessioned: 2023-11-29T04:06:03Z
dc.date.available: 2019-05-08T06:58:39Z
dc.date.available: 2023-11-29T04:06:03Z
dc.date.issued: 2017-10-05
dc.description.abstract: Affective computing is the area of Artificial Intelligence that focuses on the design and development of intelligent devices which can perceive, process, and synthesize human emotions. Humans interpret emotions in a number of different ways, for example by processing spoken utterances, non-verbal cues, facial expressions, and written communication. Changes in our nervous system indirectly alter spoken utterances, which makes it possible for people to perceive how others feel by listening to them speak. These changes can also be interpreted by machines through the extraction of speech features. The field of speech emotion recognition (SER) takes advantage of this capability and has offered many approaches to recognizing affect in spoken utterances. The majority of state-of-the-art SER systems employ complex statistical algorithms to model the relationships between acoustic parameters extracted from spoken language. Studies also show that phrases, word senses, and syntactic relations that convey the linguistic attributes of a language play an important role in improving prediction rates. Our research focuses on this problem of recognizing affect in spoken utterances and contributes linguistic knowledge to state-of-the-art systems to enhance their accuracy, rather than relying on speech alone. In this work, a speech emotion recognition system is developed for the Amharic language based on acoustic and linguistic features, with classification performed on the extracted features. We used a baseline set of 384 acoustic features, and for linguistic analysis of the text we used keyword spotting, negation handling, and sentiment analysis with emotion-generation rules. Combining these features, we achieved an accuracy of 64.2% in identifying Happiness, Surprise, Anger, Sadness, Fear, Disgust, and Neutral emotions.
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/18207
dc.language.iso: en
dc.publisher: Addis Ababa University
dc.subject: Speech Emotion Recognition
dc.subject: Acoustic Features
dc.subject: Linguistic Features
dc.subject: Feature Extraction
dc.subject: Feature Selection
dc.subject: Classification
dc.title: Enhanced Robustness for Speech Emotion Recognition: Combining Acoustic and Linguistic Information
dc.type: Thesis
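
The abstract above outlines a two-channel pipeline: a 384-dimensional acoustic baseline set, plus linguistic scores obtained from keyword spotting with negation handling, fused before classification. The Python sketch below is a minimal, hypothetical illustration of that fusion only. The English lexicon, negation cues, polarity-flip rule, and the small MFCC stand-in for the 384-feature baseline are all assumptions for illustration; they are not the thesis's actual Amharic resources or feature extractor.

"""Minimal, hypothetical sketch of the acoustic + linguistic fusion the
abstract describes. All lexicon entries and rules here are illustrative
assumptions; the thesis uses a 384-feature acoustic baseline set and
Amharic linguistic resources, neither of which is reproduced below."""

from collections import Counter
import numpy as np
import librosa

EMOTIONS = ["happiness", "surprise", "anger", "sadness", "fear", "disgust"]

# Placeholder emotion lexicon and negation cues (assumptions, in English).
LEXICON = {"happy": "happiness", "wow": "surprise", "furious": "anger",
           "sad": "sadness", "scared": "fear", "gross": "disgust"}
NEGATIONS = {"not", "never", "no"}
# Assumed polarity-flip rule: a negated keyword counts toward a rough
# opposite emotion rather than being ignored.
OPPOSITE = {"happiness": "sadness", "sadness": "happiness",
            "anger": "happiness", "fear": "happiness",
            "surprise": "surprise", "disgust": "happiness"}

def linguistic_scores(transcript: str) -> np.ndarray:
    """Keyword spotting with one-token-lookback negation handling."""
    counts = Counter()
    tokens = transcript.lower().split()
    for i, tok in enumerate(tokens):
        emo = LEXICON.get(tok)
        if emo is None:
            continue
        negated = i > 0 and tokens[i - 1] in NEGATIONS
        counts[OPPOSITE[emo] if negated else emo] += 1
    return np.array([counts[e] for e in EMOTIONS], dtype=float)

def acoustic_features(y: np.ndarray, sr: int) -> np.ndarray:
    """MFCC means/stds as a small stand-in for the 384-feature baseline."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

if __name__ == "__main__":
    sr = 16000
    y = 0.1 * np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # synthetic tone
    fused = np.concatenate([acoustic_features(y, sr),
                            linguistic_scores("I am not happy , I am scared")])
    print(fused.shape)  # 26 acoustic + 6 linguistic = (32,)

In the actual system, a fused vector of this kind would feed a classifier over the seven target classes (the six emotions above plus Neutral); the sketch stops at feature construction.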

Files

Original bundle
Name: Hana Sinishaw 2017.pdf
Size: 98.28 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text