Automatic Classification of Ethiopian Traditional Music Using Audio-Visual Features and Deep Learning

dc.contributor.advisorGetahun, Fekade (PhD)
dc.contributor.authorMulugeta, Selam
dc.date.accessioned2020-09-02T10:45:48Z
dc.date.accessioned2023-11-09T16:18:44Z
dc.date.available2020-09-02T10:45:48Z
dc.date.available2023-11-09T16:18:44Z
dc.date.issued2020-06-06
dc.description.abstractMusic bridges linguistic and cultural gaps and helps connect people. Ethiopia is a country with more than 80 ethnic groups, each with its own unique musical sound and style of dance. Distinguishing one from another is not an easy task, especially in the era of streaming, where large amounts of music are recorded and released every day over the Internet. Machine learning, and more recently deep learning, a subfield of machine learning, emerged to automate tedious classification tasks previously handled by programmers manually crafting classification rules; deep learning algorithms learn those rules directly from the data. In this work, we address the automatic classification of Ethiopian traditional music into its respective locality using audio-visual features. To achieve this, we use a deep neural network architecture composed of both a convolutional neural network (CNN) and a recurrent neural network (RNN). The architecture has an audio feature extraction component, composed of a parallel deep CNN and RNN that takes the mel-spectrogram of the audio signal as input, and a video feature extraction component. The video component uses transfer learning to extract visual information with a pre-trained network (VGG-16), then passes these features to a Long Short-Term Memory (LSTM) recurrent network to capture sequential information. Features from both components are then merged to predict the class of the music video. We conducted an experiment to evaluate the performance of the proposed system. We collected music data representing Ethiopian traditional music from Internet-based music archives such as YouTube and from personal music collections. After passing the collected data through a pre-processing step, we trained the proposed system, which uses both audio and visual features, as well as systems that use only visual or only audio features.
The video-only classifier achieved 78% accuracy and the audio-only classifier 85%; adding audio features to the video-only classifier improved its accuracy by 7 percentage points, bringing the proposed system's performance to 85%.en_US
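The fusion step the abstract describes — concatenating the learned audio and video feature vectors before a final softmax classifier — can be sketched with NumPy. All dimensions, the random weights, and the number of classes below are illustrative assumptions, not the thesis's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Illustrative feature vectors standing in for the two branches
audio_feat = rng.standard_normal(128)   # e.g. from the parallel CNN/RNN audio branch
video_feat = rng.standard_normal(256)   # e.g. from the VGG-16 + LSTM video branch

# Late fusion: concatenate, then a dense softmax layer over the classes
fused = np.concatenate([audio_feat, video_feat])        # shape (384,)
n_classes = 4                                           # hypothetical number of localities
W = rng.standard_normal((n_classes, fused.size)) * 0.01
b = np.zeros(n_classes)
probs = softmax(W @ fused + b)

predicted_class = int(np.argmax(probs))
print(predicted_class)
```

In the actual system both branches and the classifier are trained jointly; this sketch only shows the shape of the merge-then-classify step.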
dc.identifier.urihttp://etd.aau.edu.et/handle/12345678/22239
dc.language.isoenen_US
dc.publisherAddis Ababa Universityen_US
dc.subjectDeep Learningen_US
dc.subjectCNNen_US
dc.subjectRNNen_US
dc.subjectTransfer Learningen_US
dc.subjectMusic Information Retrievalen_US
dc.subjectMusic Processingen_US
dc.subjectDance Recognitionen_US
dc.titleAutomatic Classification of Ethiopian Traditional Music Using Audio-Visual Features and Deep Learningen_US
dc.typeThesisen_US

Files

Original bundle
Name: Selam Mulugeta 2020.pdf
Size: 4.78 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text

Collections