Automatic Classification of Ethiopian Traditional Music Using Audio-Visual Features and Deep Learning

dc.contributor.advisorGetahun, Fekade (PhD)
dc.contributor.authorMulugeta, Selam
dc.date.accessioned2020-09-02T10:45:48Z
dc.date.accessioned2023-11-09T16:18:44Z
dc.date.available2020-09-02T10:45:48Z
dc.date.available2023-11-09T16:18:44Z
dc.date.issued2020-06-06
dc.description.abstractMusic bridges linguistic and cultural gaps and helps connect people. Ethiopia is a country with more than 80 ethnic groups, each with its own unique musical sound and style of dance. Distinguishing one from another is not an easy task, especially in the era of streaming, where large amounts of music are recorded and released every day over the Internet. Machine learning, and more recently deep learning, a subfield of machine learning, emerged to automate tedious classification tasks previously handled by programmers manually crafting classification rules; deep learning algorithms learn those rules directly from the data. In this work, we address the automatic classification of Ethiopian traditional music into its respective locality using audio-visual features. To achieve this, we use a deep neural network architecture composed of both a convolutional neural network (CNN) and a recurrent neural network (RNN). The architecture has an audio feature extraction component, composed of a parallel deep CNN and RNN that takes the mel-spectrogram of the audio signal as input, and a video feature extraction component. The video component uses transfer learning to extract visual information with a pre-trained network (VGG-16), then passes these features to a Long Short-Term Memory (LSTM) recurrent network to capture sequential information. Features from both components are then merged to predict the class of the music video. We conducted an experiment to evaluate the performance of the proposed system. We collected music data representing Ethiopian traditional music from Internet-based music archives such as YouTube and from personal music collections. After passing the collected data through a pre-processing step, we trained the proposed system, which uses both audio and visual features, as well as systems that use only visual or only audio features.
The video-only classifier achieved 78% accuracy and the audio-only classifier 85%; adding audio features to the video-only classifier improved its accuracy by 7 percentage points, bringing the proposed system's performance to 85%.en_US
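The fusion step the abstract describes — concatenating the learned audio and video feature vectors before a final softmax classifier — can be sketched with NumPy. All dimensions, the random weights, and the number of classes below are illustrative assumptions, not the thesis's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Illustrative feature vectors standing in for the two branches
audio_feat = rng.standard_normal(128)   # e.g. from the parallel CNN/RNN audio branch
video_feat = rng.standard_normal(256)   # e.g. from the VGG-16 + LSTM video branch

# Late fusion: concatenate, then a dense softmax layer over the classes
fused = np.concatenate([audio_feat, video_feat])        # shape (384,)
n_classes = 4                                           # hypothetical number of localities
W = rng.standard_normal((n_classes, fused.size)) * 0.01
b = np.zeros(n_classes)
probs = softmax(W @ fused + b)

predicted_class = int(np.argmax(probs))
print(predicted_class)
```

In the actual system both branches and the classifier are trained jointly; this sketch only shows the shape of the merge-then-classify step.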
dc.identifier.urihttp://etd.aau.edu.et/handle/12345678/22239
dc.language.isoenen_US
dc.publisherAddis Ababa Universityen_US
dc.subjectDeep Learningen_US
dc.subjectCNNen_US
dc.subjectRNNen_US
dc.subjectTransfer Learningen_US
dc.subjectMusic Information Retrievalen_US
dc.subjectMusic Processingen_US
dc.subjectDance Recognitionen_US
dc.titleAutomatic Classification of Ethiopian Traditional Music Using Audio-Visual Features and Deep Learningen_US
dc.typeThesisen_US

Files

Original bundle
Name: Selam Mulugeta 2020.pdf
Size: 4.78 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Plain Text

Collections