Triple Point Geometric Hashing based Audio Fingerprinting

Efriem, Desalew

dc.contributor.advisor	Surafel, Lemma (PhD)
dc.contributor.author	Efriem, Desalew
dc.date.accessioned	2020-10-09T09:54:57Z
dc.date.accessioned	2023-11-04T15:14:42Z
dc.date.available	2020-10-09T09:54:57Z
dc.date.available	2023-11-04T15:14:42Z
dc.date.issued	2020-06-09
dc.description.abstract	Audio fingerprinting is a technique for exact identification of an audio recording: perceptually relevant audio features are extracted and transformed into a condensed, reproducible format. Several approaches to building audio fingerprinting systems have been proposed. Based on their baseline assumptions, these approaches fall into three categories: the Philips, image processing, and Shazam approaches. These systems, however, are usually not effective when the audio is distorted. Distortion can arise from modifications such as additive noise, speed change, pitch shifting, and time stretching. Of these modifications, this thesis focuses on linear speed change in Shazam-based audio fingerprinting systems. Linear speed change is a common audio modification that occurs when the audio is played faster or slower at a constant rate. This thesis proposes a Shazam-based audio fingerprinting system that is robust to linear speed change. The proposed approach employs triple point geometric hashing to handle the effect of linear speed change on audio fingerprints. It is evaluated using 29,600 query audios and compared with the baseline work, Shazam, and a recent Shazam-based work, Panako. Evaluation results show that the proposed approach is robust to linear speed change in a range from -30% to 22%. This is a significant improvement over Panako, which is robust to linear speed change between -12% and 6%, and over Shazam, which failed to handle a 2% linear speed change. In addition to speed change, the proposed approach is evaluated for robustness to additive noise, time stretching, and pitch shifting. The results show that the proposed approach is robust to: i) additive noise in a range from -5 dB to 20 dB, with comparable robustness exhibited by Shazam and Panako; ii) time stretching in a range from -10% to 8%, an improvement over Shazam and Panako, which are robust to time stretching between -4% and 4%; and iii) pitch shifting in a range from -4% to 4%, which is comparable to Panako, whereas Shazam failed to handle a 2% pitch shift.	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/22636
dc.language.iso	en_US	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Audio Fingerprinting	en_US
dc.subject	Audio Identification	en_US
dc.subject	Geometric Hashing	en_US
dc.subject	Linear Speed Change	en_US
dc.title	Triple Point Geometric Hashing based Audio Fingerprinting	en_US
dc.type	Thesis	en_US
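
The abstract states that triple point geometric hashing handles linear speed change, but this record does not give the hash construction. The following is only a minimal sketch of why a hash built from three spectral peaks can be invariant to such a change. It assumes that playing audio at speed factor s divides every peak time by s and multiplies every peak frequency by s. The functions `triple_hash` and `change_speed`, the `bins` parameter, and the example peaks are all hypothetical and are not taken from the thesis.

```python
def triple_hash(p1, p2, p3, bins=64):
    """Hypothetical hash of three (time, frequency) peaks.

    Uses only ratios, so a uniform scaling of all times and a uniform
    scaling of all frequencies cancel out.
    """
    (t1, f1), (t2, f2), (t3, f3) = sorted([p1, p2, p3])  # order by time
    # Ratio of time gaps: dividing every time by s leaves it unchanged.
    time_ratio = (t2 - t1) / (t3 - t1)
    # Frequency ratios: multiplying every frequency by s leaves them unchanged.
    freq_ratio_a = f2 / f1
    freq_ratio_b = f3 / f1
    # Quantise so small numerical differences map to the same hash.
    return (
        round(time_ratio * bins),
        round(freq_ratio_a * bins),
        round(freq_ratio_b * bins),
    )


def change_speed(peaks, s):
    """Assumed model of linear speed change by factor s (s > 0)."""
    return [(t / s, f * s) for t, f in peaks]


# Three illustrative peaks: (time in seconds, frequency in Hz).
peaks = [(1.0, 440.0), (1.5, 660.0), (3.0, 550.0)]
original = triple_hash(*peaks)
sped_up = triple_hash(*change_speed(peaks, 1.2))   # 20% faster
slowed = triple_hash(*change_speed(peaks, 0.7))    # 30% slower
print(original == sped_up == slowed)
```

A pairwise hash that stores an absolute time difference and absolute frequencies would change under the same transformation, which is consistent with the abstract's report that Shazam failed at a 2% speed change. Quantised ratios can still differ near bin boundaries, so a practical system would need some tolerance when matching.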

Files

Original bundle

Name: Efriem Desalew.pdf
Size: 2.28 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Plain Text

Collections

Computer Engineering