Amharic Text-Prompted Speaker Verification Model

Berhane, Wolelaw

dc.contributor.advisor	Hailemariam, Sebsibe (PhD)
dc.contributor.author	Berhane, Wolelaw
dc.date.accessioned	2018-06-26T06:47:41Z
dc.date.accessioned	2023-11-04T12:22:32Z
dc.date.available	2018-06-26T06:47:41Z
dc.date.available	2023-11-04T12:22:32Z
dc.date.issued	2011-05
dc.description.abstract	Speaker verification is a biometric authentication technique that takes a speech signal as input to verify a claimed speaker's identity. A speaker verification model extracts speaker-dependent characteristics from the speech waveform to create the speaker's voiceprint. The researcher first implemented the logic of an English speaker verification model [45] for Amharic, but the average accuracy obtained for ten Amharic words was 92.93%. This research was motivated by that implementation and its poor accuracy. In this thesis, an Amharic Text-Prompted Speaker Verification Model (ATPSVM) is designed and implemented. The ATPSVM model applies frame-based processing to the speech signals so that all samples in a frame are processed simultaneously. It extracts speaker feature vectors as Mel-frequency cepstral coefficients for use in speaker model construction. It then applies parameter-domain (spectral) normalization followed by min-max normalization to the speaker feature vectors to scale their values to [0, 1]. Finally, it applies support vector machine (SVM) kernel functions to model each speaker. For a given prompted Amharic word, it uses a one-against-each SVM speaker modeling strategy to maintain the balance of the test speaker's feature vectors in the mixed features. The ATPSVM prototype is evaluated using ten Amharic words. Each word is uttered ten times by each of 5 men and 5 women, so a total of 100 speech wave files are recorded from each speaker. On a leave-one-out basis, one utterance is iteratively taken for testing while the remaining 9 are used for training the speaker. One utterance of each speaker is iteratively tested against each of the other 9 speakers: the corresponding utterance of the other speaker is taken as the impostor data set, and the remaining 9 utterances from each of the two speakers are taken as training data sets. Thus, for each Amharic word, the model is evaluated using 900 training data sets, 900 client testing data sets, and another 900 impostor testing data sets. In total, the ATPSVM model is evaluated using 9,000 training data sets, 9,000 client (target) testing data sets, and another 9,000 impostor testing data sets. The evaluation is done by varying the threshold theta between 0.0 and 1.0 in steps of 0.1. Performance is measured in terms of false acceptance, false rejection, true acceptance, and true rejection, and is reported as precision, accuracy, recall, false acceptance rate, false rejection rate, and equal error rate (EER). The ATPSVM model is evaluated using the SVM linear, Gaussian radial basis function (RBF), multilayer perceptron (MLP), and polynomial power 3 kernel functions. The best performance is obtained when the SVM polynomial power 3 kernel function is applied. The performance difference between the kernel functions follows from their algorithmic definitions. Using the SVM polynomial power 3 kernel function, an average performance of 0.25% EER, 99.7% accuracy, 99.8% recall, and 99.7% precision is obtained for the ten Amharic words. For the same kernel, the performance of the ATPSVM model is also evaluated separately for each Amharic word. For 70% of the words, the model achieves 0.00% EER with 100% accuracy, 100% precision, and 100% recall. For the remaining 30%, its performance is slightly lower than these values. By selecting more discriminative Amharic words similar in nature to those seven words, it is possible to get the desired highest performance from the ATPSVM model. Keywords: Speaker Verification, Speaker Recognition, Biometric Authentication, Amharic Text-Prompted Speaker Verification, Text-Prompted Speaker Verification	en_US
dc.identifier.uri	http://etd.aau.edu.et/handle/123456789/3461
dc.language.iso	en	en_US
dc.publisher	Addis Ababa University	en_US
dc.subject	Speaker Verification	en_US
dc.subject	Speaker Recognition	en_US
dc.subject	Biometric Authentication	en_US
dc.subject	Amharic Text-Prompted Speaker Verification	en_US
dc.subject	Text-Prompted Speaker Verification	en_US
dc.title	Amharic Text-Prompted Speaker Verification Model	en_US
dc.type	Thesis	en_US
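
The evaluation protocol described in the abstract (a threshold theta swept from 0.0 to 1.0 in steps of 0.1, with results reported as false acceptance rate, false rejection rate, accuracy, precision, recall, and EER) can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes each trial yields a score in [0, 1] and that a claim is accepted when the score is at or above theta. The function names, the sample scores, and the way the EER is approximated (the sweep point where the two error rates are closest) are all hypothetical choices made for illustration. The `trials_per_word` helper reflects one reading of the abstract's count of 900 client tests per word.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    threshold: float
    far: float        # false acceptance rate
    frr: float        # false rejection rate
    accuracy: float
    precision: float
    recall: float


def trials_per_word(speakers=10, utterances=10):
    # Assumed reading of the abstract: each speaker is paired with each
    # other speaker, once per left-out utterance (10 * 9 * 10 = 900).
    return speakers * (speakers - 1) * utterances


def evaluate(client_scores, impostor_scores, threshold):
    # Assumption: a claim is accepted when score >= threshold.
    ta = sum(s >= threshold for s in client_scores)    # true acceptances
    fr = len(client_scores) - ta                       # false rejections
    fa = sum(s >= threshold for s in impostor_scores)  # false acceptances
    tr = len(impostor_scores) - fa                     # true rejections
    total = len(client_scores) + len(impostor_scores)
    precision = ta / (ta + fa) if (ta + fa) else 0.0
    return Metrics(
        threshold=threshold,
        far=fa / len(impostor_scores),
        frr=fr / len(client_scores),
        accuracy=(ta + tr) / total,
        precision=precision,
        recall=ta / len(client_scores),
    )


def sweep(client_scores, impostor_scores):
    # Thresholds 0.0, 0.1, ..., 1.0, as stated in the abstract.
    thresholds = [i / 10 for i in range(11)]
    return [evaluate(client_scores, impostor_scores, t) for t in thresholds]


def approximate_eer(results):
    # Approximation: take the sweep point where FAR and FRR are closest
    # and report their mean as the EER.
    best = min(results, key=lambda m: abs(m.far - m.frr))
    return best.threshold, (best.far + best.frr) / 2


# Hypothetical scores, for illustration only (not data from the thesis).
client_scores = [0.9, 0.8, 0.75, 0.35]
impostor_scores = [0.1, 0.2, 0.3, 0.65]
results = sweep(client_scores, impostor_scores)
eer_threshold, eer = approximate_eer(results)
```

With a step of 0.1 the sweep is coarse, so the EER found this way is only as precise as the threshold grid allows.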

Files

Original bundle

Name: Wolelaw Berehane.pdf
Size: 2.4 MB
Format: Adobe Portable Document Format

Collections

Computer Science