Amharic Text-Prompted Speaker Verification Model

dc.contributor.advisor: Hailemariam, Sebsibe (PhD)
dc.contributor.author: Berhane, Wolelaw
dc.date.accessioned: 2018-06-26T06:47:41Z
dc.date.accessioned: 2023-11-04T12:22:32Z
dc.date.available: 2018-06-26T06:47:41Z
dc.date.available: 2023-11-04T12:22:32Z
dc.date.issued: 2011-05
dc.description.abstract: Speaker verification is a biometric authentication model that takes a speech signal as input to verify a claimed speaker. A speaker verification model extracts speaker-dependent characteristics from the speech wave signal so as to create the voiceprint of the speaker. The researcher implemented the logic of an English speaker verification model [45] for Amharic, but the average accuracy obtained for ten Amharic words was 92.93%. This research was initiated by that implementation and its poor accuracy. In this thesis, an Amharic Text-Prompted Speaker Verification Model (ATPSVM) is designed and implemented. The ATPSVM model applies frame-based processing to the speech wave signals so that all samples in a frame are processed simultaneously. It extracts speaker feature vectors as Mel frequency cepstral coefficients for use in speaker model construction. It then applies parameter-domain (spectral) normalization followed by min-max normalization to the speaker feature vectors so as to scale the feature values to [0, 1]. Finally, it applies support vector machine (SVM) kernel functions to model each speaker. For a specific prompted Amharic word, it uses a one-against-each SVM speaker modeling strategy to maintain the balance of the test speaker's feature vectors in the mixed features. The ATPSVM model prototype is evaluated using ten Amharic words. Each Amharic word is uttered ten times by 5 men and 5 women, so that a total of 100 speech wave files are recorded from each speaker. One utterance is iteratively taken for testing while the remaining 9 are used for training the speaker model on a leave-one-out basis. The model iteratively tests one utterance of each speaker against the other 9 speakers. The respective utterance of the other speaker is taken as the impostor data set for the same test. The remaining 9 respective utterances from the two speakers are taken as training data sets. Thus, for each Amharic word, the model is evaluated using 900 training data sets, 900 client testing data sets, and another 900 impostor testing data sets. In total, the ATPSVM model is evaluated using 9,000 training data sets, 9,000 client (target) testing data sets, and another 9,000 impostor testing data sets. The evaluation is done by varying the threshold theta between 0.0 and 1.0 in steps of 0.1. Performance is measured in terms of false acceptance, false rejection, true acceptance, and true rejection, and is then reported as precision, accuracy, recall, false acceptance rate, false rejection rate, and equal error rate (EER). The ATPSVM model is evaluated using the SVM linear, Gaussian radial basis function (RBF), multilayer perceptron (MLP), and polynomial power 3 kernel functions. The best performance is obtained when the SVM polynomial power 3 kernel function is applied. The performance difference between the kernel functions follows from their algorithmic definitions. Using the SVM polynomial power 3 kernel function, an average performance of 0.25% EER, 99.7% accuracy, 99.8% recall, and 99.7% precision is obtained over the ten Amharic words. For the same kernel, the performance of the ATPSVM model is also evaluated separately for each Amharic word. For 70% of the Amharic words, the model achieves 0.00% EER with 100% accuracy, 100% precision, and 100% recall. For the remaining 30%, its performance is slightly lower than these values. By selecting more discriminative Amharic words of a similar nature to those seven words, it is possible to get the desired highest performance from the ATPSVM model. Keywords: Speaker Verification, Speaker Recognition, Biometric Authentication, Amharic Text-Prompted Speaker Verification, Text-Prompted Speaker Verification
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/3461
dc.language.iso: en
dc.publisher: Addis Ababa University
dc.subject: Speaker Verification
dc.subject: Speaker Recognition
dc.subject: Biometric Authentication
dc.subject: Amharic Text-Prompted Speaker Verification
dc.subject: Text-Prompted Speaker Verification
dc.title: Amharic Text-Prompted Speaker Verification Model
dc.type: Thesis
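
The evaluation procedure described in the abstract (sweeping the threshold theta from 0.0 to 1.0 in steps of 0.1 and reporting false acceptance rate, false rejection rate, accuracy, precision, recall, and EER) can be illustrated with a minimal Python sketch. It is not the thesis implementation. It assumes that each trial yields a score in [0, 1] and that a claim is accepted when the score is at least theta; the function names `sweep_thresholds` and `approximate_eer` and the sample scores are hypothetical.

```python
def sweep_thresholds(client_scores, impostor_scores, steps=10):
    """Return one metrics dict per threshold 0.0, 0.1, ..., 1.0.

    Assumption (not from the thesis): a claim is accepted when its
    score is greater than or equal to the threshold theta.
    """
    n_client = len(client_scores)
    n_impostor = len(impostor_scores)
    rows = []
    for i in range(steps + 1):
        theta = i / steps
        ta = sum(s >= theta for s in client_scores)    # true acceptances
        fr = n_client - ta                             # false rejections
        fa = sum(s >= theta for s in impostor_scores)  # false acceptances
        tr = n_impostor - fa                           # true rejections
        rows.append({
            "theta": theta,
            "far": fa / n_impostor,
            "frr": fr / n_client,
            "accuracy": (ta + tr) / (n_client + n_impostor),
            # Precision is undefined when nothing is accepted; report 0.0.
            "precision": ta / (ta + fa) if (ta + fa) else 0.0,
            "recall": ta / n_client,
        })
    return rows


def approximate_eer(rows):
    """Approximate the EER at the sampled threshold where FAR and FRR are closest."""
    best = min(rows, key=lambda r: abs(r["far"] - r["frr"]))
    return (best["far"] + best["frr"]) / 2


# Hypothetical scores for illustration only.
client = [0.95, 0.85, 0.75, 0.65]
impostor = [0.05, 0.15, 0.25, 0.35]
rows = sweep_thresholds(client, impostor)
print(approximate_eer(rows))
```

With only eleven sampled thresholds the EER is an approximation: it is taken at the sampled theta where the two error rates are closest, rather than at an exact crossing point.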

Files

Original bundle
Name:
Wolelaw Berehane.pdf
Size:
2.4 MB
Format:
Adobe Portable Document Format