Develop an Audio Search Engine for Amharic Speech Web Resources

dc.contributor.advisor: Atnafu, Solomon (PhD)
dc.contributor.author: Hassen, Arega
dc.date.accessioned: 2020-09-21T10:47:19Z
dc.date.accessioned: 2023-11-29T04:05:55Z
dc.date.available: 2020-09-21T10:47:19Z
dc.date.available: 2023-11-29T04:05:55Z
dc.date.issued: 2019-10-10
dc.description.abstract: Most general-purpose search engines, such as Google and Yahoo, are designed with the English language in mind. As content in under-resourced languages grows on the web, the number of online speakers of these languages is growing enormously. Amharic, a morphologically rich language whose morphology strongly affects the effectiveness of information retrieval, is one of the under-resourced languages with rapidly growing web content in all forms of media, such as text, speech, and video. With the increasing number of online radio stations and speech-based reports and news, retrieving Amharic speech from the web is becoming a challenge that needs attention. As a result, the need for a speech search engine that handles the specific characteristics of users' Amharic-language queries and retrieves Amharic speech web documents has become more apparent. In this research work, we develop an Audio Search Engine for Amharic Speech Web Resources that enables web users to find the speech information they need in Amharic. To do so, we enhanced an existing crawler for Amharic speech web resources, transcribed the Amharic speech, indexed the transcribed speech, and developed query preprocessing components for users' text-based queries. As baseline tools, we used the open source tools JSpider and Datafari for web document crawling, parsing, indexing, ranking, and retrieval, and Sphinx for speech recognition and transcription. To evaluate the effectiveness of our Amharic speech search engine, we computed precision and recall on the retrieved speech web documents. The experimental results showed that the Amharic speech retrieval engine achieved 80% precision on the top 10 results and a recall of 92% of its corresponding retrieval engine. The overall evaluation results of the system are promising. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/22409
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Audio Search Engines [en_US]
dc.subject: Audio Information Retrieval [en_US]
dc.subject: Information Retrieval in Amharic Language [en_US]
dc.subject: Speech Crawler [en_US]
dc.subject: Amharic Speech Identification [en_US]
dc.title: Develop an Audio Search Engine for Amharic Speech Web Resources [en_US]
dc.type: Thesis [en_US]
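
Note on the evaluation measures: the abstract reports precision on the top 10 results and recall, but this record does not describe the exact evaluation protocol. The Python sketch below only illustrates the standard textbook definitions of precision at k and recall. The function names and the document identifiers in the usage note are hypothetical and are not taken from the thesis.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked results that are relevant.

    Divides by k (the usual convention), even if fewer than k results
    were returned.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of all relevant documents that appear anywhere in the results."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)
```

As a usage illustration with made-up identifiers: if 8 of the first 10 returned speech documents are in the set of relevant documents, `precision_at_k(ranked, relevant, k=10)` returns 0.8.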

Files

Original bundle
Name: Arega Hassen 2019.pdf
Size: 2.23 MB
Format: Adobe Portable Document Format