Welcome to Ambo University Institutional Repository

Text To Speech Synthesizer For Hadiyyisa Using Statistical Parametric Speech Synthesis


dc.contributor.author Demama, Desta
dc.date.accessioned 2023-10-27T11:18:14Z
dc.date.available 2023-10-27T11:18:14Z
dc.date.issued 2021-10
dc.identifier.uri http://hdl.handle.net/123456789/3145
dc.description.abstract This research presents the first text-to-speech system for the Hadiyyisa language. The main goal of speech technology is to enable communication between humans and machines. Speech synthesis plays diverse roles in modern human activities, such as assisting people with disabilities and supporting the telecommunication sector. Hadiyyisa is one of Ethiopia's local languages; it belongs to the Cushitic group and is written with an English-like alphabet plus some additional characters. Statistical parametric speech synthesis based on HMM techniques was chosen for this study because it is a model-based approach that requires little storage, has a short run time, and is simple to integrate with small handheld devices. The process of converting input text into an acoustic waveform is divided into several stages, each with its own set of functional components. The synthesizer is divided into two parts: training and testing. During training, speech source and excitation parameters are derived from a speech database, and an ergodic hidden Markov model is used to automatically segment the speech and phonetic transcriptions. During testing, the input text is processed into phonetic strings, which are used together with the trained models. Finally, the synthesized voice is generated from the speech parameters. Four hundred sentences and their speech recordings were composed to train the system. The system uses tenfold cross-validation: the training set consists of 90% of the data, selected randomly, and the remaining 10% is used as a hold-out set for validation. Mean opinion score (MOS) is used as the subjective measure of the intelligibility and naturalness of the synthesized speech, and Mel cepstral distortion (MCD) is used as the objective (experimental) evaluation.
We evaluated the effectiveness of the text-to-speech system and found that the proposed method can generate more natural spectral parameters and F0, with a score above 70%. In the objective evaluation, the MCD score is 5.6; in the subjective MOS evaluation, the synthesized speech scores 3.06 for intelligibility and 2.62 for naturalness. en_US
dc.language.iso en en_US
dc.publisher Ambo University en_US
dc.subject Statistical Parametric Speech Synthesis en_US
dc.subject Text to Speech en_US
dc.subject Hadiyyisa en_US
dc.title Text To Speech Synthesizer For Hadiyyisa Using Statistical Parametric Speech Synthesis en_US
dc.type Thesis en_US

