Creating Speech Synthesis Process by Using Pulse Model in Log-Domain Vocoder With Whale Optimized Deep Convolution Neural Networks

doi:10.21203/rs.3.rs-436506/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


Speech synthesis is the artificial production of human speech by a computer system called a speech synthesizer. Previous works use several statistical parameters to perform speech synthesis, but they combine the vocoder only with simpler models. Because these models lack sequence modeling, the quality of the synthesized speech degrades. Therefore, this work applies the pulse model in log-domain vocoder with a whale-optimized deep convolution recurrent neural network to investigate the vocoder. In this analysis, the Mel-generalized cepstrum (MGC), maximum voiced frequency (MVF), and fundamental frequency (F0) are extracted from the signal as features, and the vocoder is then generated successfully. The system's effectiveness is evaluated experimentally against deep neural networks and traditional recurrent networks.
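To make one of the features named above concrete, the following is a minimal sketch of F0 estimation for a single frame, assuming a plain autocorrelation peak picker. The abstract does not state how F0 is extracted, so this illustrates the general idea only and is not the authors' method. The function names `estimate_f0` and `synth_tone`, the 60 to 400 Hz search range, and the test tone are all hypothetical choices made for this example.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 in Hz for one frame from its autocorrelation peak.

    Assumption: a simple autocorrelation picker stands in for whatever
    F0 tracker the paper uses. Returns 0.0 when no positive peak is
    found, which is treated here as "unvoiced".
    """
    n = len(frame)
    if n == 0:
        return 0.0

    # Remove the DC offset so it does not dominate the autocorrelation.
    mean = sum(frame) / n
    x = [s - mean for s in frame]

    # Convert the F0 search range into a lag range in samples.
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, summed over the overlapping samples.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_r:
            best_lag, best_r = lag, r

    return sample_rate / best_lag if best_lag else 0.0


def synth_tone(freq, sample_rate, duration):
    """Generate a pure sine tone (hypothetical test signal, not speech)."""
    n = int(sample_rate * duration)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


# Usage: a 40 ms frame of a 200 Hz tone sampled at 8 kHz.
frame = synth_tone(200.0, 8000, 0.04)
print(estimate_f0(frame, 8000))
```

A real front end would add voicing decisions, windowing, and smoothing across frames; MGC and MVF extraction need further spectral analysis that this sketch does not attempt.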

Cell Communication and Signaling

Speech synthesis

vocoder

Pulse model in log-domain vocoder

Mel-generalized cepstrum