Speech synthesis is the artificial production of human speech by a computer system called a speech synthesizer. Previous works have used several statistical parameters for speech synthesis, but they have combined the vocoder only with simpler models. Because these models lack sequence modeling, the quality of the synthesized speech degrades. Therefore, this work investigates the Pulse Model in Log-domain (PML) vocoder combined with a whale-optimized deep convolutional recurrent neural network. During analysis, the Mel-Generalized Cepstrum (MGC), maximum voiced frequency (MVF), and fundamental frequency (F0) are extracted from the speech signal as features, from which the vocoder successfully generates speech. The system's effectiveness is then evaluated experimentally against deep neural networks and traditional recurrent networks.