WaveNet is a neural network built by Google DeepMind to produce more realistic-sounding speech for digital uses. It was announced last year, but the hardware at the time was simply not up to the heavy computational task of running it commercially. With today's much faster hardware, DeepMind has partnered with Google to give Google Assistant a more natural speaking voice.
Google DeepMind has just announced that WaveNet is now generating Google Assistant voices for US English and Japanese across all platforms that support the digital assistant. The new voices sound far more natural, without the telltale robotic quality that instantly gives away a digital assistant. Check out the source link below to hear the samples.
The majority of today’s text-to-speech (TTS) systems use concatenative TTS, which relies on a large database of high-quality recordings collected from a single voice actor over many hours. These recordings are split into tiny chunks that can then be combined – or concatenated – to form complete utterances as needed. WaveNet takes a different approach: a neural network trained on a large dataset of voice samples generates the audio itself, finding the most natural way words can be linked together.
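To make the contrast concrete, here is a minimal toy sketch of the two ideas. All names and data below are illustrative stand-ins, not Google's actual pipeline: concatenative TTS splices prerecorded chunks from a database, while a WaveNet-style model predicts audio one sample at a time, conditioned on everything it has produced so far.

```python
# --- Concatenative TTS: splice prerecorded chunks from a database ---
# Hypothetical phoneme-like units mapped to waveform samples
# (short lists of floats standing in for real audio).
unit_database = {
    "HH": [0.01, 0.02], "EH": [0.10, 0.12],
    "L":  [0.05, 0.06], "OW": [0.09, 0.08],
}

def concatenative_tts(units):
    """Combine recorded chunks end to end to form an utterance."""
    waveform = []
    for u in units:
        waveform.extend(unit_database[u])  # splice chunks together
    return waveform

# --- WaveNet-style generation: produce audio one sample at a time ---
def autoregressive_tts(predict_next, n_samples):
    """Each new sample is predicted from all previous samples."""
    waveform = []
    for _ in range(n_samples):
        waveform.append(predict_next(waveform))
    return waveform

# Dummy "model" that just decays the previous sample; a real WaveNet is
# a deep convolutional network trained on many hours of recorded speech.
dummy_model = lambda past: 0.5 * past[-1] if past else 1.0

spliced = concatenative_tts(["HH", "EH", "L", "OW"])
generated = autoregressive_tts(dummy_model, 4)  # [1.0, 0.5, 0.25, 0.125]
```

The key difference the sketch captures: the concatenative output can only ever contain sounds that were recorded in advance, while the autoregressive loop synthesizes every sample fresh, which is what lets WaveNet produce smoother, more natural transitions.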
If your Google Assistant sounds more natural today than before, you have WaveNet to thank for that.
SOURCE: DeepMind