Google “spoke” in a human voice

Experts at search giant Google have published a paper describing a speech synthesizer whose output is almost indistinguishable from a live human voice. The system is called Tacotron 2, and it converts text to speech very effectively.

The program consists of two interconnected deep-learning neural networks. The first network generates a spectrogram from the text and passes it to the second, WaveNet, which turns it into a “voice”. Tacotron 2 handles many nuances: it copes easily with hard-to-pronounce words, reads written text fluently, and takes punctuation into account. As a result it can, for example, tell where one sentence ends and the next begins, and marks the boundary with intonation.
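For readers who think in code, the two-stage design can be sketched roughly as follows. This is only a toy illustration of the data flow (text, then spectrogram, then audio), not Google's implementation: both functions are hypothetical stand-ins with no neural networks inside, and the constants (80 spectrogram bins, 5 frames per character, 200 samples per frame, a 16 kHz sample rate) are assumptions chosen for the example, not figures from the article.

```python
import math
from typing import List

# All constants below are assumptions for illustration only.
N_BINS = 80              # number of frequency bins per spectrogram frame
FRAMES_PER_CHAR = 5      # hypothetical: frames emitted per input character
SAMPLES_PER_FRAME = 200  # hypothetical: audio samples produced per frame
SAMPLE_RATE = 16000      # hypothetical output sample rate in Hz


def text_to_spectrogram(text: str) -> List[List[float]]:
    """Stage 1 stand-in: map text to a sequence of spectrogram frames.

    The real system uses a trained neural network here; this toy version
    just lights up one bin per character so the pipeline has data to pass on.
    """
    frames = []
    for ch in text:
        peak = ord(ch) % N_BINS
        for _ in range(FRAMES_PER_CHAR):
            frames.append([1.0 if b == peak else 0.0 for b in range(N_BINS)])
    return frames


def spectrogram_to_waveform(frames: List[List[float]]) -> List[float]:
    """Stage 2 stand-in for the vocoder role that WaveNet plays.

    Each frame becomes a short burst of a sine wave whose pitch depends on
    the frame's strongest bin. A real vocoder is far more sophisticated.
    """
    samples: List[float] = []
    for frame in frames:
        peak = max(range(len(frame)), key=frame.__getitem__)
        freq = 100.0 + 50.0 * peak  # arbitrary bin-to-frequency mapping
        for _ in range(SAMPLES_PER_FRAME):
            t = len(samples) / SAMPLE_RATE
            samples.append(math.sin(2 * math.pi * freq * t))
    return samples


def synthesize(text: str) -> List[float]:
    """Chain the two stages: text -> spectrogram -> audio samples."""
    return spectrogram_to_waveform(text_to_spectrogram(text))


audio = synthesize("Hi.")
print(len(audio), "samples")
```

The point of splitting the work this way is that the spectrogram acts as a compact intermediate description of how the speech should sound, so each stage can be built and improved separately.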

The developers have already posted sample recordings on a page dedicated to the project. They sound much better than the monotonous robotic voices of today's synthesizers, so Google will presumably find a use for the system quickly. WaveNet is already used in Google Assistant, so Tacotron 2 should be an excellent addition.

At this stage of development, Tacotron 2 speaks only in a pleasant female voice, but it will probably gain a male version in the future and, given its ability to learn, may also learn to imitate other voices.

Vyacheslav Larionov