Specialists at the Chinese lab Baidu Research, which is owned by China's largest search company, have created Deep Voice, an AI algorithm that converts text to speech. Earlier projects of this kind struggled with synthesis speed: Deep Voice's predecessors needed anywhere from several minutes to a few hours to render text in a natural-sounding human voice. The new system, which is based on neural networks, can convert text to speech in real time.
Deep Voice can imitate timbre, tone of voice and accent convincingly, making the result almost indistinguishable from a real human voice, and the voice can be male or female. The developers believe the technology could be used in digital assistants, for voice recording, or even for simultaneous translation of film subtitles.
“This is a real breakthrough from a technical point of view, because we were able to solve a complex problem: synthesizing live speech with all its peculiarities,” says Leo Zu, one of the authors of the project.
The creators of the algorithm explain that Deep Voice was inspired by similar earlier systems, except that all of its components are implemented as neural networks built from fairly simple functions. This makes the algorithm highly adaptable: the voice can be tuned to the user's needs, with new accents and other features.
“Deep learning has led to a revolution in various fields, such as computer vision and speech recognition, and now it is the turn of voice synthesis. We are pleased to have achieved such results, and will continue working to make the text-to-speech system more realistic,” Motherboard quotes the developers as saying.
Chinese developers have taught AI to speak in a human voice
Vyacheslav Larionov