Artificial intelligence learns to recognize speech amid noise

Virtual assistants and voice recognition systems have learned well enough to “understand” what a person says and to follow spoken commands. But for assistants such as Siri and Cortana, background noise can still be a big problem. Experts at Mitsubishi Electric may help solve it: they have presented a new technology that isolates one person's speech from the surrounding noise.

The Japanese company's technology is called Deep Clustering, and it is built on the principles of machine learning. To begin with, the artificial intelligence learned to distinguish one person's speech from a general stream of other sounds and noises. The neural network splits the incoming audio into separate elements and analyzes each one individually, after which it can process a single person's voice. It works the same way when two or more speakers join the conversation.
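
The idea of splitting audio into elements and grouping them by speaker can be illustrated with a small sketch. It is not Mitsubishi Electric's implementation. It assumes that a trained network has already turned every time-frequency cell of a spectrogram into a short embedding vector, and it replaces that network's output with synthetic data. The names `kmeans`, `masks_from_embeddings`, `true_owner` and `anchors` are hypothetical, chosen only for this example.

```python
import numpy as np


def kmeans(points, k, iters=20):
    """Plain k-means with farthest-point initialization; returns one label per row."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(iters):
        # Squared distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def masks_from_embeddings(embeddings, n_speakers):
    """(frames, bins, dim) embeddings -> boolean masks of shape (n_speakers, frames, bins)."""
    frames, bins, dim = embeddings.shape
    labels = kmeans(embeddings.reshape(-1, dim), n_speakers).reshape(frames, bins)
    return np.stack([labels == s for s in range(n_speakers)])


# Synthetic stand-in for a network's output (an assumption of this sketch):
# each cell belongs to one of two speakers, and its embedding lies near
# that speaker's anchor vector.
rng = np.random.default_rng(0)
frames, bins, dim = 50, 40, 4
true_owner = rng.integers(0, 2, size=(frames, bins))
anchors = np.eye(dim)[:2]
embeddings = anchors[true_owner] + 0.05 * rng.standard_normal((frames, bins, dim))

masks = masks_from_embeddings(embeddings, n_speakers=2)

# Cluster numbering is arbitrary, so compare against both possible labelings.
agreement = max(
    (masks[0] == (true_owner == 0)).mean(),
    (masks[0] == (true_owner == 1)).mean(),
)
print(f"cells assigned to the right speaker: {agreement:.0%}")
```

In a real system, each mask would be multiplied with the mixed spectrogram and the result converted back into audio, yielding one signal per speaker.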

During the company's demonstration of the technology, the system successfully separated the speech of two people who spoke the same sentence into a single microphone in different languages. All processing was performed in real time, with a delay of no more than three seconds. Recognition accuracy was 90 percent; when three people spoke into the microphone, the share of “hits” fell to 80 percent, which is still a good result. As the project's authors, Anthony Vetro and Yohei Okato, explain:

“Unlike separating speech from background noise, isolating one person's speech from the ‘vocal’ noise of several people talking at the same time is a daunting task, because the voices of different people have many features in common. Most systems solve voice separation by installing two or more microphones, but when only one microphone is used, only artificial intelligence can handle the task. The technology can be used wherever voice input must be recognized with high accuracy: for example, in voice control systems for cars, elevators, home appliances and other electronic devices.”

Vladimir Kuznetsov