Last month, humanity lost an important battle with artificial intelligence: AlphaGo beat the champion Ke Jie 3:0. AlphaGo is an artificial intelligence program developed by DeepMind, part of Google's parent company Alphabet. Last year it defeated another champion, Lee Sedol, 4:1, and it has grown significantly stronger since then.
Ke Jie described AlphaGo as a “god of the game”.
Now AlphaGo is retiring from competitive play, leaving players to compete among themselves as before. The artificial intelligence has earned the status of a “player from the distant future”, a level that people will need a very long time to reach.
On your mark, get set, go
Go is an ancient game for two players, one playing white and the other black. The goal is to control more territory on a board marked with 19 horizontal and 19 vertical lines. Go is harder for computers than chess because the number of possible moves in each position is much larger. That makes calculating potential moves in advance, which is feasible for computers in chess, very difficult in Go.
DeepMind's breakthrough was developing a general-purpose learning algorithm that could, in principle, be pointed at problems of greater social benefit. DeepMind says the AlphaGo research team is now trying to tackle complex problems such as finding new cures for diseases, radically reducing energy consumption, or developing revolutionary new materials.
“If an AI system proves that it can gain new knowledge and strategies in these areas, the breakthroughs will be just indescribable. We can’t wait to see what happens next,” says one of the project's scientists.
The future holds many exciting opportunities, but problems still remain.
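To get a feel for why looking ahead is so much harder in Go, here is a rough back-of-the-envelope sketch. The figures of about 35 legal moves per chess position and about 250 per Go position are commonly cited averages, not numbers from this article, and the function name is made up for illustration.

```python
# Rough illustration only: the branching factors below are commonly
# cited averages (assumptions), not exact values.
CHESS_MOVES_PER_POSITION = 35
GO_MOVES_PER_POSITION = 250

BOARD_POINTS = 19 * 19  # intersections on a 19-by-19 Go board


def lookahead_positions(moves_per_position: int, depth: int) -> int:
    """Number of move sequences to examine when looking `depth` moves ahead."""
    return moves_per_position ** depth


if __name__ == "__main__":
    print("Points on the Go board:", BOARD_POINTS)
    for depth in (2, 4, 6):
        chess = lookahead_positions(CHESS_MOVES_PER_POSITION, depth)
        go = lookahead_positions(GO_MOVES_PER_POSITION, depth)
        print(f"{depth} moves ahead: chess {chess:,} vs go {go:,}")
```

The gap widens with every extra move of lookahead, which is why brute-force calculation alone does not get a Go program very far.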
Neuroscience and artificial intelligence
AlphaGo combines two powerful ideas about learning that have developed over the past few decades: deep learning and reinforcement learning. Interestingly, both originated from biological concepts of how the brain works and learns from experience.
In the human brain, sensory information is processed in a series of layers. For example, visual information is first transformed in the retina, then in the midbrain, and then passes through different areas of the cerebral cortex.
The result is a hierarchy of representations, in which simple, localized features come first, followed by more complex and abstract ones.
The AI equivalent is called deep learning: deep, because it involves many layers of processing in simple neuron-like computing units.
But to survive in the world, animals need not only to recognize sensory information but also to act on it. Generations of scientists and psychologists have studied how animals learn to choose actions that maximize benefit and earn rewards.
All this led to mathematical theories of reinforcement learning, which can now be implemented in AI systems. The most important of these is so-called TD learning (temporal-difference learning), which improves actions by maximizing the expected future reward.
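As a minimal sketch of what "layers of simple neuron-like units" means, the toy network below passes an input through two layers in turn. The weights are hand-picked, hypothetical values chosen only for illustration; a real system such as AlphaGo learns its weights and is vastly larger.

```python
import math


def neuron(inputs, weights, bias):
    """One neuron-like unit: a weighted sum followed by a squashing function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return math.tanh(total)


def layer(inputs, weight_rows, biases):
    """A layer is a group of units that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical hand-picked weights: 2 inputs -> 3 hidden units -> 1 output.
network = [
    ([[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]

output = forward([1.0, 1.0], network)
```

Stacking more such layers is what makes a network "deep": each layer works on the features produced by the one before it.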
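The core of TD learning can be sketched in a few lines. The example below is the simplest textbook variant, TD(0), applied to a made-up three-step task; it is not AlphaGo's actual training procedure, and the state layout, learning rate and discount factor are assumptions chosen for illustration. It shows only the prediction half of the idea: estimating the expected future reward of each state.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Nudge the estimate for `state` toward reward + discounted next estimate."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical task: states 0 -> 1 -> 2 -> 3, where 3 is terminal.
# Only the final step (2 -> 3) pays a reward of 1.
TRANSITIONS = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, 3)]

values = [0.0, 0.0, 0.0, 0.0]  # value of the terminal state stays 0
for _ in range(500):  # replay the same episode many times
    for state, reward, next_state in TRANSITIONS:
        td0_update(values, state, reward, next_state)
```

After enough episodes, the estimates settle so that states closer to the reward are valued more highly, which is the signal an agent can then use to prefer better actions.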
Best moves
By combining deep learning and reinforcement learning in a series of artificial neural networks, AlphaGo first learned to play at the level of a professional Go player from 30 million moves taken from games between humans.
But then it began to play against itself, using the outcome of each game to relentlessly refine its own decisions about the best move in each position on the board. A value network learned to predict the likely outcome from any given position, and a policy network learned to choose the best move in each situation.
Although AlphaGo could not try every possible position on the board, its neural networks learned key ideas about strategies that work well in any position. It is these countless hours of self-play that made AlphaGo so much stronger over the last year.
Unfortunately, there is as yet no known way to find out from the network what those key ideas are. We can only study its games and hope to learn something from them. This is one of the problems with using neural network algorithms: they do not explain their decisions.
We still understand very little about how biological brains learn, and neuroscience continues to provide new sources of inspiration for AI. People can become expert Go players with far less experience than AlphaGo needed to reach that level, so there is still room to improve the algorithms.
In addition, much of AlphaGo's power rests on a technique called backpropagation of errors, which helps it correct its mistakes. But the connection between this technique and learning in a real brain is still unclear.
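Backpropagation can be illustrated on the smallest possible "network": one hidden unit feeding one output unit. The sketch below is a toy example under assumed starting weights, input, target and learning rate, not a description of AlphaGo's training. The error at the output is passed backwards through the chain rule to work out how each weight should change.

```python
import math


def forward(w1, w2, x):
    """Hidden unit with tanh, followed by a linear output unit."""
    hidden = math.tanh(w1 * x)
    return hidden, w2 * hidden


def loss(w1, w2, x, target):
    """Half the squared error between output and target."""
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2


def gradients(w1, w2, x, target):
    """Propagate the output error back to each weight via the chain rule."""
    hidden, output = forward(w1, w2, x)
    error = output - target            # d(loss)/d(output)
    grad_w2 = error * hidden           # output weight sees the hidden activity
    grad_hidden = error * w2           # error passed back to the hidden unit
    grad_w1 = grad_hidden * (1 - hidden ** 2) * x  # through tanh to w1
    return grad_w1, grad_w2


def train(w1, w2, x, target, learning_rate=0.5, steps=200):
    """Repeatedly step each weight against its gradient."""
    for _ in range(steps):
        g1, g2 = gradients(w1, w2, x, target)
        w1 -= learning_rate * g1
        w2 -= learning_rate * g2
    return w1, w2


# Assumed starting weights, input and target, chosen only for illustration.
w1, w2 = train(0.5, 0.5, x=1.0, target=0.5)
```

Each training step shrinks the mistake a little; real networks do the same thing across millions of weights at once.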
What’s next?
The game of Go is a convenient platform for developing and optimizing learning algorithms. But many real-world problems are messier and offer fewer opportunities to learn (e.g., self-driving cars).
Are there problems to which we can apply existing algorithms?
One example may be optimization in controlled industrial settings. Here the task is often to carry out a complex series of steps while satisfying multiple criteria and minimizing costs.
As long as the conditions can be modeled accurately, these algorithms will learn and gain experience faster and more efficiently than humans. We can only repeat DeepMind's words: we can't wait to see what happens next.
Game over: AlphaGo will move on to solving real-world problems
Ilya Hel