I have no doubt that the next few years will see neural networks turn their attention to yet more tasks, integrate themselves more deeply into industry, and continue to impress researchers with new superpowers. This is all well justified, and I have no intention to belittle the current and future impact of deep learning; however, the optimism about just what these models can achieve in terms of intelligence has been worryingly reminiscent of the 1960s.
Extrapolating from the last few years’ progress, it is enticing to believe that Deep Artificial General Intelligence is just around the corner, and that only a few more architectural tricks, bigger data sets and more computing power are required to take us there. I feel that there are a couple of solid reasons to be much more skeptical.
To begin with, it is a bad idea to intuit how broadly intelligent a machine must be, or have the capacity to be, based solely on a single task. The checkers-playing machines of the 1950s amazed researchers, and many considered them a huge leap towards human-level reasoning, yet we now appreciate that achieving human or superhuman performance in this game is far easier than achieving human-level general intelligence. In fact, even the best humans can easily be defeated by a search algorithm with simple heuristics. The development of such an algorithm probably does not advance the long-term goals of machine intelligence, despite the exciting intelligent-seeming behaviour it gives rise to, and the same could be said of much other work in artificial intelligence, such as the expert systems of the 1980s. Human or superhuman performance in one task is not necessarily a stepping-stone towards near-human performance across most tasks. ……
The many facets of human thought include planning towards novel goals, inferring others’ goals from their actions, learning structured theories to describe the rules of the world, inventing experiments to test those theories, and learning to recognise new object kinds from just one example. Very often they involve principled inference under uncertainty from few observations. For all the accomplishments of neural networks, it must be said that they have only ever proven their worth at tasks fundamentally different from those above. If they have succeeded in anything superficially similar, it has been because they saw many hundreds of times more examples than any human ever needed to.
Deep learning has brought us one branch higher up the tree towards machine intelligence, and a wealth of different fruit is now hanging within our grasp. While the ability to learn good features in high-dimensional spaces from weak priors with lots of data is both new and exciting, we should not fall into the trap of thinking that most of the problems an intelligent agent faces can be solved in this way. Gradient descent in neural networks may well play a big part in helping to build the components of thinking machines, but it is not, itself, the stuff of thought.
Marvin Minsky’s reflections seem pretty relevant to this post too, and he has seen a thing or two: http://www.technologyreview.com/video/543031/marvin-minsky-reflects-on-a-life-in-ai/