A Very Short History of Artificial Intelligence (AI)


1308                                  Catalan poet and theologian Ramon Llull publishes Ars generalis ultima (The Ultimate General Art), further perfecting his method of using paper-based mechanical means to create new knowledge from combinations of concepts.

The nine most fundamental principles of the Art. B: goodness, C: greatness, D: eternity/duration, E: power/authority, F: wisdom/instinct, G: will/appetite, H: virtue, I: truth, K: glory. From Ars Brevis

1666                                  Mathematician and philosopher Gottfried Leibniz publishes Dissertatio de arte combinatoria (On the Combinatorial Art), following Ramon Llull in proposing an alphabet of human thought and arguing that all ideas are nothing but combinations of a relatively small number of simple concepts.

1726                                  Jonathan Swift publishes Gulliver’s Travels, which includes a description of the Engine, a machine on the island of Laputa (and a parody of Ars Magna): “a Project for improving speculative Knowledge by practical and mechanical Operations.” By using this “Contrivance,” “the most ignorant Person at a reasonable Charge, and with a little bodily Labour, may write Books in Philosophy, Poetry, Politicks, Law, Mathematicks, and Theology, with the least Assistance from Genius or study.”

1763                                  Thomas Bayes develops a framework for reasoning about the probability of events. Bayesian inference will become a leading approach in machine learning.

1854                                  George Boole argues that logical reasoning could be performed systematically in the same manner as solving a system of equations.

1898                                  At an electrical exhibition in the recently completed Madison Square Garden, Nikola Tesla demonstrates the world’s first radio-controlled vessel. The boat was equipped with, as Tesla described, “a borrowed mind.”

1914                                  The Spanish engineer Leonardo Torres y Quevedo demonstrates the first chess-playing machine, capable of playing king and rook against king endgames without any human intervention.

1921                                  Czech writer Karel Čapek introduces the word “robot” in his play R.U.R. (Rossum’s Universal Robots). The word “robot” comes from the word “robota” (work).


1925                                  Houdina Radio Control releases a radio-controlled driverless car, travelling the streets of New York City.

1927                                  The science-fiction film Metropolis is released. It features a robot double of a peasant girl, Maria, which unleashes chaos in Berlin of 2026. It was the first robot depicted on film, inspiring the art deco look of C-3PO in Star Wars.

1929                                  Makoto Nishimura designs Gakutensoku, Japanese for “learning from the laws of nature,” the first robot built in Japan. It could change its facial expression and move its head and hands via an air pressure mechanism.

1943                                  Warren S. McCulloch and Walter Pitts publish “A Logical Calculus of the Ideas Immanent in Nervous Activity” in the Bulletin of Mathematical Biophysics. This influential paper, in which they discussed networks of idealized and simplified artificial “neurons” and how they might perform simple logical functions, will become the inspiration for computer-based “neural networks” (and later “deep learning”) and their popular description as “mimicking the brain.”

1949                                  Edmund Berkeley publishes Giant Brains: Or Machines That Think in which he writes: “Recently there have been a good deal of news about strange giant machines that can handle information with vast speed and skill….These machines are similar to what a brain would be if it were made of hardware and wire instead of flesh and nerves… A machine can handle information; it can calculate, conclude, and choose; it can perform reasonable operations with information. A machine, therefore, can think.”

1949                                  Donald Hebb publishes Organization of Behavior: A Neuropsychological Theory in which he proposes a theory about learning based on conjectures regarding neural networks and the ability of synapses to strengthen or weaken over time.

1950                                  Claude Shannon’s “Programming a Computer for Playing Chess” is the first published article on developing a chess-playing computer program.

1950                                  Alan Turing publishes “Computing Machinery and Intelligence” in which he proposes “the imitation game” which will later become known as the “Turing Test.”

1951                                  Marvin Minsky and Dean Edmunds build SNARC (Stochastic Neural Analog Reinforcement Calculator), the first artificial neural network, using 3000 vacuum tubes to simulate a network of 40 neurons.

1952                                  Arthur Samuel develops the first computer checkers-playing program and the first computer program to learn on its own.

August 31, 1955              The term “artificial intelligence” is coined in a proposal for a “2 month, 10 man study of artificial intelligence” submitted by John McCarthy (Dartmouth College), Marvin Minsky (Harvard University), Nathaniel Rochester (IBM), and Claude Shannon (Bell Telephone Laboratories). The workshop, which took place a year later, in July and August 1956, is generally considered the official birthdate of the new field.

December 1955               Herbert Simon and Allen Newell develop the Logic Theorist, the first artificial intelligence program, which eventually would prove 38 of the first 52 theorems in Whitehead and Russell’s Principia Mathematica.

1957                                  Frank Rosenblatt develops the Perceptron, an early artificial neural network enabling pattern recognition based on a two-layer computer learning network. The New York Times reported the perceptron to be “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” The New Yorker called it a “remarkable machine… capable of what amounts to thought.”

1958                                  John McCarthy develops the programming language Lisp, which becomes the most popular programming language used in artificial intelligence research.

1959                                  Arthur Samuel coins the term “machine learning,” reporting on programming a computer “so that it will learn to play a better game of checkers than can be played by the person who wrote the program.”

1959                                  Oliver Selfridge publishes “Pandemonium: A paradigm for learning” in the Proceedings of the Symposium on Mechanization of Thought Processes, in which he describes a model for a process by which computers could recognize patterns that have not been specified in advance.

1959                                  John McCarthy publishes “Programs with Common Sense” in the Proceedings of the Symposium on Mechanization of Thought Processes, in which he describes the Advice Taker, a program for solving problems by manipulating sentences in formal languages with the ultimate objective of making programs “that learn from their experience as effectively as humans do.”

1961                                  The first industrial robot, Unimate, starts working on an assembly line in a General Motors plant in New Jersey.

1961                                  James Slagle develops SAINT (Symbolic Automatic INTegrator), a heuristic program that solved symbolic integration problems in freshman calculus.

1964                                  Daniel Bobrow completes his MIT PhD dissertation titled “Natural Language Input for a Computer Problem Solving System” and develops STUDENT, a natural language understanding computer program.

1965                                  Herbert Simon predicts that “machines will be capable, within twenty years, of doing any work a man can do.”

1965                                  Hubert Dreyfus publishes Alchemy and AI, arguing that the mind is not like a computer and that there were limits beyond which AI would not progress.

1965                                  I.J. Good writes in “Speculations Concerning the First Ultraintelligent Machine” that “the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”

1965                                  Joseph Weizenbaum develops ELIZA, an interactive program that carries on a dialogue in English on any topic. Weizenbaum, who wanted to demonstrate the superficiality of communication between man and machine, was surprised by the number of people who attributed human-like feelings to the computer program.

1965                                  Edward Feigenbaum, Bruce G. Buchanan, Joshua Lederberg, and Carl Djerassi start working on DENDRAL at Stanford University. The first expert system, it automated the decision-making process and problem-solving behavior of organic chemists, with the general aim of studying hypothesis formation and constructing models of empirical induction in science.

1966                                  Shakey the robot is the first general-purpose mobile robot to be able to reason about its own actions. In a 1970 Life magazine article about this “first electronic person,” Marvin Minsky is quoted saying with “certitude”: “In from three to eight years we will have a machine with the general intelligence of an average human being.”

1968                                  The film 2001: A Space Odyssey is released, featuring HAL, a sentient computer.

1968                                  Terry Winograd develops SHRDLU, an early natural language understanding computer program.

1969                                  Arthur Bryson and Yu-Chi Ho describe backpropagation as a multi-stage dynamic system optimization method. A learning algorithm for multi-layer artificial neural networks, it has contributed significantly to the success of deep learning in the 2000s and 2010s, once computing power has sufficiently advanced to accommodate the training of large networks.

1969                                  Marvin Minsky and Seymour Papert publish Perceptrons: An Introduction to Computational Geometry, highlighting the limitations of simple neural networks.  In an expanded edition published in 1988, they responded to claims that their 1969 conclusions significantly reduced funding for neural network research: “Our version is that progress had already come to a virtual halt because of the lack of adequate basic theories… by the mid-1960s there had been a great many experiments with perceptrons, but no one had been able to explain why they were able to recognize certain kinds of patterns and not others.”

1970                                  The first anthropomorphic robot, the WABOT-1, is built at Waseda University in Japan. It consisted of a limb-control system, a vision system and a conversation system.

1972                                  MYCIN, an early expert system that used artificial intelligence to identify bacteria causing severe infections and to recommend antibiotics, is developed at Stanford University.

1973                                  James Lighthill reports to the British Science Research Council on the state of artificial intelligence research, concluding that “in no part of the field have discoveries made so far produced the major impact that was then promised,” leading to drastically reduced government support for AI research.

1976                                  Computer scientist Raj Reddy publishes “Speech Recognition by Machine: A Review” in the Proceedings of the IEEE, summarizing the early work on Natural Language Processing (NLP).

1978                                  The XCON (eXpert CONfigurer) program, a rule-based expert system assisting in the ordering of DEC’s VAX computers by automatically selecting the components based on the customer’s requirements, is developed at Carnegie Mellon University.

1979                                  The Stanford Cart successfully crosses a chair-filled room without human intervention in about five hours, becoming one of the earliest examples of an autonomous vehicle.

1980                                  Wabot-2 is built at Waseda University in Japan, a musician humanoid robot able to communicate with a person, read a musical score and play tunes of average difficulty on an electronic organ.

1981                                  The Japanese Ministry of International Trade and Industry budgets $850 million for the Fifth Generation Computer project. The project aimed to develop computers that could carry on conversations, translate languages, interpret pictures, and reason like human beings.

1984                                  Electric Dreams is released, a film about a love triangle between a man, a woman and a personal computer.

1984                                  At the annual meeting of AAAI, Roger Schank and Marvin Minsky warn of the coming “AI Winter,” predicting an imminent bursting of the AI bubble (which did happen three years later), similar to the reduction in AI investment and research funding in the mid-1970s.

1986                                  First driverless car, a Mercedes-Benz van equipped with cameras and sensors, built at Bundeswehr University in Munich under the direction of Ernst Dickmanns, drives up to 55 mph on empty streets.

October 1986                   David Rumelhart, Geoffrey Hinton, and Ronald Williams publish “Learning representations by back-propagating errors,” in which they describe “a new learning procedure, back-propagation, for networks of neurone-like units.”

1987                                  The video Knowledge Navigator, accompanying Apple CEO John Sculley’s keynote speech at Educom, envisions a future in which “knowledge applications would be accessed by smart agents working over networks connected to massive amounts of digitized information.”


1988                                  Judea Pearl publishes Probabilistic Reasoning in Intelligent Systems. His 2011 Turing Award citation reads: “Judea Pearl created the representational and computational foundation for the processing of information under uncertainty. He is credited with the invention of Bayesian networks, a mathematical formalism for defining complex probability models, as well as the principal algorithms used for inference in these models. This work not only revolutionized the field of artificial intelligence but also became an important tool for many other branches of engineering and the natural sciences.”

1988                                  Rollo Carpenter develops the chat-bot Jabberwacky to “simulate natural human chat in an interesting, entertaining and humorous manner.” It is an early attempt at creating artificial intelligence through human interaction.

1988                                  Members of the IBM T.J. Watson Research Center publish “A statistical approach to language translation,” heralding the shift from rule-based to probabilistic methods of machine translation, and reflecting a broader shift to “machine learning” based on statistical analysis of known examples, not comprehension and “understanding” of the task at hand (IBM’s project Candide, successfully translating between English and French, was based on 2.2 million pairs of sentences, mostly from the bilingual proceedings of the Canadian parliament).

1988                                  Marvin Minsky and Seymour Papert publish an expanded edition of their 1969 book Perceptrons. In “Prologue: A View from 1988” they wrote: “One reason why progress has been so slow in this field is that researchers unfamiliar with its history have continued to make many of the same mistakes that others have made before them.”

1989                                  Yann LeCun and other researchers at AT&T Bell Labs successfully apply a backpropagation algorithm to a multi-layer neural network, recognizing handwritten ZIP codes. Given the hardware limitations at the time, it took about 3 days (still a significant improvement over earlier efforts) to train the network.

1990                                  Rodney Brooks publishes “Elephants Don’t Play Chess,” proposing a new approach to AI—building intelligent systems, specifically robots, from the ground up and on the basis of ongoing physical interaction with the environment: “The world is its own best model… The trick is to sense it appropriately and often enough.”

1993                                  Vernor Vinge publishes “The Coming Technological Singularity,” in which he predicts that “within thirty years, we will have the technological means to create superhuman intelligence. Shortly after, the human era will be ended.”

1995                                  Richard Wallace develops the chatbot A.L.I.C.E (Artificial Linguistic Internet Computer Entity), inspired by Joseph Weizenbaum’s ELIZA program, but with the addition of natural language sample data collection on an unprecedented scale, enabled by the advent of the Web.

1997                                  Sepp Hochreiter and Jürgen Schmidhuber propose Long Short-Term Memory (LSTM), a type of recurrent neural network used today in handwriting recognition and speech recognition.

1997                                  Deep Blue becomes the first computer chess-playing program to beat a reigning world chess champion.

1998                                  Dave Hampton and Caleb Chung create Furby, the first domestic or pet robot.

1998                                  Yann LeCun, Yoshua Bengio and others publish papers on the application of neural networks to handwriting recognition and on optimizing backpropagation.

2000                                  MIT’s Cynthia Breazeal develops Kismet, a robot that could recognize and simulate emotions.

2000                                  Honda’s ASIMO robot, an artificially intelligent humanoid robot, is able to walk as fast as a human, delivering trays to customers in a restaurant setting.

2001                                  A.I. Artificial Intelligence is released, a Steven Spielberg film about David, a childlike android uniquely programmed with the ability to love.

2004                                  The first DARPA Grand Challenge, a prize competition for autonomous vehicles, is held in the Mojave Desert. None of the autonomous vehicles finished the 150-mile route.

2006                                  Oren Etzioni, Michele Banko, and Michael Cafarella coin the term “machine reading” in the paper “Machine Reading,” defining it as an inherently unsupervised “autonomous understanding of text.”

2006                                  Geoffrey Hinton publishes “Learning Multiple Layers of Representation,” summarizing the ideas that have led to “multilayer neural networks that contain top-down connections and training them to generate sensory data rather than to classify it,” i.e., the new approaches to deep learning.

2007                                  Fei-Fei Li and colleagues at Princeton University start to assemble ImageNet, a large database of annotated images designed to aid in visual object recognition software research.

2009                                  Rajat Raina, Anand Madhavan and Andrew Ng publish “Large-scale Deep Unsupervised Learning using Graphics Processors,” arguing that “modern graphics processors far surpass the computational capabilities of multicore CPUs, and have the potential to revolutionize the applicability of deep unsupervised learning methods.”

2009                                  Google starts developing, in secret, a driverless car. In 2012, it became the first to pass, in Nevada, a U.S. state self-driving test.

2009                                  Computer scientists at the Intelligent Information Laboratory at Northwestern University develop Stats Monkey, a program that writes sport news stories without human intervention.

2010                                  Launch of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual AI object recognition competition.

2011                                  A convolutional neural network wins the German Traffic Sign Recognition competition with 99.46% accuracy (vs. humans at 99.22%).

2011                                  Watson, a natural language question answering computer, competes on Jeopardy! and defeats two former champions.


2011                                  Researchers at the IDSIA in Switzerland report a 0.27% error rate in handwriting recognition using convolutional neural networks, a significant improvement over the 0.35%-0.40% error rate in previous years.

June 2012                         Jeff Dean and Andrew Ng report on an experiment in which they showed a very large neural network 10 million unlabeled images randomly taken from YouTube videos, and “to our amusement, one of our artificial neurons learned to respond strongly to pictures of… cats.”

October 2012                   A convolutional neural network designed by researchers at the University of Toronto achieves an error rate of only 16% in the ImageNet Large Scale Visual Recognition Challenge, a significant improvement over the 25% error rate achieved by the best entry the year before.

March 2016                      Google DeepMind’s AlphaGo defeats Go champion Lee Sedol.

The Web (especially Wikipedia) is a great source for the history of artificial intelligence. Other key sources include Nils Nilsson, The Quest for Artificial Intelligence: A History of Ideas and Achievements; Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach;  Pedro Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World; and Artificial Intelligence and Life in 2030.

Originally published on Forbes.com


76% of Business Decision-Makers Believe AI is Fundamental to Future Success



Hans Rosling, Edutainer

In this spectacular section of ‘The Joy of Stats’ Rosling tells the story of the world in 200 countries over 200 years using 120,000 numbers – in just four minutes. Plotting life expectancy against income for every country since 1810, Hans shows how the world we live in is radically different from the world most of us imagine.

The Guardian:

Swedish academic, whose gift for making data sing brought his innovative ideas to a worldwide audience, dies after year-long illness…

A professor of international health at Sweden’s Karolinska Institute, Rosling liked to call himself an “edutainer”. A talented presenter, whose signature animated data visualisations have featured in dozens of film clips, the statistician used humour and often unlikely objects such as children’s toys, cardboard boxes and teacups to liven up data on wealth, inequality and population.

Rosling’s work featured in a BBC4 documentary on The Joy of Stats, and he presented Don’t Panic – the Truth about Population on BBC2. He was also involved in founding the Swedish chapter of Médecins Sans Frontières, according to Swedish media reports. When the Ebola outbreak led to states of emergency being declared in Liberia and Sierra Leone in 2014, Rosling went out to Monrovia to work with the Liberian government on their emergency response, tracking cases and pinpointing missing data.

Time magazine included him in its 2012 list of the world’s 100 most influential people, saying his “stunning renderings of the numbers … have moved millions of people worldwide to see themselves and our planet in new ways”.

But in an interview in the Guardian, in 2013, he was dismissive about his impact on knowledge. Asked what had surprised him the most about the reaction he had received, he said: “It’s that I became so famous with so little impact on knowledge. Fame is easy to acquire, impact is much more difficult. When we asked the Swedish population how many children are born per woman in Bangladesh, they still think it’s four to five. I have no impact on knowledge. I have only had impact on fame, and doing funny things, and so on.”


AI and Data-Driven Marketing Technology


Scott Brinker:

…you can see from this chart that master-martech-analyst David Raab shared at the last MarTech conference in San Francisco that AI (machine intelligence) is blossoming across the whole industry with vendors of all sizes (and this is only a partial list today)…

…because so many core AI algorithms are essentially commoditized by these open market options, they cease to be a source of competitive advantage by themselves. Instead, strategic advantage with plug-and-play AI is achieved by other means, particularly these two:

  1. Data. The specific data you feed these algorithms makes all the difference. The strategic battles with AI will be won by the scale, quality, relevance, and uniqueness of your data. Data quality will become ever more important — as will services and software to support that mission. Markets for accurate and timely 2nd-party and 3rd-party data will thrive, available on-demand via APIs. AI finally puts big data to good use.

  2. User Interface (UI). AI can be used to create significantly better user experiences with your digital products and services, from predictive features that anticipate what a user will want in a particular context to natural language interfaces — text and voice-based chatbots — that can bypass arcane menu-driven interfaces. The opportunity for AI-UIs to make previously complicated tasks fast and easy is enormous — especially in business applications where we can often state what we want to know or do much more easily than we can figure out how to manually get the $#&!% software to do it for us. (Think of all the multi-month certification courses for enterprise software that have been a barrier to “regular” folks unlocking value from those systems.)



As marketing technology becomes a standard part of managing a business, many companies have become comfortable enough with the concept to shift away from their initial focus on platforms, and to focus instead on data.

When marketers are surveyed about what they hope to gain from adopting marketing tech, “data” is one common response. This might seem like circular logic, but data—specifically, customer data—is the undisputed center of the marketing tech ecosystem.


Category Kings and Network Effects in Consumer-Tech Markets


Battery Ventures:

“Category kings”, defined as market-share leaders in particular business sectors, often wind up creating the majority of the market value relative to their competition. This advantage is particularly pronounced in technology: According to some research, over 70% of the value created in technology markets is actually generated by the category’s king (think Amazon.com in retail, Facebook in social media, etc.)

In fact, research we recently conducted around this topic showed that five-sixths of the market value generated by these leading tech players comes from businesses driven by “network effects”, the phenomenon of a product or service becoming more valuable as more people use it…

Transactional Marketplaces – Destinations where sellers and consumers meet to buy products or services. Examples here are travel site Priceline, food-delivery service GrubHub, and Chinese e-commerce site Alibaba.

Ad-Driven Marketplaces – These companies—like Zillow, Yelp and TripAdvisor—let consumers use the service for free, but try to get sellers (real estate agents, dry cleaners, hotels) to buy ads to fund the content.

Social Networks – Prominent companies in this category include Facebook, Snapchat and WhatsApp.


Digital Transformation Surprisingly Focused on Operations


CIO Journal:

40.8% of CIOs responding to a recent IDC survey said that the focus of their digital initiatives is “improving operations” as opposed to 35.5% who cited “new products” and 34.2% who mentioned “new markets.”





6 Roads to Prediction: Machine Learning Algorithms (Infographic)


Source: Dataiku
