Oren Etzioni on Building Intelligent Machines

“There are more things in AI than classification… the entire paradigm of classification, which has fueled machine learning and data mining, is very limited… What we need is a process that is structured, multi-layered, knowledge-intensive, much more like kids playing Lego, instead of a karate chop that divides things into categories… Current knowledge bases are fact-rich but knowledge poor…’You can’t play 20 questions with Nature and win’ (Allen Newell, 1973)… What we need is knowledge, reasoning, and explanation.”

Slides (from KDD Keynote, scroll all the way down the page) are here 

From GigaOm:

Oren Etzioni, executive director of the Allen Institute of Artificial Intelligence (formerly founder of Farecast and Decide.com), takes a contrarian view of all the deep learning hype. Essentially, he argues, while systems that are better than ever at classifying images or words are great, they’re still not “intelligent.” He describes work underway to build systems that can truly understand content, including one capable of passing fourth-grade short-answer exams.

Etzioni on reddit:

“I think that people are often confusing computer autonomy with computer intelligence. Computer viruses are autonomous, dangerous, but not particularly intelligent. Chess playing programs are intelligent (in a sense) but very limited. They don’t even play checkers!”

“I love star trek and particularly the star trek computer because AI is used there as a tool to help and inform Captain Kirk and the crew. That’s a much better model than fear mongering in movies like HER and Transcendence. AI can be used to help us and enhance our abilities. For example, we are all inundated with huge amounts of text, articles, technical papers, and nobody can keep up! How about if your doctor had a tool that would help him or her to figure out the latest studies and procedures relevant to your condition? Even better—what if you had a tool to help figure out what’s going on that’s much better [than] google or webmd.”

[Why do you enjoy working on AI? What first motivated you to get into the field?] “It is one of the most fundamental intellectual problems and it’s really, really hard. I find computers so rigid, so stupid that it’s infuriating. My goal is to fight “artificial stupidity” and to build AI programs that help scientists, doctors, and regular folks make sense of the world and the tsunami of information that we all face every day.”

“The Turing test is about tricking someone to believe that a computer is human. At AI2 we are working on programs that will try to pass tests in science & math which requires the program to understand the questions (hard) utilize background knowledge (even harder) and accumulate that knowledge automatically (great big challenge).”

[Would the IBM computer that played Jeopardy be called intelligent by your metrics?] “Watson was an impressive demonstration but it was narrowly targeted at Jeopardy and exhibited very little semantic understanding. Now Watson has become an IBM brand for any knowledge based activity they do. The intelligence is largely in their PR department.”



Posted in AI, Intelligent Machines, Machine Learning, Recommendations

The New Apple Wristop Computer: Not Designed for the Internet of Things

MIT Media Lab cofounder Nicholas Negroponte observed at a recent TED event that “I look today at some of the work being done around the Internet of Things and it’s kind of tragically pathetic.”

The “tragically pathetic” label has been especially fitting for wearables, considered the hottest segment of the Internet of Things.  Lauren Goode at Re/Code wrote back in March: “Let me guess: Your activity-tracking wristband is sitting on your dresser or in a drawer somewhere right now, while it seems that every day there’s a news report out about an upcoming wearable product that’s going to be better, cooler, smarter.”

All of this was going to change when Apple finally entered the category with its smart watch. Many observers hoped that Apple’s design principles, obsession with simplicity, and track record of delighting users with easy-to-use products would finally give the world a useful and fun wearable.

Instead, we got a good-looking wrist-top computer. Not a simple, intuitive, and focused device but a generic, complex product with too many functions and options. Kevin McCullagh wrote in fastcodesign.com: “I can’t help but think Steve Jobs would have stopped the kitchen sink being thrown in like this. Do we really need photos and maps on a stamp-sized screen, when our phones are rarely out of reach? For all the claims of a ‘thousand no’s for every yes,’ the post-Jobs era is shaping up to be defined by less ruthless focus.” Back in June, Adam Lashinsky already made this general observation about the potential loss of the famed product development discipline: “Apple, once the epitome of simplicity, is becoming the unlikely poster child for complexity.”

“Complexity,” however, does not tell the whole story. By introducing a watch that is basically a computer on your wrist, Apple missed an opportunity not just to reorient the wearables market to something much better than “tragically pathetic,” but also to define the design and usability principles for the Internet of Things.

In his TED talk, Negroponte highlighted what he called “not a particularly enlightened view of the Internet of Things.” This is the tendency to move the intelligence (or functionality of many devices) into the cell phone (or the wearable), instead of building the intelligence into the “thing,” whatever the thing is – the oven, the refrigerator, the road, the walls, all the physical things around us. More generally, it is the tendency to continue evolving the current computer paradigm—from the mainframe to the laptop to the wristop computer—instead of developing a completely new Internet of Things paradigm.

The new paradigm should embrace and evolve the principles of what was once called “ubiquitous computing.” The history of that vision over the last two decades may help illuminate where the Internet of Things is today and where it may or may not go.

In 1991, Mark Weiser, then head of the Computer Science Lab at Xerox PARC, published an article in Scientific American titled “The Computer for the 21st Century.” The article opens with what should be the rallying cry for the Internet of Things today: “The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.”

Weiser went on to explain what was wrong with the personal computing revolution brought on by Apple and others: “The arcane aura that surrounds personal computers is not just a ‘user interface’ problem. My colleagues and I at the Xerox Palo Alto Research Center think that the idea of a ‘personal’ computer itself is misplaced and that the vision of laptop machines, dynabooks and ‘knowledge navigators’ is only a transitional step toward achieving the real potential of information technology. Such machines cannot truly make computing an integral, invisible part of people’s lives.”

Weiser understood that, conceptually, the PC was simply a mainframe on a desk, albeit with easier-to-use applications.  He misjudged, however, the powerful and long-lasting impact that this new productivity and life-enhancing tool would exert on millions of users worldwide. Weiser wrote: “My colleagues and I at PARC believe that what we call ubiquitous computing will gradually emerge as the dominant mode of computer access over the next 20 years. … [B]y making everything faster and easier to do, with less strain and fewer mental gymnastics, it will transform what is apparently possible. … [M]achines that fit the human environment instead of forcing humans to enter theirs will make using a computer as refreshing as taking a walk in the woods.”

Ubiquitous computing has not become the “dominant mode of computer access” mostly because of Steve Jobs’ Apple. It successfully invented variations on the theme of the Internet of Computers: The iPod, the iPhone, the iPad. All of them beautifully designed, easy-to-use, and useful. All of them cementing and enlarging the dominance of the Internet of Computers paradigm. Now Apple has extended the paradigm by inventing a wristop computer. That the Apple Watch is more complex and less focused than Apple’s previous successful inventions matters less than the fact that it continues in their well-trodden path.

While the dominant paradigm has been reinforced and expanded by the successful innovations of Apple and others, the vision of ubiquitous computing has not died. Today, when we are adding intelligence to things at an accelerating rate, it is more important than ever. Earlier this year, I asked Bob Metcalfe what is required to make us happy with our Internet of Things experience. “Not so much good UX, but no UX at all,” he said. “The IoT should disappear into the woodwork, even faster than Ethernet has.” Metcalfe invented Ethernet at Xerox PARC, the same lab where Weiser and others later worked on making computers disappear.

Besides ubiquity, there are at least two other dimensions to the new paradigm of the Internet of Things. One is seamless connectivity. In response to the same question, Google’s Hal Varian told me, “I think that the big challenge now is interoperability. Given the fact that there will be an explosion of new devices, it is important that they talk to each other. For example, I want my smoke alarm to talk to my bedroom lights, and my garden moisture detector to talk to my lawn sprinkler.” No more islands of computing, a hallmark of the Internet of (isolated) Computers.
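To make Varian's example concrete, here is a minimal sketch of devices "talking to each other" through a shared message bus. Everything in it is an assumption made for illustration: the `MessageBus`, `BedroomLights`, and `LawnSprinkler` classes, the topic names, and the 0.3 moisture threshold are hypothetical, and a real deployment would rely on a networked protocol rather than an in-process Python object.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessageBus:
    """Toy in-process publish/subscribe bus (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a callback for every message published on a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver a message to each subscriber; return how many received it."""
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


class BedroomLights:
    """Switches on when the smoke alarm reports smoke."""

    def __init__(self, bus: MessageBus) -> None:
        self.on = False
        bus.subscribe("smoke_alarm", self._on_alarm)

    def _on_alarm(self, message: dict) -> None:
        if message.get("smoke_detected"):
            self.on = True


class LawnSprinkler:
    """Runs whenever the reported soil moisture is below a threshold."""

    def __init__(self, bus: MessageBus, dry_below: float = 0.3) -> None:
        self.running = False
        self.dry_below = dry_below  # assumed threshold, fraction of saturation
        bus.subscribe("garden_moisture", self._on_reading)

    def _on_reading(self, message: dict) -> None:
        self.running = message["moisture"] < self.dry_below


bus = MessageBus()
lights = BedroomLights(bus)
sprinkler = LawnSprinkler(bus)

# The sensors only publish; they know nothing about who is listening.
bus.publish("smoke_alarm", {"smoke_detected": True})
bus.publish("garden_moisture", {"moisture": 0.12})
print(lights.on, sprinkler.running)
```

The point of the design is that the smoke alarm and the moisture detector never reference the lights or the sprinkler directly. New devices can join by subscribing to a topic, which is one way to avoid the islands of computing described above.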

Another important dimension of the new paradigm is useful data. Not big or small, nor irrelevant or trapped in a silo, just useful. The value of the “things” in the Internet of Things paradigm is measured by how well the data they collect is analyzed and how quickly useful feedback based on this analysis is delivered to the user.

Disappearing into the woodwork. All things talking to all things. Useful data. It may not be Apple, but the company or companies that master these will usher in the new era of the Internet of Things, where we finally get over our mainframe/PC/wristop computer habit.

[Originally published on Forbes.com]

Posted in Apple, Internet of Things

U.S. Government’s Data Explosion (Infographic)

Source: Scality



Posted in Data growth, Federal Government

Big Data and Intuition: Status Update

Two new reports on big data and big decisions were released last week by Accenture and PwC. Both reports shed new light on the impact of big data on enterprises today, and how it is changing the process of decision making by senior executives.

An Accenture worldwide survey of senior technology and business executives has found that companies that have completed at least one big data project are happy with the results. 60% of executives said their companies have successfully completed a big data implementation, 36% haven’t pursued a big data project yet and 4% were currently pursuing but hadn’t finished their first big data project. 92% of executives from companies that have completed a big data implementation are satisfied with the results and 89% rated big data as “very important” or “extremely important” to their businesses’ digital transformation.

Other highlights include:

  • Executives see tangible business outcomes from big data in finding new sources of revenue (56%), enhancing the customer experience (51%), new product and service development (50%), and winning and keeping customers (47%).
  • Challenges in implementing big data are security (51%); budget (47%); lack of big data implementation talent (41%); lack of talent for big data and analytics on an ongoing basis (37%); and integration with existing systems (35%).

In welcome news to the graduates of the numerous data science and business analytics programs, nearly all (91%) companies expect to increase their data science expertise, the majority within the next year. 54% of executives said their companies have already developed internal technical training opportunities for their employees and most organizations also tap outside expertise. Only 5% of respondents said their company used only internal resources for their big data implementations.

The report also reveals that “many companies have different definitions of big data.” Indeed, in answer to the question “Which of the following do you consider part of big data?” responses varied from “Large data files” (65%), to “Advanced analytics or analysis” (60%) to “Data from visualization tools” (50%).

Regardless of how they define big data, 89% believe big data will revolutionize business operations in the same way the Internet did. 85% feel big data will dramatically change the way they do business and 79% agree that “companies that do not embrace big data will lose their competitive position and may even face extinction.” More than anything, in my opinion, these answers reflect the persuasive powers of a buzzword, inherent in its dual, attention-getting role: The threat of “disruption” and the promise of a “revolution” (two words used to great effect by the report).

The PwC report, titled “Gut & Gigabytes: Capitalising on the art & science in decision making,” is based on an Economist Intelligence Unit (EIU) worldwide survey of 1,135 senior executives. It clearly defines big data as “the recent wave of electronic information produced in greater volume by a growing number of sources (i.e. not just data collected by a particular organisation in the course of normal business).”

Intuition has got a bad rap in the age of big data and the most interesting finding of this report is that 30% of executives admit that intuition is what they most relied on when they made their last big decision. An additional 28% relied on other people’s intuition (“advice or experience of others internally”). Only 30% said that “data and analysis (internal or external)” is what they relied on for their last big decision and another 9% relied on “financial indicators.”

The reliance on intuition and experience is based on… experience. 46% of executives said that relying on data analysis has been detrimental to their business in the past. They are concerned about quality, accuracy and completeness of data and find it difficult to access useful data.

Still, 64% of the executives surveyed said that big data has changed decision-making in their organizations and 25% expect it will do so over the next two years. And 49% of executives agree that data analysis is undermining the credibility of intuition or experience, compared with 21% who disagree. Says the report: “In reality, however, experience and intuition, and data and analysis, are not mutually exclusive. The challenge for business is how best to marry the two. A ‘gut instinct’ nowadays is likely to be based on increasingly large amounts of data, while even the largest data set cannot be relied upon to make an effective big decision without human involvement.”

[Originally published on Forbes.com]

Posted in Big Data Analytics, Intuition, Stats, Surveys

How Data Centers Work? (Infographic)

A Look Inside U.S. Data Centers - Via Who Is Hosting This: The Blog

Source: WhoIsHostingThis.com

Posted in Data growth, Internet, IT

What’s the Big Data? 12 Definitions

Last week I got an email from UC Berkeley’s Master of Information and Data Science program, asking me to respond to a survey of data science thought leaders on the question “What is big data?” I was especially delighted to be regarded as a “thought leader” by Berkeley’s School of Information, whose previous dean, Hal Varian (now chief economist at Google), answered my challenge fourteen years ago and produced the first study to estimate the amount of new information created in the world annually, a study I consider to be a major milestone in the evolution of our understanding of big data.

The Berkeley researchers estimated that the world had produced about 1.5 billion gigabytes of information in 1999, and a 2003 replication of the study found that the amount had doubled in three years. Data was already getting bigger and bigger, and around that time, in 2001, industry analyst Doug Laney described the “3Vs” (volume, variety, and velocity) as the key “data management challenges” for enterprises, the same “3Vs” that have been used in the last four years by just about anyone attempting to define or describe big data.
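As a rough back-of-the-envelope check, a doubling over three years can be turned into an implied yearly growth rate. The sketch below assumes steady compound growth, which is my simplification and not a claim made by the Berkeley study; the function name `implied_annual_growth` is hypothetical.

```python
def implied_annual_growth(multiple: float, years: float) -> float:
    """Constant yearly growth rate that produces `multiple` after `years`."""
    return multiple ** (1.0 / years) - 1.0


# Figures from the text: about 1.5 billion gigabytes in 1999, doubled in 3 years.
base_billion_gb = 1.5
rate = implied_annual_growth(multiple=2.0, years=3.0)

print(f"Implied annual growth: {rate:.1%}")  # prints "Implied annual growth: 26.0%"

# Compounding that rate for three years recovers the doubled amount.
after_three_years = base_billion_gb * (1.0 + rate) ** 3
print(f"After 3 years: {after_three_years:.1f} billion GB")  # prints "After 3 years: 3.0 billion GB"
```

In other words, if the growth were spread evenly, the amount of new information would have been rising by roughly a quarter each year.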

The first documented use of the term “big data” appeared in a 1997 paper by scientists at NASA, describing the problem they had with visualization (i.e. computer graphics) which “provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data. When data sets do not fit in main memory (in core), or when they do not fit even on local disk, the most common solution is to acquire more resources.”

In 2008, a number of prominent American computer scientists popularized the term, predicting that “big-data computing” will “transform the activities of companies, scientific researchers, medical practitioners, and our nation’s defense and intelligence operations.” The term “big-data computing,” however, is never defined in the paper.

The traditional database of authoritative definitions is, of course, the Oxford English Dictionary (OED). Here’s how the OED defines big data: (definition #1) “data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges.”

But this is 2014 and maybe the first place to look for definitions should be Wikipedia. Indeed, it looks like the OED followed its lead. Wikipedia defines big data (and it did it before the OED) as (#2) “an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications.”

While a variation of this definition is what is used by most commentators on big data, its similarity to the 1997 definition by the NASA researchers reveals its weakness. “Large” and “traditional” are relative and ambiguous (and potentially self-serving for IT vendors selling either “more resources” of the “traditional” variety or new, non-“traditional” technologies).

The widely-quoted 2011 big data study by McKinsey highlighted that definitional challenge. Defining big data as (#3) “datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze,” the McKinsey researchers acknowledged that “this definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data.” As a result, all the quantitative insights of the study, including the updating of the UC Berkeley numbers by estimating how much new data is stored by enterprises and consumers annually, relate to digital data, rather than just big data, e.g., no attempt was made to estimate how much of the data (or “datasets”) enterprises store is big data.

Another prominent source on big data is Viktor Mayer-Schönberger and Kenneth Cukier’s book on the subject. Noting that “there is no rigorous definition of big data,” they offer one that points to what can be done with the data and why its size matters:

(#4) “The ability of society to harness information in novel ways to produce useful insights or goods and services of significant value” and “…things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value.”

In Big Data@Work, Tom Davenport concludes that because of “the problems with the definition” of big data, “I (and other experts I have consulted) predict a relatively short life span for this unfortunate term.” Still, Davenport offers this definition:

(#5) “The broad range of new and massive data types that have appeared over the last decade or so.”

Let me offer a few other possible definitions:

(#6) The new tools helping us find relevant data and analyze its implications.

(#7) The convergence of enterprise and consumer IT.

(#8) The shift (for enterprises) from processing internal data to mining external data.

(#9) The shift (for individuals) from consuming data to creating data.

(#10) The merger of Madame Olympe Maxime and Lieutenant Commander Data.

(#11) The belief that the more data you have the more insights and answers will rise automatically from the pool of ones and zeros.

(#12) A new attitude by businesses, non-profits, government agencies, and individuals that combining data from multiple sources could lead to better decisions.

I like the last two. #11 is a warning against blindly collecting more data for the sake of collecting more data (see NSA). #12 is an acknowledgment that storing data in “data silos” has been the key obstacle to getting the data to work for us, to improve our work and lives. It’s all about attitude, not technologies or quantities.

What’s your definition of big data?

See here for the compilation of Big data definitions from 40+ thought leaders.

[Originally published on Forbes.com]

Posted in Big Data Analytics, Big Data History

The nature of data (Infographic)


Posted in Data growth, Infographics