The End of Big Data and the Beginning of Big Data AI
The Hadoop Bubble Quivers As Hortonworks Misses
Last month, Hortonworks announced quarterly results for the first time as a public company, and they came in below expectations. It had revenues of $12.7 million (up 55% year-over-year), but the average Wall Street estimate was $13.42 million. Similarly, Wall Street expected a loss of $2.04 per share and Hortonworks reported a loss of $2.19 per share.
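To put the size of the miss in perspective, here is a minimal back-of-the-envelope sketch using only the figures quoted above. The variable and function names are mine, and the prior-year revenue is merely what the 55% growth figure implies, not a number Hortonworks reported here.

```python
# Figures quoted in the paragraph above (revenue in millions of USD).
reported_revenue = 12.7
estimated_revenue = 13.42
reported_loss_per_share = 2.19
expected_loss_per_share = 2.04
yoy_growth = 0.55  # "up 55% year-over-year"


def shortfall_pct(actual, estimate):
    """Percentage by which `actual` falls short of `estimate`."""
    return (estimate - actual) / estimate * 100


revenue_miss_pct = shortfall_pct(reported_revenue, estimated_revenue)
extra_loss_per_share = reported_loss_per_share - expected_loss_per_share

# Assumption: the 55% growth applies to the same quarter one year earlier.
implied_prior_year_revenue = reported_revenue / (1 + yoy_growth)

print(f"Revenue shortfall vs. consensus: {revenue_miss_pct:.1f}%")
print(f"Loss per share beyond consensus: ${extra_loss_per_share:.2f}")
print(f"Implied year-ago revenue: ${implied_prior_year_revenue:.1f} million")
```

The revenue shortfall works out to a little over 5% of the consensus estimate, a modest miss in absolute terms for a company of this size.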
The results could be attributed to a company new to the game of providing guidance to Wall Street. But the company’s management had substantial experience in that department throughout their impressive careers, so we must look elsewhere for an explanation. What if November 10, 2014, the day Hortonworks filed the paperwork for its IPO, was the beginning of the end of the Hadoop bubble, to quote your humble correspondent? What if on December 12, 2014, the day Hortonworks went public, surprising many by its swift action, the bubble “began to quiver and shake preparatory to its bursting”? What if Hortonworks had decided to rush to the exit while expectations were high?
People who had over-inflated expectations—and may have grumbled yesterday “what were we thinking”—should have listened to Mike Stonebraker last August. Here’s what this foremost authority on databases (and serial entrepreneur) said about the new generation of Hadoop from Hortonworks competitor Cloudera:
Impala is architected exactly like all of the shared-nothing parallel SQL DBMSs, serving the data warehouse market. Specifically, notice clearly that the MapReduce layer has been removed, and for good reason. As some of us have been pointing out for years, MapReduce is not a useful internal interface inside a SQL (or Hive) DBMS. Impala was architected by savvy DBMS developers, who know the above pragma. In fact, development activity similar to Impala is being done by both HortonWorks and FaceBook. This, of course, presents the Hadoop vendors with a dilemma. Historically, “Hadoop” referred to the open source version of MapReduce written by Yahoo. However, Impala has thrown this layer out of the stack. How can one be a Hadoop vendor, when Hadoop is no longer in the mainstream stack? The answer is simple: redefine “Hadoop”, and that is exactly what the Hadoop vendors have done. The word “Hadoop” is now used to mean the entire stack.
In my post, I suggested “a few things to ponder when considering the potential success of the current leading Hadoop vendors and whether Hadoop in general is in the first stage of a rapid market expansion or the last stage of a bubble inflating.” One of them was the incorporation of Hadoop and similar tools by established software vendors into their traditional database and information management offerings. Stonebraker is highlighting the opposite, the recasting of Hadoop into what looks like a traditional database technology. He says: “Meanwhile most of the data warehouse vendors support HDFS, and many offer features to support semi-structured data. Hence, the data warehouse market and the Hadoop market will quickly converge.”
Another argument I made was “Hadoop is so 2004 (at least at Google).” Here’s Stonebraker on the subject:
Google must be “laughing in their beer” about now. They invented MapReduce to support the web crawl for their search engine in 2004. A few years ago they replaced MapReduce in this application with BigTable, because they wanted an interactive storage system and MapReduce was batch-only. Hence, the driving application behind MapReduce moved to a better platform a while ago. Now Google is reporting that they see little-to-no future need for MapReduce. It is indeed ironic that Hadoop is picking up support in the general community about five years after Google moved on to better things. Hence, the rest of the world followed Google into Hadoop with a delay of most of a decade. Google has long since abandoned it. I wonder how long it will take the rest of the world to follow Google’s direction and do likewise…
No matter. Here’s what Matthew Hedberg, an analyst for RBC Capital Markets, wrote (according to Investor’s Business Daily) just before the Hortonworks quarterly earnings announcement: “We remain bullish on Hortonworks’ opportunity as a pure play on Hadoop and believe it to be one of the better-positioned disruptive vendors in what could be a once-in-a-decade data replatforming opportunity.” An analyst with Cowen and Co, Jesse Hulsing, expressed a similar bullish sentiment: “The Hadoop market is in early stages of adoption. Our view is that most large enterprises (5,000-plus employees) will have adopted or piloted the technology by fiscal year 2020. The underlying driver of this adoption is the growth in analytic applications, which is driven by rapid growth in new data types and new user types. Hortonworks should benefit from this.”
Maybe the market is indeed going gangbusters and Hortonworks is simply losing to better-equipped competitors, primarily Cloudera?
Apparently anticipating this question, Cloudera last week issued a “momentum press release,” announcing that its 2014 revenues “surpassed $100 million” and calling the results “an indicator of Hadoop’s strong momentum.” Derrick Harris at GigaOm had this to say about the news: “That the company, which is still privately held, would choose to disclose even that much information about its finances speaks to the fast maturation, growing competition and big egos in the Hadoop space.” Similarly, Arik Hesseldahl at Re/Code noted that a “likely motivation for the press release is a battle of optics between Cloudera and its primary rival, Hortonworks… Cloudera may simply be seeking to remind the marketplace which Hadoop company is bigger.”
It is also reminding the marketplace that it’s not going to be subject to the scrutiny accorded public companies anytime soon. Cloudera co-founder and chief strategy officer Mike Olson told Re/Code: “We have no timeline for an IPO, period.” CEO Tom Reilly told Fortune: “We’re of the size and scale that we could be a successful public company right now. But we’re so well backed that we don’t need to go public to have access to financing.” Indeed, after riding a bubble and raising a cool $1 billion, who needs Wall Street?
But they need Main Street. Regardless of the close to $1.5 billion in venture capital the key Hadoop competitors left standing—Cloudera, Hortonworks and MapR—have raised, to survive and succeed they need enterprise customers to buy the current (and future) Hadoop incarnations they offer.
In my previous post on the Hadoop Bubble, I quoted a 2014 survey conducted by Wikibon which found that only 36% of the respondents were using Hadoop and the majority of those (64%) were using it in proof-of-concept environments. Even more important to the financial future of Hadoop vendors, Wikibon found that “only 25% of Hadoop practitioners are paying customers of one or another Hadoop vendor. 24% use a free distribution provided by a vendor, but the majority, 51%, roll their own Hadoop downloaded from the Apache Software Foundation.” Don’t you think this has something to do with Hortonworks’ quarterly results?
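A rough sketch of what these survey percentages imply when chained together follows. It rests on one loud assumption of mine: that Wikibon’s “Hadoop practitioners” are the same group as the 36% of respondents using Hadoop. If the two groups were defined differently, the combined figures would not hold. The variable names are illustrative only.

```python
# Percentages quoted from the Wikibon survey in the paragraph above.
using_hadoop = 0.36          # share of all respondents using Hadoop
proof_of_concept = 0.64      # share of Hadoop users still in proof-of-concept
paying_customers = 0.25      # share of practitioners paying a Hadoop vendor
free_vendor_distro = 0.24    # share using a vendor's free distribution
roll_their_own = 0.51        # share downloading Hadoop from Apache

# ASSUMPTION (mine, not Wikibon's): "practitioners" == the 36% using Hadoop.
paying_share_of_all = using_hadoop * paying_customers
beyond_poc_share_of_all = using_hadoop * (1 - proof_of_concept)

# Sanity check: the three practitioner categories should cover everyone.
practitioner_total = paying_customers + free_vendor_distro + roll_their_own

print(f"Paying a vendor, as a share of all respondents: {paying_share_of_all:.0%}")
print(f"Past proof-of-concept, as a share of all respondents: {beyond_poc_share_of_all:.0%}")
print(f"Practitioner categories sum to: {practitioner_total:.0%}")
```

Under that assumption, fewer than one in ten of the surveyed organizations were sending money to any Hadoop vendor.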
The author of the excellent Wikibon report, Jeff Kelly, gave a presentation last week titled “The Big Data Money Trail.” About 23 minutes into the presentation, Kelly gets to a slide titled (surprise!) “Is this the beginning of the end of the bubble, or is there something next that matters?”
Kelly definitely thinks (or at least thought last week) that Hadoop still matters. He thinks Cloudera and Hortonworks will survive and doesn’t back down from his previous estimates of how big the big data market will get. He predicts three future developments, all helping accelerate big data adoption, but not necessarily (in my opinion) promising for Hadoop vendors: enterprises will overcome their process and culture obstacles for adopting big data technologies; innovation will continue to drive the market because it is based on open source software; and while Hadoop was the “low-hanging fruit,” offering cost saving opportunities, now enterprises will start building “data-driven applications.”
To illustrate the last point, specifically the value that can be created by all these new applications of big data, Kelly reproduced on the slide the results of previous work done by Wikibon, which estimated the “spend and value delivered by industrial internet” to reach $1.2 trillion in 2020. Bert Latamore, his colleague at SiliconANGLE, wrote in his summary of Kelly’s talk that “Vendors will do well in the Big Data market over the next decade, Kelly predicts, but the real winners will be the companies that harness the technology creatively. He estimates that practitioners will create $1.2 trillion in new value from Big Data over the coming decade.” (Italics mine)
So a 2013 report on the Industrial Internet has metamorphosed into current (and misattributed) estimates of how many dollars are swimming in the big data lake. This is how bubbles rise and, eventually, burst.
Or maybe not. Maybe I’m wrong and what we have is the beginning of a solid market for products from disruptive vendors going after a “once-in-a-decade data replatforming opportunity.” After all, Hortonworks provided above-consensus guidance for the current quarter, and they are much closer than I to what is really happening in the marketplace.
In an interview with Derrick Harris conducted last week, Hortonworks CEO Rob Bearden said that he is not backing off his 2014 prediction that Hadoop will soon become a multi-billion-dollar market and Hortonworks will be a billion-dollar company in terms of revenue. Hadoop is actually just a part — albeit a big one — of a major evolution in the data-infrastructure space, he explained to Harris. As companies start replacing the pieces of their data environments, they’ll do so with the open source options that now dominate new technologies. These include Hadoop, NoSQL databases, Storm, Kafka, Spark and the like. “Open source companies can be very successful in terms of revenue growth and in terms of profitability faster than the old proprietary platforms got there,” Bearden said.
Originally posted on Forbes.com