10 Most Successful Big Data Technologies

Forrester graphic

As the big data analytics market rapidly expands to include mainstream customers, which technologies are most in demand and promise the most growth potential? The answers can be found in TechRadar: Big Data, Q1 2016, a new Forrester Research report evaluating the maturity and trajectory of 22 technologies across the entire data life cycle. The winners all contribute to real-time, predictive, and integrated insights, what big data customers want now.

Here is my take on the 10 hottest big data technologies based on Forrester’s analysis:

  1. Predictive analytics: software and/or hardware solutions that allow firms to discover, evaluate, optimize, and deploy predictive models by analyzing big data sources to improve business performance or mitigate risk.
  2. NoSQL databases: key-value, document, and graph databases.
  3. Search and knowledge discovery: tools and technologies to support self-service extraction of information and new insights from large repositories of unstructured and structured data that resides in multiple sources such as file systems, databases, streams, APIs, and other platforms and applications.
  4. Stream analytics: software that can filter, aggregate, enrich, and analyze a high throughput of data from multiple disparate live data sources and in any data format.
  5. In-memory data fabric: provides low-latency access and processing of large quantities of data by distributing data across the dynamic random access memory (DRAM), Flash, or SSD of a distributed computer system.
  6. Distributed file stores: a computer network where data is stored on more than one node, often in a replicated fashion, for redundancy and performance.
  7. Data virtualization: a technology that delivers information from various data sources, including big data sources such as Hadoop and distributed data stores, in real time and near-real time.
  8. Data integration: tools for data orchestration across solutions such as Amazon Elastic MapReduce (EMR), Apache Hive, Apache Pig, Apache Spark, MapReduce, Couchbase, Hadoop, and MongoDB.
  9. Data preparation: software that eases the burden of sourcing, shaping, cleansing, and sharing diverse and messy data sets to accelerate data’s usefulness for analytics.
  10. Data quality: products that conduct data cleansing and enrichment on large, high-velocity data sets, using parallel operations on distributed data stores and databases.
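
To make one of these categories concrete, here is a minimal sketch of the filter, enrich, and aggregate steps named in the stream analytics entry (#4). It is my own illustration, not anything from the Forrester report: the event fields, the `SENSOR_REGION` lookup table, and the function names are all hypothetical, and a real stream analytics product would apply these steps to unbounded live sources rather than a short in-memory list.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator

# Hypothetical reference data used to enrich events (an assumption for this sketch).
SENSOR_REGION: Dict[str, str] = {"s1": "east", "s2": "west"}


def filter_valid(events: Iterable[dict]) -> Iterator[dict]:
    """Filter: drop events with a missing or negative reading."""
    for event in events:
        value = event.get("value")
        if value is not None and value >= 0:
            yield event


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Enrich: attach a region to each event from the lookup table."""
    for event in events:
        region = SENSOR_REGION.get(event["sensor"], "unknown")
        yield {**event, "region": region}


def aggregate_by_region(events: Iterable[dict]) -> Dict[str, float]:
    """Aggregate: compute the average reading per region."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for event in events:
        sums[event["region"]] += event["value"]
        counts[event["region"]] += 1
    return {region: sums[region] / counts[region] for region in sums}


def run_pipeline(events: Iterable[dict]) -> Dict[str, float]:
    """Chain the three steps; generators process one event at a time."""
    return aggregate_by_region(enrich(filter_valid(events)))


# Made-up sample events standing in for a live feed.
sample_events = [
    {"sensor": "s1", "value": 10.0},
    {"sensor": "s2", "value": 4.0},
    {"sensor": "s1", "value": 20.0},
    {"sensor": "s3", "value": -1.0},  # dropped by the filter (negative)
    {"sensor": "s2", "value": None},  # dropped by the filter (missing)
]

averages = run_pipeline(sample_events)
print(averages)
```

Because each step is a generator, events flow through the pipeline one at a time instead of being collected first, which is the basic idea behind processing data as it arrives.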

Forrester’s TechRadar methodology evaluates the potential success of each technology and all 10 above are projected to have “significant success.” In addition, each technology is placed in a specific maturity phase—from creation to decline—based on the level of development of its technology ecosystem. The first 8 technologies above are considered to be in the Growth stage and the last 2 in the Survival stage.

Forrester also estimates the time it will take each technology to reach the next stage. Predictive analytics is the only one with a “>10 years” designation, expected to “deliver high business value in late Growth through Equilibrium phase for a long time.” Technologies #2 to #8 above are all expected to reach the next phase in 3 to 5 years, and the last 2 technologies are expected to move from the Survival to the Growth phase in 1 to 3 years.

Finally, Forrester provides for each technology an assessment of its business value-add, adjusted for uncertainty. This is based not only on potential impact but also on feedback and evidence from implementations and market reputation. Says Forrester: “If the technology and its ecosystem are at an early stage of development, we have to assume that its potential for damage and disruption is higher than that of a better-known technology.” The first 2 technologies in the list above are rated as “high” business value-add, the next 2 as “medium,” and all the rest “low,” no doubt because of their emerging status and lack of maturity.

Why did I add to the list of hottest technologies two that are still in the Survival phase—data preparation and data quality? In the same report, Forrester also provides the following data from its Q4 2015 survey of 63 big data vendors:

What is the level of customer interest in each of the following capabilities? (% answering “very high”)

Data preparation and discovery    52%
Data integration                  48%
Advanced analytics                46%
Customer analytics                46%
Data security                     38%
In-memory computing               37%

While Forrester predicts that a few standalone vendors of data preparation will survive, it believes this is “an essential capability for achieving democratization of data,” or rather, its analysis, letting data scientists spend more time on modeling and discovering insights and allowing more business users to have fun with data mining. Data quality includes data security from the table above, in addition to other features ensuring that decisions are based on reliable and accurate data. Forrester “expects that data quality will have significant success in the coming years as firms formalize a data certification process. Data certification efforts seek to guarantee that data meets expected standards for quality; security; and regulatory compliance supporting business decision-making, business performance, and business processes.”

“Big Data” as a topic of conversation has reached mainstream audiences probably far more than any other technology buzzword before it. That did not help the discussion of this amorphous term, defined for the masses as “the planet’s nervous system” (see my rant here) or as “Hadoop” for technical audiences.  Forrester’s report helps clarify the term, defining big data as the ecosystem of 22 technologies, each with its specific benefits for enterprises and, through them, consumers.

Big data, specifically one of its attributes, big volume, has recently given rise to a new general topic of discussion, Artificial Intelligence. The availability of very large data sets is one of the reasons Deep Learning, a subset of AI, has been in the limelight, from identifying Internet cats to beating a Go champion. In its turn, AI may lead to the emergence of new tools for collecting and analyzing data.

Says Forrester: “In addition to more data and more computing power, we now have expanded analytic techniques like deep learning and semantic services for context that make artificial intelligence an ideal tool to solve a wider array of business problems. As a result, Forrester is seeing a number of new companies offering tools and services that attempt to support applications and processes with machines that mimic some aspects of human intelligence.”

Prediction is difficult, especially about the future, but it’s a (relatively) safe bet that the race to mimic elements of human intelligence, led by Google, Facebook, Baidu, Amazon, IBM, and Microsoft, all with very deep pockets, will change what we mean by “big data” in the very near future.

Originally posted on Forbes.com