Big Data Observations: Definitions, Definitions

“…there’s lots of focus on the ‘big’ aspect of data. It sometimes gives us the image of truckloads of data being heaped upon existing truckloads somewhere up in the cloud, creating a virtual mountain so immense it makes Everest look like a molehill. In our opinion, the focus is on the wrong element: It’s not the big data mountain that matters so much to people, it’s those tiny little spoonfuls we extract whenever we search, chat, view, listen, buy–or do anything else online. The hugeness can intimidate, but the little pieces make us smarter and enable us to keep up with, and make sense of, an accelerating world. We call this the miracle of little data”–Robert Scoble and Shel Israel, The Age of Context

“The size of the data set is about the least interesting aspect of these platforms. It’s time to stop thinking about big data as big data and start looking at these platforms as the next logical step in data management. What we call ‘big data’ is really a building-block approach to databases. Rather than the prepackaged relational systems we have grown accustomed to during the past two decades, we now assemble different pieces (data management, data storage, orchestration, etc.) together in order to fit specific requirements. These platforms, in dozens of different flavors, have more than proved their worth and no longer need to escape the shadow of relational platforms. It’s time to simply think of big data as modular databases”–Adrian Lane, Securosis

“Despite the range and differences existing within each of the aforementioned definitions there are some points of similarity. Notably all definitions make at least one of the following assertions:

Size: the volume of the datasets is a critical factor.

Complexity: the structure, behaviour and permutations of the datasets is a critical factor.

Technologies: the tools and techniques which are used to process a sizable or complex dataset is a critical factor.

The definitions surveyed here all encompass at least one of these factors, most encompass two. An extrapolation of these factors would therefore postulate the following: Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning”–Jonathan Stuart Ward and Adam Barker, “Undefined by Data: A Survey of Big Data Definitions”