The Reality of Big Data: Findings from Recent Surveys

Big data tools and technologies emerged first from the companies the Web gave birth to: Google, Facebook, Yahoo, and Amazon. No wonder that the term has become associated primarily with the ability to process and analyze large sets of unstructured, web-generated data for consumer- and market-related activities such as targeted advertising or improving customer loyalty.

A number of recent surveys, however, paint a somewhat different picture. Early enterprise adopters of big data appear to be using these new technologies to improve internal operations and to extend the value of the relational data they have worked with for many years. The surveys, one published by Informatica and two others discussed in a recent IDC webinar, also provide interesting insights into early experiences with big data tools.

The Informatica worldwide survey of close to 600 (mostly) IT professionals found that improving operational efficiency is the top business driver for big data projects (74% of respondents). Attracting and retaining customers ranked fourth (49%), after two other business process and efficiency-related drivers: increasing business agility (51%) and introducing new products/services (50%). Similarly, in the “IDC Vertical IT and Communications Survey,” 34.2% of the 2,700 respondents selected “analysis of operations related data” as their organization’s top driver for using big data technologies and approaches, ahead of “analysis of online customer related behavior data” (28.8%) and “analysis of transactional data from sales systems” (23.8%).

The use of big data technologies to support IT’s traditional role of helping improve operational efficiencies dovetails with the continuing importance of traditional relational data. In answer to the question “What data types are you interested in addressing in your big data projects?” 70% of respondents in the Informatica survey cited “relational data from transaction systems.” Still, there was strong interest in unstructured data (45%), flat files (43%), log data from application, network, and web systems (39%), social media data (37%), and semi-structured industry data (34%).

The need to handle all these disparate types of data and data sources underlies the top big data-related IT challenges identified by respondents to the IDC/Computerworld BI and Analytics survey. The more than 100 respondents (presumably mostly BI and analytics professionals) cited data integration (51.4%) and data quality (47.7%) as their top two challenges. In the Informatica survey, the top challenges were lack of maturity of big data tools (52%), lack of support for real-time data (39%), and poor data quality, security, and privacy (38%).

In terms of business challenges, “defining business requirements” topped the list in the IDC BI and Analytics survey. In the other IDC survey, the most frequently cited (generic) challenge, chosen by 27% of respondents, was “deciding what data was relevant,” while the least cited (8%) was “deciding which technology to use.” This is more support, in my opinion, for the notion that big data is a business imperative, not just an IT issue, and that just as with business analytics and business intelligence, defining the business requirements (which in turn define what data to use) is the first step on the journey.

But the right people to help embark on this journey are hard to find. “Lack of sufficient staff with analytics skills” was high (46.8%) on the list of business challenges in the IDC BI/Analytics survey. Similarly, in answer to a question in the same survey about specific skills sought over the next 12 months, most respondents declared predictive analytics skills (maybe seeing them as a bridge to big data) as their top priority and the most difficult talent to find.

Finally, the fact that almost 18% of respondents in the IDC Vertical IT and Communications Survey answered “None” to the question about their biggest big data challenge prompted the IDC analysts to speculate that “a lot of people deploying big data don’t know what to use it for” and that “big data is a challenge for the business professional, not the IT professional.” They also inferred from it that big data may force business process and organizational changes that many people in the organization may be reluctant to embrace.

Big data is both a big opportunity and a big threat. But to do nothing is not an option. To quote IDC’s Ben Woo, “If you’re not going to do big data, your competitors are going to.”