Here’s a simple rule for the second machine age we’re in now: as the amount of data goes up, the importance of human judgment should go down… The practical conclusion is that we should turn many of our decisions, predictions, diagnoses, and judgments—both the trivial and the consequential—over to the algorithms. There’s just no controversy any more about whether doing so will give us better results… I don’t know how quickly it’ll happen, but I’m very confident that data-dominated firms are going to take market share, customers, and profits away from those who are still relying too heavily on their human experts.
–Andrew McAfee, Harvard Business Review Blog
Human judgment is at the center of successful data analysis. This statement might initially seem at odds with the current Big Data frenzy and its focus on data management and machine learning methods. But while these tools provide immense value, it is important to remember that they are just that: tools. A hammer does not a carpenter make — though it certainly helps. Consider the words of John Tukey, possibly the greatest statistician of the last half-century: “Nothing — not the careful logic of mathematics, not statistical models and theories, not the awesome arithmetic power of modern computers — nothing can substitute here for the flexibility of the informed human mind. Accordingly, both approaches and techniques need to be structured so as to facilitate human involvement and intervention.” Tukey goes on to write: “Some implications for effective data analysis are: (1) that it is essential to have convenience of interaction of people and intermediate results and (2) that at all stages of data analysis the nature and detail of output need to be matched to the capabilities of the people who use it and want it.” Though Tukey and colleagues voiced these sentiments nearly 50 years ago, they ring even more true today. The interested analyst is at the heart of the Big Data question: how well do our tools help users ask better questions, formulate hypotheses, spot anomalies, correct errors and create improved models and visualizations? To “facilitate human involvement” across “all stages of data analysis” is a grand challenge for our age.
–Jeffrey Heer, O’Reilly Data