The New Data Scientist Venn Diagram


Stephan Kolassa on StackExchange:

I still think that Hacking Skills, Math & Statistics Knowledge and Substantive Expertise (shortened to “Programming”, “Statistics” and “Business” for legibility) are important… but I think that the role of Communication is important, too. All the insights you derive by leveraging your hacking, stats and business expertise won’t make a bit of a difference unless you can communicate them to people who may not have that unique blend of knowledge. You may need to explain your statistical insights to a business manager who needs to be convinced to spend money or change processes. Or to a programmer who doesn’t think statistically.

So here is the new data science Venn diagram, which also includes communication as one indispensable ingredient.


Davenport and Patil describe data scientists as curious, self-directed and innovative: they are not limited by the tools available and, when needed, fashion their own tools and even conduct academic-style research. Not surprisingly, people with this combination of skills and characteristics are rare, as rare and as much in demand as computer programmers were in the 1990s.

This rarity and the high demand for data science skills have meant that statisticians, machine learners, data miners, data analysts, DBAs and quantitative analysts, i.e., people with any data or analytics skills, have re-badged themselves as data scientists so that they are more marketable. This is not unlike the pre-Y2K hype, when computer operators and users of PCs re-badged themselves as computer programmers.

The term “data scientist” has become so diffuse that it covers anybody from database administrators, to analysts producing simplistic summaries in Excel spreadsheets, to data engineers setting up Hadoop infrastructure, to advanced analytics practitioners who discover valuable insights from data using existing tools, to those, like the data scientists at Google and Facebook, who derive insights from data using their own enhanced toolkits.

So, is the name really relevant? Apparently not, since Google’s career pages advertise for Decision Support Analysts, Statisticians, Quantitative Analysts, and Data Scientists and they all mean the same thing. Over the last 50 years, many people have been working as the data scientists described by Davenport and Patil, discovering insights from large volumes of diverse data using existing tools as well as new tools that they fashioned. They have been labelled statisticians, artificial intelligence researchers, data miners, machine learners, advanced analytics experts and the list goes on.

What is relevant is to understand where an individual’s interest lies within the broad data science church and what the needs of the organisation are. The individual’s interest may be in developing innovative algorithms to solve a new problem (the high-end data scientist described by Davenport and Patil), in identifying new business problems that can be solved with existing tools, or in distributed programming for Hadoop. The key is to match the organisation’s needs with an individual’s interest and not be bothered with the position title or the candidate’s label.

Finally, as for finding this rare species, let me point out that the characteristics of curiosity, self-direction and innovation are required in all scientific research. Fashioning tools to overcome a challenge has always been the hallmark of a research scientist. Didn’t Newton invent infinitesimal calculus when the mathematical tools at his disposal were insufficient to calculate instantaneous speed? Furthermore, the experience of doing scientific research for a PhD ensures that such people are able to teach themselves new skills.

So, instead of looking to graduates from the newly designed data science majors, develop your own data scientists by first finding someone with a PhD or Master’s in a quantitative science such as physics, mathematics, statistics or computer science and then providing them with data, time and autonomy. It worked for LinkedIn with Jonathan Goldman and for many other data-driven companies, and it can work for you too!


About GilPress

I launched the Big Data conversation; writing, research, marketing services; &
This entry was posted in Data Science Careers, Data Scientists.

6 Responses to The New Data Scientist Venn Diagram

  1. Pingback: The New Data Scientist Venn Diagram | A bunch of data

  2. Pingback: Web Picks (week of 1 August 2016) | DataMiningApps

  3. Pingback: What is Data Science? – The Data Science Project

  4. Pingback: 爱发现 - 发现不一样的世界

  5. Pingback: The Analytics Tool You Don't Learn At School: Listening - The Money Street

  6. Edith Ohri says:

    If I may add a different view on Data Science: in my eyes it is more than a collection of fields of expertise, it redefines science! The need for a new definition reflects a missing part in the current science model, namely the guiding rules for creating hypotheses and testing them logically before they are submitted to any statistical testing.
    See my take on the principles and rules that stand behind it, at:

