9 Categories of Data Scientists

DataScience_categories.PNG

DataViz:

  • Those strong in statistics: they sometimes develop new statistical theories for big data, that even traditional statisticians are not aware of. They are expert in statistical modeling, experimental design, sampling, clustering, data reduction, confidence intervals, testing, modeling, predictive modeling and other related techniques.
  • Those strong in mathematics: NSA (national security agency) or defense/military people working on big data, astronomers, and operations research people doing analytic business optimization (inventory management and forecasting, pricing optimization, supply chain, quality control, yield optimization) as they collect, analyse and extract value out of data.
  • Those strong in data engineering, Hadoop, database/memory/file systems optimization and architecture, API’s, Analytics as a Service, optimization of data flows, data plumbing.
  • Those strong in machine learning / computer science (algorithms, computational complexity)
  • Those strong in business, ROI optimization, decision sciences, involved in some of the tasks traditionally performed by business analysts in bigger companies (dashboards design, metric mix selection and metric definitions, ROI optimization, high-level database design)
  • Those strong in production code development, software engineering (they know a few programming languages)
  • Those strong in visualization
  • Those strong in GIS, spatial data, data modeled by graphs, graph databases
  • Those strong in a few of the above. After 20 years of experience across many industries, big and small companies (and lots of training), I’m strong both in stats, machine learning, business, mathematics and more than just familiar with visualization and data engineering. This could happen to you as well over time, as you build experience. I mention this because so many people still think that it is not possible to develop a strong knowledge base across multiple domains that are traditionally perceived as separated (the silo mentality). Indeed, that’s the very reason why data science was created.