Translating Between Computer Science and Statistics

Terence Parr: “I am a computer scientist retooling as a machine learning droid and have found the nomenclature used by statisticians to be peculiar to say the least, so I thought I’d put this document together. It’s meant as good-natured teasing of my friends who are statisticians, but it might actually be useful to other computer scientists. I look forward to a corresponding document written by the statisticians about computer science terms!” (Statisticians say the darndest things)

I know of at least one corresponding document, published in 1994 during the rise of neural networks, or what I have called Statistics on Steroids (SOS). Neural networks are responsible, to a large extent, for the success of today’s “AI,” or deep learning, an advanced version of machine learning.

In Neural Networks and Statistical Models (1994), Warren Sarle explained to his worried and confused fellow statisticians that the ominous-sounding artificial neural networks

are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software… like many statistical methods, [artificial neural networks] are capable of processing vast amounts of data and making predictions that are sometimes surprisingly accurate; this does not make them “intelligent” in the usual sense of the word. Artificial neural networks “learn” in much the same way that many statistical algorithms do estimation, but usually much more slowly than statistical algorithms. If artificial neural networks are intelligent, then many statistical methods must also be considered intelligent.

Sarle provided his colleagues with a handy dictionary translating the terms used by “neural engineers” to the language of statisticians (e.g., “features” are “variables”). In anticipation of today’s “data science” and predictions of algorithms replacing statisticians (and even scientists), Sarle reassured them that no “black box” can substitute for human intelligence:

Neural engineers want their networks to be black boxes requiring no human intervention—data in, predictions out. The marketing hype claims that neural networks can be used with no experience and automatically learn whatever is required; this, of course, is nonsense. Doing a simple linear regression requires a nontrivial amount of statistical expertise.

See here for a discussion of the larger historical context, and see also A Very Short History of Data Science.


About GilPress

I launched the Big Data conversation; writing, research, marketing services; https://whatsthebigdata.com/ & http://infostory.com/
