Translating Between Computer Science and Statistics

Terence Parr: “I am a computer scientist retooling as a machine learning droid and have found the nomenclature used by statisticians to be peculiar to say the least, so I thought I’d put this document together. It’s meant as good-natured teasing of my friends who are statisticians, but it might actually be useful to other computer scientists. I look forward to a corresponding document written by the statisticians about computer science terms!” (Statisticians say the darndest things)

I know of at least one corresponding document, published in 1994 during the rise of neural networks, or what I have called Statistics on Steroids (SOS). Neural networks are responsible, to a large extent, for the success of today’s “AI,” or deep learning, an advanced version of machine learning.

In Neural Networks and Statistical Models (1994), Warren Sarle explained to his worried and confused fellow statisticians that the ominous-sounding artificial neural networks

are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software… like many statistical methods, [artificial neural networks] are capable of processing vast amounts of data and making predictions that are sometimes surprisingly accurate; this does not make them “intelligent” in the usual sense of the word. Artificial neural networks “learn” in much the same way that many statistical algorithms do estimation, but usually much more slowly than statistical algorithms. If artificial neural networks are intelligent, then many statistical methods must also be considered intelligent.
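To make Sarle's point concrete, here is a minimal sketch of the simplest possible case: a "network" with one input, no hidden layer, and an identity activation, trained by gradient descent on squared error. Under those assumptions it should arrive at the same slope and intercept as ordinary least squares. The data, function names, learning rate, and epoch count are all made up for illustration; none of this comes from Sarle's paper.

```python
# Toy data, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def fit_ols(xs, ys):
    """Closed-form simple linear regression (what a statistician would run)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=5000):
    """One linear 'neuron' (weight w, bias b) 'learning' by gradient descent
    on mean squared error. The hyperparameters are assumptions chosen so that
    this toy problem converges."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


slope, intercept = fit_ols(xs, ys)
w, b = train_linear_neuron(xs, ys)

print("OLS estimate:        slope=%.4f intercept=%.4f" % (slope, intercept))
print("Trained neuron:      weight=%.4f bias=%.4f" % (w, b))
```

The "weight" and "bias" are the statistician's slope and intercept under other names, and the neuron needs thousands of passes over the data to reach what the closed-form estimate gives in one step. This covers only the linear case; adding hidden layers and nonlinear activations moves into the nonlinear regression territory Sarle describes.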

Sarle provided his colleagues with a handy dictionary translating the terms used by “neural engineers” into the language of statisticians (e.g., “features” are “variables”). In anticipation of today’s “data science” and predictions of algorithms replacing statisticians (and even scientists), Sarle reassured them that no “black box” can substitute for human intelligence:

Neural engineers want their networks to be black boxes requiring no human intervention—data in, predictions out. The marketing hype claims that neural networks can be used with no experience and automatically learn whatever is required; this, of course, is nonsense. Doing a simple linear regression requires a nontrivial amount of statistical expertise.

See here for a discussion of the larger historical context and A Very Short History of Data Science.