Statistics and Machine Learning

data-mining-venn-diagram

The image above is taken from a data mining primer course SAS offered in 1998.

Aatash Shah, CEO of Edvancer Eduventures, in KDnuggets:

Machine learning requires no prior assumptions about the underlying relationships between the variables. You just have to throw in all the data you have, and the algorithm processes the data and discovers patterns, using which you can make predictions on the new data set. Machine learning treats an algorithm like a black box, as long it works. It is generally applied to high dimensional data sets, the more data you have, the more accurate your prediction is.

In contrast, statisticians must understand how the data was collected, statistical properties of the estimator (p-value, unbiased estimators), the underlying distribution of the population they are studying and the kinds of properties you would expect if you did the experiment many times. You need to know precisely what you are doing and come up with parameters that will provide the predictive power. Statistical modeling techniques are usually applied to low dimensional data sets.

 

Advertisements

About GilPress

I launched the Big Data conversation; writing, research, marketing services; https://whatsthebigdata.com/ & http://infostory.com/
This entry was posted in Machine Learning, Statistics and tagged . Bookmark the permalink.

One Response to Statistics and Machine Learning

  1. Pingback: Statistics and Machine Learning | A bunch of data

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s