Logistic Regression in R: A Classification Technique to Predict Credit Card Default

Logistic regression is one of the statistical techniques in machine learning used to form prediction models. It is one of the most popular classification algorithms mostly used for binary classification problems (problems with two class values; however, some variants may deal with multiple classes as well). It’s used for various research and industrial problems. Therefore, it is essential to have a good grasp on the logistic regression algorithm. This tutorial is a sneak peek from many of Data Science Dojo’s hands-on exercises from their 5-day data science bootcamp, you will learn how logistic regression fits a dataset to make predictions, as well as when and why to use it.

In short, Logistic Regression is used when the dependent variable (target) is categorical. For example:

  • To predict whether an email is spam (1) or not spam (0)
  • Whether the tumor is malignant (1) or not (0)

It is named as ‘Logistic Regression’, because it’s underlying technique is quite the same as Linear Regression. There are structural differences in how linear and logistic regression operate. Therefore, linear regression isn’t suitable to be used for classification problems. This link answers in detail why linear regression isn’t the right approach for classification.

Its name is derived from one of the core functions behind its implementation called the logistic function or the sigmoid function. It’s an S-shaped curve that can take any real-valued number and map it into a value between 0 and 1, but never exactly at those limits.

For complete tutorial click here

Sponsored by Data Science Dojo