
In today's article, we will discuss **Ridge regression**, one of the standard regression models you can use to analyze data in detail. The model is explained with the help of its formula and an example.

Though **linear regression** and **logistic regression** are the most popular members of the **regression** family, a recorded talk at NYC Data Science Academy makes the point that you have to be very special to use regression without regularization.

Ridge regression is one of the most fundamental regularization techniques, yet many people avoid it because the science behind it seems complex. If you have a general understanding of multiple regression, the science behind ridge regression is not difficult to explore. The overall idea of regression stays the same; what regularization changes is how the model coefficients are determined.

Ridge regression is a technique for analyzing multiple regression data that suffer from multicollinearity.

**Multicollinearity**, also called collinearity in statistics, is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy.

Multicollinearity occurs when two or more predictor variables are highly correlated.

For example, a person's height, weight, age, and annual income may be correlated with one another.

Ridge regression is used to create a **parsimonious model** in the following scenarios.

- The number of predictor variables in a given set exceeds the number of observations.
- The dataset has multicollinearity (that is, correlations between predictor variables).
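To make multicollinearity concrete, here is a minimal sketch in plain Python that measures the correlation between pairs of predictors. The `height`, `weight`, and `income` values are made-up toy numbers, not real data. For two standardized predictors with correlation r, the matrix whose inverse ordinary least squares relies on has condition number (1 + |r|) / (1 - |r|), so it blows up as |r| approaches 1, which is why the coefficient estimates become unstable.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def condition_number(r):
    """Condition number of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    return (1 + abs(r)) / (1 - abs(r))

# Hypothetical toy data: weight tracks height closely, income does not.
height = [150.0, 160.0, 170.0, 180.0, 190.0]
weight = [52.0, 59.0, 71.0, 78.0, 91.0]
income = [40.0, 25.0, 60.0, 30.0, 45.0]

r_hw = pearson(height, weight)
r_hi = pearson(height, income)
print(f"height vs weight: r={r_hw:.3f}, condition number={condition_number(r_hw):.1f}")
print(f"height vs income: r={r_hi:.3f}, condition number={condition_number(r_hi):.1f}")
```

A large condition number for the height and weight pair signals that small changes in the data can swing the least squares coefficients widely.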

Regularization techniques work by doing two things at once.

- Penalize the magnitude of the feature coefficients
- Minimize the error between the actual and predicted observations

Although both regularization techniques, **Ridge regression** and **Lasso regression**, are used to create **parsimonious models** when there is a large number of features, their practical uses and inherent properties are quite different.

Ridge regression performs L2 regularization: it adds a penalty proportional to the sum of the squared magnitudes of the coefficients. The minimization objective is as follows.

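Taking a response vector $y \in \mathbb{R}^n$ and a predictor matrix $X \in \mathbb{R}^{n \times p}$, the ridge regression coefficients are defined as

$$\hat{\beta}^{\text{ridge}} = \underset{\beta \in \mathbb{R}^p}{\arg\min} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2$$

Here λ ≥ 0 is the tuning parameter that controls the strength of the penalty term.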

If λ = 0, the objective reduces to that of ordinary (unregularized) linear regression, so we get the same coefficients as linear regression.

If λ = ∞, the coefficients will be zero: with infinite weight on the squared coefficients, anything other than zero makes the objective infinite.

If 0 < λ < ∞, the magnitude of λ decides the weight given to each of the two parts of the objective.
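The three cases can be checked with a small sketch. It assumes the simplest possible setting, a single centered predictor and a centered response with no intercept, where the ridge coefficient has the closed form Σxy / (Σx² + λ). The data values are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Ridge coefficient for one centered predictor and centered response (no intercept)."""
    s_xy = sum(a * b for a, b in zip(x, y))
    s_xx = sum(a * a for a in x)
    return s_xy / (s_xx + lam)

# Hypothetical centered data in which y is roughly 2 * x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.0, 2.1, 3.9]

# lam = 0 reproduces the least squares slope; very large lam drives it toward zero.
for lam in (0.0, 1.0, 10.0, 1e6):
    print(f"lambda={lam:>9}: coefficient={ridge_slope(x, y, lam):.4f}")
```

With λ = 0 the function returns the least squares slope, and as λ grows the denominator grows with it, so the coefficient shrinks smoothly toward zero without ever changing sign.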

In simple terms, the minimization objective = LS Obj + λ (sum of the squares of the coefficients)

where LS Obj is the least squares objective, that is, the linear regression objective without regularization.
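As an illustration, the objective can be written out directly and minimized with the standard closed-form solution, which solves (XᵀX + λI)β = Xᵀy. The sketch below uses plain Python with a small Gauss-Jordan solver so that it stays self-contained; it assumes centered data (so no intercept is fitted), and the function names and data are hypothetical. In practice a numerical library would normally do this work.

```python
def ridge_objective(X, y, beta, lam):
    """Least squares objective plus lam times the sum of squared coefficients."""
    residuals = [yi - sum(b * xij for b, xij in zip(beta, row))
                 for row, yi in zip(X, y)]
    ls_obj = sum(r * r for r in residuals)
    penalty = lam * sum(b * b for b in beta)
    return ls_obj + penalty

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented system [X^T X + lam * I | X^T y].
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical centered data with two highly correlated predictors.
X = [[-2.0, -1.9], [-1.0, -1.2], [0.0, 0.1], [1.0, 0.9], [2.0, 2.1]]
y = [-4.0, -2.1, 0.2, 1.8, 4.1]

beta_ols = ridge_fit(X, y, 0.0)    # lam = 0: plain least squares
beta_ridge = ridge_fit(X, y, 1.0)  # lam = 1: penalized fit
print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
print("Ridge objective at ridge solution:", ridge_objective(X, y, beta_ridge, 1.0))
print("Ridge objective at OLS solution:  ", ridge_objective(X, y, beta_ols, 1.0))
```

The ridge solution gives a lower value of the penalized objective than the least squares solution does, even though its LS Obj part alone is slightly higher.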

Because ridge regression shrinks the coefficients towards zero, it introduces some bias. But it can reduce the variance to a great extent, which can result in a better mean-squared error. The amount of shrinkage is controlled by λ, which multiplies the ridge penalty. Since a larger λ means more shrinkage, we get different coefficient estimates for different values of λ.
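The bias-variance trade-off can be illustrated with a small simulation. This is only a sketch under stated assumptions: two highly correlated, centered predictors held fixed, a hypothetical true coefficient vector `TRUE_BETA`, Gaussian noise with unit standard deviation, and λ = 1 chosen arbitrarily rather than tuned. The same noise draws are reused for both fits so the comparison is like for like.

```python
import random

# Hypothetical fixed design: two centered, highly correlated predictors.
X = [[-2.0, -1.9], [-1.0, -1.2], [0.0, 0.1], [1.0, 0.9], [2.0, 2.1]]
TRUE_BETA = [1.0, 1.0]  # assumed true coefficients for the simulation

def fit_2d(X, y, lam):
    """Ridge fit for exactly two predictors via Cramer's rule (lam=0 gives OLS)."""
    a = sum(r[0] * r[0] for r in X) + lam
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + lam
    u = sum(r[0] * yi for r, yi in zip(X, y))
    v = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    return [(d * u - b * v) / det, (a * v - b * u) / det]

def simulate(lam, n_sims=2000, seed=0):
    """Return (total variance, mean squared error) of the coefficient estimates."""
    rng = random.Random(seed)  # same seed means the same noise for every lam
    estimates = []
    for _ in range(n_sims):
        y = [sum(b * x for b, x in zip(TRUE_BETA, row)) + rng.gauss(0.0, 1.0)
             for row in X]
        estimates.append(fit_2d(X, y, lam))
    means = [sum(e[j] for e in estimates) / n_sims for j in range(2)]
    variance = sum((e[j] - means[j]) ** 2
                   for e in estimates for j in range(2)) / n_sims
    mse = sum((e[j] - TRUE_BETA[j]) ** 2
              for e in estimates for j in range(2)) / n_sims
    return variance, mse

results = {lam: simulate(lam) for lam in (0.0, 1.0)}
for lam, (variance, mse) in results.items():
    print(f"lambda={lam}: total variance={variance:.3f}, MSE={mse:.3f}")
```

In this setup the ridge estimates vary far less from one simulated dataset to the next than the least squares estimates do, and that reduction outweighs the bias introduced by shrinkage. With other designs, true coefficients, or values of λ the balance can differ.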

**Example:**

Ridge regression can be used, for instance, to analyze the relationship between prostate-specific antigen and clinical measures among people who were about to have their prostates removed.

Ridge regression performs well when a subset of the true coefficients is small or even zero. It does not perform as well when all the true coefficients are moderately large. Even then, it can still outperform linear regression over a narrow range of (small) λ values.

So we have covered the **ridge regression** model, the concept of multicollinearity, and how ridge regression is used to analyze data affected by it. If you have any suggestions on this topic, please share them in the comments section below so that others can benefit as well.
