3.4.8.7. Plot variance and regularization in linear models
import numpy as np
import matplotlib.pyplot as plt

# Smaller figures
plt.rcParams["figure.figsize"] = (3, 2)
We consider the situation where we have only two data points.
Without noise, linear regression fits the data perfectly.
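The noise-free case can be checked numerically with a minimal sketch. The data values below (two points on the line y = x) are assumptions chosen for illustration, not necessarily the values used to produce the figures in this example.

```python
import numpy as np
from sklearn import linear_model

# Assumed illustrative data: two training points lying on the line y = x
X = np.array([[0.5], [1.0]])
y = np.array([0.5, 1.0])

regr = linear_model.LinearRegression()
regr.fit(X, y)

# With only two noise-free points, the fitted line passes through both,
# so the residuals on the training data are zero up to rounding error
residuals = y - regr.predict(X)
print("slope:", regr.coef_[0])
print("intercept:", regr.intercept_)
print("largest absolute residual:", np.abs(residuals).max())
```

A straight line has two free parameters (slope and intercept), so two distinct points determine it exactly. This is why the fit is perfect here, and also why it has no freedom left to average out noise.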
In real-life situations, we have noise (e.g. measurement noise) in our data.
As we can see, our linear model captures and amplifies the noise in the data. It displays a lot of variance.
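One way to quantify this variance is to refit ordinary least squares on several noisy copies of the inputs and look at how much the fitted slope changes. The sketch below assumes the same illustrative two-point data set and a noise scale of 0.1; both are assumptions, as is the use of `np.random.default_rng` for reproducibility.

```python
import numpy as np
from sklearn import linear_model

# Assumed illustrative data: two training points lying on the line y = x
X = np.array([[0.5], [1.0]])
y = np.array([0.5, 1.0])

rng = np.random.default_rng(0)
regr = linear_model.LinearRegression()

ols_slopes = []
for _ in range(6):
    # Perturb the inputs with Gaussian noise and refit from scratch
    noisy_X = X + rng.normal(loc=0, scale=0.1, size=X.shape)
    regr.fit(noisy_X, y)
    ols_slopes.append(regr.coef_[0])

ols_slopes = np.array(ols_slopes)
print("fitted slopes:", np.round(ols_slopes, 2))
print("standard deviation of the slope:", ols_slopes.std())
```

Each fit still passes exactly through its two noisy points, so any perturbation of the inputs is transferred directly into the slope. The spread of the slopes across draws is the variance described above.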
We can use another linear estimator that uses regularization, the Ridge estimator. This estimator regularizes the coefficients by shrinking them toward zero, under the assumption that very high correlations are often spurious. The alpha parameter controls the amount of shrinkage used.
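from sklearn import linear_model

# X, y and X_test are the data arrays defined earlier in the example
regr = linear_model.Ridge(alpha=0.1)
np.random.seed(0)
for _ in range(6):
    # Perturb the inputs with Gaussian noise, then refit and plot
    noisy_X = X + np.random.normal(loc=0, scale=0.1, size=X.shape)
    plt.plot(noisy_X, y, "o")
    regr.fit(noisy_X, y)
    plt.plot(X_test, regr.predict(X_test))
plt.show()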
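To see the effect of regularization in numbers rather than in a plot, the following sketch fits both estimators on the same noisy draws and compares the spread of their slopes, then sweeps alpha on the noise-free data. The data values, the noise scale, and the list of alpha values are assumptions made for illustration.

```python
import numpy as np
from sklearn import linear_model

# Assumed illustrative data: two training points lying on the line y = x
X = np.array([[0.5], [1.0]])
y = np.array([0.5, 1.0])

rng = np.random.default_rng(0)
ols = linear_model.LinearRegression()
ridge = linear_model.Ridge(alpha=0.1)

ols_slopes, ridge_slopes = [], []
for _ in range(6):
    # Both estimators see exactly the same noisy inputs
    noisy_X = X + rng.normal(loc=0, scale=0.1, size=X.shape)
    ols_slopes.append(ols.fit(noisy_X, y).coef_[0])
    ridge_slopes.append(ridge.fit(noisy_X, y).coef_[0])

print("OLS slope std:  ", np.std(ols_slopes))
print("Ridge slope std:", np.std(ridge_slopes))

# Sweep alpha on the noise-free data: larger alpha, smaller slope
alphas = [0.01, 0.1, 1.0, 10.0]
sweep = [linear_model.Ridge(alpha=a).fit(X, y).coef_[0] for a in alphas]
for a, s in zip(alphas, sweep):
    print(f"alpha={a:<5} slope={s:.3f}")
```

The Ridge slopes vary less across noisy draws than the ordinary least squares slopes, at the cost of a bias: even on noise-free data the Ridge slope is pulled below the true value of 1, and more so as alpha grows.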