Building The Linear Regression Using Statsmodels

In the previous chapter, we used a NumPy function to calculate the parameters.
Now we will represent the linear regression with a class object instead of a function. This approach takes more lines of code to find the parameters, but the object stores a lot of helpful information and makes prediction more straightforward.

Building a Linear Regression model

In statsmodels, the `OLS` class can be used to create a linear regression model.

We first need to initialize an `OLS` class object using `sm.OLS(y, X_tilde)`, then train it by calling its `fit()` method.

The two steps can also be chained into a single expression, `sm.OLS(y, X_tilde).fit()`, which is equivalent.

Note

The constructor of the `OLS` class expects the array `X_tilde` that we saw in the Normal Equation, that is, `X` with an extra column of ones for the intercept term. So you need to convert your `X` array to `X_tilde` first. You can do this with the `sm.add_constant()` function.

Finding parameters

When the model is trained, you can easily access the parameters using the `params` attribute.

Making the predictions

New instances can easily be predicted using the `predict()` method, but the new inputs must be preprocessed in the same way as the training data, that is, converted with `sm.add_constant()`:

Getting the summary

As you probably noticed, using the `OLS` class is not as easy as the `polyfit()` function. But using `OLS` has its benefits. While training, it calculates a lot of statistical information. You can access the information using the `summary()` method.

That's a lot of statistics. We will discuss the table's most important parts in later sections.