Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Leer Building Polynomial Regression | Section
Practice
Projects
Quizzes & Challenges
Quizzen
Challenges
/
Regression with Python

bookBuilding Polynomial Regression

Loading File

We load poly.csv and inspect it:

1234
import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
copy

Then visualize the relation:

12345
import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
copy

A straight line fits poorly, so Polynomial Regression is more suitable.

Building X̃ Matrix

To create , we could add squared features manually:

df['Feature_squared'] = df['Feature'] ** 2

But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:

from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)

It also adds the constant column, so no sm.add_constant() needed.

If X is 1-D, convert it:

X = X.reshape(-1, 1)

Building the Polynomial Regression

import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()

Predicting requires transforming new data the same way:

X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)

Full Example

123456789101112131415161718
import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
copy

Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.

question mark

Consider the following code. In which case will the code run without errors?

Select the correct answer

Was alles duidelijk?

Hoe kunnen we het verbeteren?

Bedankt voor je feedback!

Sectie 1. Hoofdstuk 13

Vraag AI

expand

Vraag AI

ChatGPT

Vraag wat u wilt of probeer een van de voorgestelde vragen om onze chat te starten.

bookBuilding Polynomial Regression

Veeg om het menu te tonen

Loading File

We load poly.csv and inspect it:

1234
import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
copy

Then visualize the relation:

12345
import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
copy

A straight line fits poorly, so Polynomial Regression is more suitable.

Building X̃ Matrix

To create , we could add squared features manually:

df['Feature_squared'] = df['Feature'] ** 2

But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:

from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)

It also adds the constant column, so no sm.add_constant() needed.

If X is 1-D, convert it:

X = X.reshape(-1, 1)

Building the Polynomial Regression

import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()

Predicting requires transforming new data the same way:

X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)

Full Example

123456789101112131415161718
import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
copy

Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.

question mark

Consider the following code. In which case will the code run without errors?

Select the correct answer

Was alles duidelijk?

Hoe kunnen we het verbeteren?

Bedankt voor je feedback!

Sectie 1. Hoofdstuk 13
some-alt