Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Lernen Building Polynomial Regression | Section
Regression with Python

bookBuilding Polynomial Regression

Loading File

We load poly.csv and inspect it:

1234
import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
copy

Then visualize the relation:

12345
import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
copy

A straight line fits poorly, so Polynomial Regression is more suitable.

Building X̃ Matrix

To create , we could add squared features manually:

df['Feature_squared'] = df['Feature'] ** 2

But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:

from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)

It also adds the constant column, so no sm.add_constant() needed.

If X is 1-D, convert it:

X = X.reshape(-1, 1)

Building the Polynomial Regression

import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()

Predicting requires transforming new data the same way:

X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)

Full Example

123456789101112131415161718
import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
copy

Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.

question mark

Consider the following code. In which case will the code run without errors?

Select the correct answer

War alles klar?

Wie können wir es verbessern?

Danke für Ihr Feedback!

Abschnitt 1. Kapitel 13

Fragen Sie AI

expand

Fragen Sie AI

ChatGPT

Fragen Sie alles oder probieren Sie eine der vorgeschlagenen Fragen, um unser Gespräch zu beginnen

bookBuilding Polynomial Regression

Swipe um das Menü anzuzeigen

Loading File

We load poly.csv and inspect it:

1234
import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
copy

Then visualize the relation:

12345
import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
copy

A straight line fits poorly, so Polynomial Regression is more suitable.

Building X̃ Matrix

To create , we could add squared features manually:

df['Feature_squared'] = df['Feature'] ** 2

But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:

from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)

It also adds the constant column, so no sm.add_constant() needed.

If X is 1-D, convert it:

X = X.reshape(-1, 1)

Building the Polynomial Regression

import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()

Predicting requires transforming new data the same way:

X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)

Full Example

123456789101112131415161718
import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
copy

Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.

question mark

Consider the following code. In which case will the code run without errors?

Select the correct answer

War alles klar?

Wie können wir es verbessern?

Danke für Ihr Feedback!

Abschnitt 1. Kapitel 13
some-alt