Building Polynomial Regression
Loading File
We load poly.csv and inspect it:
1234import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
Then visualize the relation:
12345import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
A straight line fits poorly, so Polynomial Regression is more suitable.
Building X̃ Matrix
To create X̃, we could add squared features manually:
df['Feature_squared'] = df['Feature'] ** 2
But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:
from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)
It also adds the constant column, so no sm.add_constant() needed.
If X is 1-D, convert it:
X = X.reshape(-1, 1)
Building the Polynomial Regression
import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()
Predicting requires transforming new data the same way:
X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)
Full Example
123456789101112131415161718import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.
Bedankt voor je feedback!
Vraag AI
Vraag AI
Vraag wat u wilt of probeer een van de voorgestelde vragen om onze chat te starten.
Geweldig!
Completion tarief verbeterd naar 6.67
Building Polynomial Regression
Veeg om het menu te tonen
Loading File
We load poly.csv and inspect it:
1234import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
Then visualize the relation:
12345import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
A straight line fits poorly, so Polynomial Regression is more suitable.
Building X̃ Matrix
To create X̃, we could add squared features manually:
df['Feature_squared'] = df['Feature'] ** 2
But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:
from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)
It also adds the constant column, so no sm.add_constant() needed.
If X is 1-D, convert it:
X = X.reshape(-1, 1)
Building the Polynomial Regression
import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()
Predicting requires transforming new data the same way:
X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)
Full Example
123456789101112131415161718import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.
Bedankt voor je feedback!