Building Polynomial Regression
Loading File
We load poly.csv and inspect it:
1234import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
Then visualize the relation:
12345import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
A straight line fits poorly, so Polynomial Regression is more suitable.
Building X̃ Matrix
To create X̃, we could add squared features manually:
df['Feature_squared'] = df['Feature'] ** 2
But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:
from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)
It also adds the constant column, so no sm.add_constant() needed.
If X is 1-D, convert it:
X = X.reshape(-1, 1)
Building the Polynomial Regression
import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()
Predicting requires transforming new data the same way:
X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)
Full Example
123456789101112131415161718import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.
Takk for tilbakemeldingene dine!
Spør AI
Spør AI
Spør om hva du vil, eller prøv ett av de foreslåtte spørsmålene for å starte chatten vår
Fantastisk!
Completion rate forbedret til 6.67
Building Polynomial Regression
Sveip for å vise menyen
Loading File
We load poly.csv and inspect it:
1234import pandas as pd file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv' df = pd.read_csv(file_link) print(df.head())
Then visualize the relation:
12345import matplotlib.pyplot as plt X = df['Feature'] y = df['Target'] plt.scatter(X, y) plt.show()
A straight line fits poorly, so Polynomial Regression is more suitable.
Building X̃ Matrix
To create X̃, we could add squared features manually:
df['Feature_squared'] = df['Feature'] ** 2
But for higher degrees, PolynomialFeatures is easier. It requires a 2-D structure:
from sklearn.preprocessing import PolynomialFeatures
X = df[['Feature']]
poly = PolynomialFeatures(n)
X_tilde = poly.fit_transform(X)
It also adds the constant column, so no sm.add_constant() needed.
If X is 1-D, convert it:
X = X.reshape(-1, 1)
Building the Polynomial Regression
import statsmodels.api as sm
y = df['Target']
X = df[['Feature']]
X_tilde = PolynomialFeatures(n).fit_transform(X)
model = sm.OLS(y, X_tilde).fit()
Predicting requires transforming new data the same way:
X_new_tilde = PolynomialFeatures(n).fit_transform(X_new)
y_pred = model.predict(X_new_tilde)
Full Example
123456789101112131415161718import pandas as pd, numpy as np, matplotlib.pyplot as plt import statsmodels.api as sm from sklearn.preprocessing import PolynomialFeatures df = pd.read_csv(file_link) n = 2 X = df[['Feature']] y = df['Target'] X_tilde = PolynomialFeatures(n).fit_transform(X) model = sm.OLS(y, X_tilde).fit() X_new = np.linspace(-0.1, 1.5, 80).reshape(-1,1) X_new_tilde = PolynomialFeatures(n).fit_transform(X_new) y_pred = model.predict(X_new_tilde) plt.scatter(X, y) plt.plot(X_new, y_pred) plt.show()
Try different n values to see how the curve changes and how predictions behave outside the original feature range—this leads into the next chapter.
Takk for tilbakemeldingene dine!