Supervised Learning Essentials

Building Polynomial Regression


Loading File

We load poly.csv and inspect it:

import pandas as pd

file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv'
df = pd.read_csv(file_link)
print(df.head())

Then we visualize the relationship between the feature and the target:

import matplotlib.pyplot as plt

X = df['Feature']
y = df['Target']
plt.scatter(X, y)
plt.show()

A straight line fits poorly, so Polynomial Regression is more suitable.
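To see why, here is a minimal sketch on synthetic data (an assumption standing in for poly.csv, so the snippet runs on its own): fitting a straight line to a clearly curved relation leaves most of the variance unexplained.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic stand-in for poly.csv (assumption): a clearly parabolic relation
rng = np.random.default_rng(0)
X = rng.uniform(0, 1.4, size=(100, 1))
y = 2 + 4 * (X[:, 0] - 0.7) ** 2 + rng.normal(0, 0.05, size=100)

# A straight line cannot follow the curvature, so its R^2 stays low
linear = LinearRegression().fit(X, y)
print(f"Straight-line R^2: {r2_score(y, linear.predict(X)):.3f}")
```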


Building Transformed Matrix

To create polynomial features, we could add squared features manually:

df['Feature_squared'] = df['Feature'] ** 2

But for higher degrees, the PolynomialFeatures class from sklearn.preprocessing is much easier and more efficient. It requires a 2-D structure (DataFrame or 2-D array):

from sklearn.preprocessing import PolynomialFeatures

X = df[['Feature']]
# Create the transformer
poly = PolynomialFeatures(degree=2, include_bias=False)
# Transform the data
X_poly = poly.fit_transform(X)

Parameters

The PolynomialFeatures class has several important parameters:

  • degree (default=2): the degree of the polynomial features. For example, with a single feature x and degree=3, it generates x, x^2, and x^3.
  • interaction_only (default=False): if True, only interaction features are produced (e.g., x1 * x2), avoiding pure powers like x1^2 and x2^2.
  • include_bias (default=True): if True, it adds a column of ones (bias column).

Important: since LinearRegression calculates the intercept automatically, we usually set include_bias=False to avoid redundancy.
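A tiny demo makes these parameters concrete. With two features x1 and x2, the generated columns are easy to read off:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two features so that interaction terms are visible
X = np.array([[2.0, 3.0]])

# degree=2 with bias: columns are 1, x1, x2, x1^2, x1*x2, x2^2
full = PolynomialFeatures(degree=2).fit_transform(X)
print(full)  # [[1. 2. 3. 4. 6. 9.]]

# interaction_only=True keeps only the cross term, dropping pure powers
inter = PolynomialFeatures(degree=2, interaction_only=True,
                           include_bias=False).fit_transform(X)
print(inter)  # [[2. 3. 6.]]
```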

Building the Polynomial Regression

Once we have the transformed features (X_poly), we can use the standard LinearRegression model.

from sklearn.linear_model import LinearRegression

y = df['Target']

# Initialize and train the model
model = LinearRegression()
model.fit(X_poly, y)
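As a quick sanity check, the following self-contained sketch (using noise-free synthetic data, an assumption rather than the course dataset) shows that the fitted model recovers the known intercept and coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Noise-free synthetic data with a known true curve (assumption):
# y = 1 + 2x - 3x^2
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = 1 + 2 * X[:, 0] - 3 * X[:, 0] ** 2

poly = PolynomialFeatures(degree=2, include_bias=False)
model = LinearRegression().fit(poly.fit_transform(X), y)

# The learned parameters match the true curve
print(model.intercept_, model.coef_)
```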

Predicting requires transforming the new data using the same transformer instance before passing it to the model:

# Transform new data
X_new_poly = poly.transform(X_new)
# Predict
y_pred = model.predict(X_new_poly)
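Keeping the transformer and the model in sync by hand is easy to get wrong. A common alternative is scikit-learn's make_pipeline, which applies the fitted PolynomialFeatures automatically on both fit and predict. A minimal sketch on synthetic data (an assumption, standing in for the course dataset):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Synthetic quadratic data (assumption): y = 1 + 2x + 3x^2
X = np.linspace(0, 1, 30).reshape(-1, 1)
y = 1 + 2 * X[:, 0] + 3 * X[:, 0] ** 2

# The pipeline transforms and fits in one object
pipe = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                     LinearRegression())
pipe.fit(X, y)

# predict() transforms the new data with the same fitted transformer
X_new = np.array([[0.5]])
print(pipe.predict(X_new))
```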

Full Example

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Load data
file_link = 'https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/poly.csv'
df = pd.read_csv(file_link)
X = df[['Feature']]
y = df['Target']

# 1. Create polynomial features
n = 2
poly = PolynomialFeatures(degree=n, include_bias=False)
X_poly = poly.fit_transform(X)

# 2. Train linear regression
model = LinearRegression()
model.fit(X_poly, y)

# 3. Predict on new data
X_new = np.linspace(-0.1, 1.5, 80).reshape(-1, 1)
X_new_poly = poly.transform(X_new)
y_pred = model.predict(X_new_poly)

# Visualization
plt.scatter(X, y, label='Data')
plt.plot(X_new, y_pred, color='red', label=f'Degree {n}')
plt.legend()
plt.show()

Try changing the degree (n) to see how the curve changes. You will notice that higher degrees fit the training data better but might behave erratically outside the range—this leads into the next chapter on Overfitting.
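This effect can be sketched on synthetic data (an assumption standing in for poly.csv): training R^2 never decreases as the degree grows, even once the extra flexibility is only chasing noise.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Noisy quadratic data (assumption): y = 1 + 2x - 3x^2 + noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 1.4, size=(60, 1))
y = 1 + 2 * X[:, 0] - 3 * X[:, 0] ** 2 + rng.normal(0, 0.2, size=60)

# Training R^2 climbs (or stays flat) as the degree increases
scores = {}
for n in (1, 2, 5, 10):
    pipe = make_pipeline(PolynomialFeatures(degree=n, include_bias=False),
                         LinearRegression())
    scores[n] = pipe.fit(X, y).score(X, y)
    print(f"degree={n:2d}  train R^2 = {scores[n]:.4f}")
```

Note that a rising training score says nothing about how the curve behaves between or beyond the data points, which is exactly the overfitting problem the next chapter examines.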
