Evaluate the Model
In this challenge, you are given the good old housing dataset, but this time only with the 'age' feature.
import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_poly.csv')
print(df.head())
Let's build a scatterplot of this data.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_poly.csv')
X = df['age']
y = df['price']
plt.scatter(X, y, alpha=0.4)
plt.show()
A straight line is not a great fit for this data: the price is higher for both brand-new and really old houses. A parabola looks like a better choice, and fitting one is what you will do in this challenge.
But before you start, recall the PolynomialFeatures class. The fit_transform(X) method requires X to be a 2-D array (or a DataFrame). Using X = df[['column_name']] makes X suitable for fit_transform(). And if you have a 1-D array, use .reshape(-1, 1) to make a 2-D array with the same contents.
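The shape requirement is easier to see on a small example. The sketch below uses a tiny made-up DataFrame (the values are illustrative and are not taken from the housing dataset) to compare the 1-D and 2-D forms and to show what PolynomialFeatures returns for degree 2.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical stand-in data; the real challenge uses the 'age' column of df
df = pd.DataFrame({'age': [1, 5, 20, 60, 100]})

X_series = df['age']      # single brackets give a 1-D Series
X_frame = df[['age']]     # double brackets give a 2-D DataFrame

# A 1-D array becomes a 2-D column with the same contents via reshape(-1, 1)
X_array = X_series.to_numpy().reshape(-1, 1)

# With degree=2 and the default include_bias=True,
# the output columns are [1, age, age**2]
X_tilde = PolynomialFeatures(degree=2).fit_transform(X_frame)

print(X_series.shape, X_frame.shape, X_array.shape, X_tilde.shape)
print(X_tilde)
```

Passing X_frame or X_array to fit_transform() should give the same matrix, since both hold the same values in a single column.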
The task is to build a Polynomial Regression of degree 2 using PolynomialFeatures and OLS.
- Assign the X variable to a DataFrame containing the column 'age'.
- Create an X_tilde matrix using the PolynomialFeatures class.
- Build and train a Polynomial Regression model.
- Reshape X_new to be a 2-D array.
- Preprocess X_new the same way as X.
- Print the model's parameters.
Solution