Supervised Learning Essentials

Challenge: Predicting House Prices

You will now build a regression model on a real-world example. The file houses_simple.csv holds house prices, with the area of each house (the square_feet column) as the only feature.

    import pandas as pd

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')
    print(df.head())

The next step is to assign variables and visualize the dataset:

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')

    X = df['square_feet']
    y = df['price']

    plt.scatter(X, y, alpha=0.5)
    plt.show()

In the example with a person's height, it was much easier to imagine a line fitting the data well.

Here the data has much more variance, because the price also depends heavily on things that are not in the dataset, such as age, location, and interior. Even so, the task is to build the line that best fits the data we have; it will show the overall trend. Use the LinearRegression class from scikit-learn for that.
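To make "the line that best fits the data" concrete: LinearRegression fits by ordinary least squares, and with a single feature the slope and intercept have a simple closed form. The sketch below computes them in plain Python. The function name fit_line and the four data points are made up for illustration; they are not taken from houses_simple.csv.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels out)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical areas (square feet) and prices, chosen only for illustration
areas = [1000, 1500, 2000, 2500]
prices = [210000, 290000, 410000, 490000]

slope, intercept = fit_line(areas, prices)
print(f"price is roughly {slope:.1f} * square_feet + {intercept:.1f}")
```

The points do not lie exactly on the fitted line, and with noisy data such as house prices they never will. The line only summarizes the trend.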

Task


  1. Assign the 'price' column of df to y.
  2. Create the X_reshaped variable by reshaping X into a 2D array using .values.reshape(-1, 1).
  3. Initialize the LinearRegression model and train it using X_reshaped and y.
  4. Create X_new_reshaped by reshaping X_new the same way.
  5. Predict the target for X_new_reshaped.
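The five steps above might look roughly like the sketch below. It is not the official solution. To stay self-contained it uses a small synthetic DataFrame in place of houses_simple.csv, and the values in X_new are hypothetical, since the original starter code that defines X_new is not shown here.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for houses_simple.csv (made-up values, same column names)
df = pd.DataFrame({
    'square_feet': [850, 1200, 1500, 1800, 2400],
    'price': [170000, 240000, 300000, 360000, 480000],
})

X = df['square_feet']
y = df['price']                                # step 1

# scikit-learn expects a 2D feature array of shape (n_samples, n_features)
X_reshaped = X.values.reshape(-1, 1)           # step 2

model = LinearRegression()
model.fit(X_reshaped, y)                       # step 3

# Hypothetical new areas to predict prices for
X_new = pd.Series([1000, 2000])
X_new_reshaped = X_new.values.reshape(-1, 1)   # step 4

y_pred = model.predict(X_new_reshaped)         # step 5
print(y_pred)
```

The reshape is needed because a single pandas column is one-dimensional, while fit and predict expect one row per sample and one column per feature. In reshape(-1, 1), the -1 lets NumPy infer the number of rows.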


Section 1. Chapter 4