Challenge: Predicting House Prices
You will now build a regression model on a real-world example. The file houses_simple.csv holds housing prices, with the area of each house as a feature.
    import pandas as pd

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')
    print(df.head())
The next step is to assign variables and visualize the dataset:
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/b22d1166-efda-45e8-979e-6c3ecfc566fc/houses_simple.csv')
    X = df['square_feet']
    y = df['price']

    plt.scatter(X, y, alpha=0.5)
    plt.show()
In the example with a person's height, it was much easier to imagine a line fitting the data well.
Here the data is far more scattered, because the price depends heavily on many things besides area, such as age, location, and interior.
Still, the task is to build the line that best fits the data we have; it will show the overall trend. Use the LinearRegression class from scikit-learn for that.
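To see why a fitted line is still useful on scattered data, here is a minimal sketch on synthetic data. It does not use houses_simple.csv: the "true" relationship (150 per square foot plus 50,000), the noise level, and the random seed are all assumptions made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for housing data (NOT the real dataset).
# Assumed relationship: price = 150 * area + 50_000, plus large noise
# that plays the role of age, location, interior, and so on.
square_feet = rng.uniform(500, 3500, size=200)
noise = rng.normal(0, 100_000, size=200)
price = 150 * square_feet + 50_000 + noise

# scikit-learn expects a 2D feature array of shape (n_samples, n_features).
X_2d = square_feet.reshape(-1, 1)

model = LinearRegression()
model.fit(X_2d, price)

slope = model.coef_[0]          # estimated price change per square foot
intercept = model.intercept_    # estimated price at zero area
r2 = model.score(X_2d, price)   # share of price variance explained by area

print(f"slope: {slope:.1f}, intercept: {intercept:.0f}, R^2: {r2:.2f}")
```

Even with heavy noise, the estimated slope should land near the assumed value of 150, while R^2 stays well below 1: the line recovers the trend, but individual predictions remain imprecise.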
- Assign the 'price' column of df to y.
- Create the X_reshaped variable by reshaping X into a 2D array using .values.reshape(-1, 1).
- Initialize the LinearRegression model and train it using X_reshaped and y.
- Create X_new_reshaped by reshaping X_new the same way.
- Predict the target for X_new_reshaped.
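The steps above can be sketched as follows. This is one possible approach, not the official solution. To keep it runnable offline, a tiny made-up DataFrame stands in for houses_simple.csv. The values in it are invented, and X_new is a hypothetical NumPy array of areas, since the page does not show how X_new is defined.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-in for houses_simple.csv (illustrative values only).
df = pd.DataFrame({
    "square_feet": [850, 1200, 1500, 1800, 2400, 3000],
    "price": [110_000, 150_000, 210_000, 230_000, 330_000, 390_000],
})

X = df["square_feet"]
y = df["price"]  # step 1: the target column

# Step 2: a pandas Series is 1D; reshape(-1, 1) turns it into one column
# with as many rows as needed, the 2D shape scikit-learn expects.
X_reshaped = X.values.reshape(-1, 1)

# Step 3: initialize the model and train it.
model = LinearRegression()
model.fit(X_reshaped, y)

# Step 4: X_new is hypothetical here. It is already a NumPy array, so
# .reshape is called directly; for a pandas Series use .values.reshape(-1, 1).
X_new = np.array([1000, 2000, 2800])
X_new_reshaped = X_new.reshape(-1, 1)

# Step 5: predict prices for the new areas.
y_pred = model.predict(X_new_reshaped)
print(y_pred)
```

Passing the 1D X straight to fit would raise an error asking for a 2D array, which is why the reshape step comes before training and again before predicting.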