Cluster Analysis in Python

February vs July Average Temperatures

As you may remember, clustering problems have no single correct answer. Based on the previous task, 5 clusters looks like a reasonable choice.
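The previous task is not shown on this page, so the following is only a sketch of one common heuristic for judging a cluster count: comparing the K-Means inertia (within-cluster sum of squares) for several values of k and looking for the point where the decrease levels off. The data here is synthetic, generated with `make_blobs`, and the range of k values is an arbitrary assumption; it is not the course dataset.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data (assumption): 300 points with 12 features around 5 centers
X, _ = make_blobs(n_samples=300, n_features=12, centers=5, random_state=0)

# Fit K-Means for several cluster counts and record the inertia of each fit
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# Print the values; a sharp drop followed by a plateau hints at a reasonable k
for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

Inertia tends to keep decreasing as k grows, so the number where the curve flattens is a judgment call rather than a definitive answer.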

Let's visualize the result of clustering into 5 groups with a scatter plot of average February vs July temperatures, which are among the coldest and hottest months of the year, respectively.

Task

  1. Create a KMeans model named model with 5 clusters.
  2. Fit model to the numerical columns of data (column indices 2 through 13).
  3. Add a 'prediction' column to the data DataFrame containing the labels predicted by model.
  4. Build a scatter plot of average 'Feb' vs 'Jul' temperatures, coloring each point by the 'prediction' column of the data DataFrame.

Solution

# Import the libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Create KMeans model
model = KMeans(n_clusters = 5)

# Fit the data to model
model.fit(data.iloc[:,2:14])

# Predict the labels
data['prediction'] = model.predict(data.iloc[:,2:14])

# Visualize the results
sns.scatterplot(x = 'Feb', y = 'Jul', hue = 'prediction', data = data)
plt.show()
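To read the scatter plot more easily, it can help to look at the average 'Feb' and 'Jul' temperature of each cluster. The sketch below does this on a synthetic stand-in for the course data, so it runs without downloading the CSV. The 'City' and 'Country' columns, the month names other than 'Feb' and 'Jul', and the `random_state` and `n_init` values are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in data (assumption): two descriptive columns followed by
# 12 monthly average temperatures, so iloc[:, 2:14] selects the months
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
rng = np.random.default_rng(0)
n_cities = 60
base = rng.uniform(-10, 25, size=(n_cities, 1))       # yearly mean per city
amplitude = rng.uniform(2, 15, size=(n_cities, 1))    # seasonal swing per city
season = -np.cos(2 * np.pi * (np.arange(12) - 0.5) / 12)  # low in winter, high in summer
temps = base + amplitude * season + rng.normal(0, 1, size=(n_cities, 12))

data = pd.DataFrame(temps, columns=months)
data.insert(0, 'Country', 'Unknown')                              # hypothetical column
data.insert(0, 'City', [f'city_{i}' for i in range(n_cities)])    # hypothetical column

# Fit K-Means on the 12 monthly columns and store the labels;
# random_state is fixed only to make the sketch reproducible
model = KMeans(n_clusters=5, n_init=10, random_state=0)
data['prediction'] = model.fit_predict(data.iloc[:, 2:14])

# Average February and July temperature per cluster
summary = data.groupby('prediction')[['Feb', 'Jul']].mean().round(1)
print(summary)
```

Here `fit_predict` is used as a shortcut for fitting the model and labeling the same rows in one call. The cluster numbers themselves carry no meaning; only the grouping does.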


Section 1. Chapter 7
# Import the libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Create KMeans model
model = ___(___ = ___)

# Fit the data to model
___.___(data.iloc[:,2:14])

# Predict the labels
data['prediction'] = ___.___(data.___[___])

# Visualize the results
sns.scatterplot(x = '___', y = '___', hue = '___', data = ___)
plt.show()