Learn Comparing the Dynamics | K-Medoids Algorithm
Cluster Analysis in Python

Comparing the Dynamics

That's an interesting result! The yearly average temperatures differ markedly for three of the clusters (47.3, 60.9, and 79.24), so this looks like a good split.

Now let's visualize the monthly dynamics of average temperatures across clusters and compare the result with the 5 clusters produced by the K-Means algorithm. The corresponding line plot is below.

Task


Visualize the monthly temperature dynamics across clusters. Follow these steps:

  1. Import the KMedoids class from sklearn_extra.cluster.
  2. Create a KMedoids object named model with 4 clusters.
  3. Fit columns 3-15 of data (these are positions, not indices) to model.
  4. Add a 'prediction' column to data containing the labels predicted by model.
  5. Calculate the monthly averages using data and save the result in the d DataFrame:
     • Group the observations by the 'prediction' column.
     • Calculate the mean values.
     • Stack the columns into indices (already done).
     • Reset the indices.
  6. Assign ['Group', 'Month', 'Temp'] as the column names of d.
  7. Build a line plot with 'Month' on the x-axis and 'Temp' on the y-axis for each 'Group' of the d DataFrame (i.e. a separate line and color for each 'Group').
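
The reshaping chain in the monthly-averages step can be sketched on a small made-up table. The frame below, its three month columns, and its temperature values are hypothetical stand-ins for the real dataset; only the sequence of pandas calls mirrors the task.

```python
import pandas as pd

# Hypothetical toy data: three "month" columns and a cluster label per row.
toy = pd.DataFrame({
    'Jan': [30.0, 34.0, 60.0, 64.0],
    'Feb': [32.0, 36.0, 62.0, 66.0],
    'Mar': [40.0, 44.0, 70.0, 74.0],
    'prediction': [0, 0, 1, 1],
})

# Mean of each month within each cluster:
# one row per cluster, one column per month (wide format).
wide = toy.groupby('prediction').mean()

# stack() moves the month columns into a second index level (long format);
# reset_index() turns both index levels back into ordinary columns.
d = wide.stack().reset_index()
d.columns = ['Group', 'Month', 'Temp']

print(d)
```

The long format, with one row per (Group, Month) pair, is the shape a line plot with a hue variable expects: one column for the x-axis, one for the y-axis, and one that separates the lines.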

Solution

# Import the libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn_extra.cluster import KMedoids

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Create model
model = KMedoids(n_clusters = 4)

# Fit the data to model
model.fit(data.iloc[:,2:-1])

# Add new column to DataFrame
data['prediction'] = model.predict(data.iloc[:,2:-1])

# Extract the list of the columns
col = list(data.columns[2:14])
col.append('prediction')

# Calculate the monthly averages for each cluster
d = data[col].groupby('prediction').mean().stack().reset_index()

# Assign new column names
d.columns = ['Group', 'Month', 'Temp']

# Visualize the results
sns.lineplot(x = 'Month', y = 'Temp', hue = 'Group', data = d)
plt.show()
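
The solution selects the feature columns by position with iloc. Here is a minimal sketch of how the slice 2:-1 behaves, assuming a hypothetical column layout (the column names and values below are invented for illustration and are not taken from the real CSV):

```python
import pandas as pd

# Hypothetical layout: two descriptive columns, three month columns,
# and one trailing column that should not be used as a feature.
toy = pd.DataFrame(
    [['A', 'X', 30.0, 32.0, 40.0, 34.0]],
    columns=['City', 'Country', 'Jan', 'Feb', 'Mar', 'Year'],
)

# iloc slices by position: start at position 2 (the third column)
# and stop before the last column (the end position -1 is excluded).
features = toy.iloc[:, 2:-1]
selected = list(features.columns)

print(selected)  # ['Jan', 'Feb', 'Mar']
```

In the solution, the right-hand side of the 'prediction' assignment is evaluated before the new column exists, so fit and predict receive the same set of columns.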


Section 2. Chapter 6

# Import the libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from ___ import ___

# Read the data
data = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/138ab9ad-aa37-4310-873f-0f62abafb038/Cities+weather.csv', index_col = 0)

# Create model
model = ___(___)

# Fit the data to model
___.___(data.iloc[:,2:-1])

# Add new column to DataFrame
data['prediction'] = ___.___(data.iloc[:,2:-1])

# Extract the list of the columns
col = list(data.columns[2:14])
col.append('prediction')

# Calculate the monthly averages for each cluster
d = data[col].___('prediction').___().stack().___()

# Assign new column names
d.___ = ['Group', 'Month', 'Temp']

# Visualize the results
sns.___(x = '___', y = '___', hue = '___', data = ___)
___
