Learn Clustering Financial Instruments | Machine Learning for FinTech

Python for FinTech


Clustering is a powerful unsupervised learning technique that plays an important role in financial analysis. In finance, clustering is often used to group together stocks or other financial instruments that exhibit similar behavior, such as comparable return profiles, volatility patterns, or risk exposures. By identifying these groups, you can uncover hidden structure within the market, discover relationships between assets, and make more informed decisions about portfolio construction or risk management.


import numpy as np
from sklearn.cluster import KMeans

# Hardcoded daily returns for 6 assets (rows: assets, columns: days)
returns = np.array([
    [0.01, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.01, 0.011, 0.009],
    [0.03, 0.028, 0.031, 0.027, 0.029],
    [0.031, 0.032, 0.028, 0.029, 0.03],
    [-0.01, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.01, -0.011, -0.009]
])

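# Cluster the assets into 3 groups using KMeans
# n_init is set explicitly so the result does not depend on the scikit-learn version's default
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)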
labels = kmeans.fit_predict(returns)

print("Cluster labels for each asset:", labels)
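The example above clusters on the raw daily returns. Since the introduction also mentions volatility patterns, a variation is to cluster on summary features instead. The following is a minimal sketch, assuming two features per asset (mean daily return and the standard deviation of daily returns) and standardization with `StandardScaler`; the feature choice and the name `feature_labels` are illustrative assumptions, not part of the lesson.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Same hardcoded daily returns as in the lesson (rows: assets, columns: days)
returns = np.array([
    [0.01, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.01, 0.011, 0.009],
    [0.03, 0.028, 0.031, 0.027, 0.029],
    [0.031, 0.032, 0.028, 0.029, 0.03],
    [-0.01, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.01, -0.011, -0.009]
])

# One row per asset: [mean daily return, volatility (std of daily returns)]
# Assumption: these two features are chosen only for illustration
features = np.column_stack([returns.mean(axis=1), returns.std(axis=1)])

# Standardize so that neither feature dominates the Euclidean distance
scaled = StandardScaler().fit_transform(features)

feature_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(scaled)
print("Cluster labels from (mean, volatility) features:", feature_labels)
```

On this small dataset the grouping should match the one obtained from raw returns, because the three pairs of assets are well separated in either representation. With real data the two approaches can disagree.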

After clustering, each asset is assigned to a group based on the similarity of its return pattern. Because KMeans measures Euclidean distance between the return vectors, the assets in this example are grouped mainly by the level of their daily returns. Interpreting the clusters can provide useful insights: assets within the same cluster may respond similarly to market events or share underlying risk factors. In portfolio construction, clustering helps you avoid over-concentration in highly similar assets and encourages diversification by selecting from different clusters. Spreading exposure across distinct market behaviors can reduce portfolio risk and may improve long-term risk-adjusted performance.
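The idea of selecting from different clusters can be sketched as follows. This is a minimal illustration, assuming a deliberately naive rule (take the asset with the highest mean daily return in each cluster); the rule and the names `summary` and `selected` are hypothetical and only show the mechanics, not a recommended selection method.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

asset_names = ['Asset_A', 'Asset_B', 'Asset_C', 'Asset_D', 'Asset_E', 'Asset_F']

# Same hardcoded daily returns as in the lesson (rows: assets, columns: days)
returns = np.array([
    [0.01, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.01, 0.011, 0.009],
    [0.03, 0.028, 0.031, 0.027, 0.029],
    [0.031, 0.032, 0.028, 0.029, 0.03],
    [-0.01, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.01, -0.011, -0.009]
])

labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(returns)

summary = pd.DataFrame({
    "Asset": asset_names,
    "Cluster": labels,
    "MeanReturn": returns.mean(axis=1),
})

# Assumed rule: within each cluster, keep the row with the highest mean return
best_rows = summary.groupby("Cluster")["MeanReturn"].idxmax()
selected = summary.loc[best_rows, "Asset"].tolist()

print("One asset per cluster:", selected)
```

The result contains exactly one asset from each cluster, so no two selected assets share the same return pattern. In practice the per-cluster rule would be replaced by whatever criterion fits the portfolio's objective.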


import pandas as pd

asset_names = ['Asset_A', 'Asset_B', 'Asset_C', 'Asset_D', 'Asset_E', 'Asset_F']
# Assign cluster labels to a DataFrame
df = pd.DataFrame(returns, columns=[f"Day_{i+1}" for i in range(returns.shape[1])])
df['Asset'] = asset_names
df['Cluster'] = labels

# Summarize cluster characteristics
cluster_summary = df.groupby('Cluster').mean(numeric_only=True)
print("Cluster summary (average returns for each cluster):")
print(cluster_summary)
print("\nAssets in each cluster:")
for cluster in sorted(df['Cluster'].unique()):
    assets_in_cluster = df[df['Cluster'] == cluster]['Asset'].tolist()
    print(f"Cluster {cluster}: {assets_in_cluster}")
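The examples above fix the number of clusters at 3. One common way to check such a choice is the silhouette score, which compares how close each point is to its own cluster versus the nearest other cluster. The following is a minimal sketch, assuming `silhouette_score` from scikit-learn and a candidate range of 2 to 5 clusters; the names `scores` and `best_k` are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Same hardcoded daily returns as in the lesson (rows: assets, columns: days)
returns = np.array([
    [0.01, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.01, 0.011, 0.009],
    [0.03, 0.028, 0.031, 0.027, 0.029],
    [0.031, 0.032, 0.028, 0.029, 0.03],
    [-0.01, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.01, -0.011, -0.009]
])

scores = {}
# The silhouette score needs at least 2 clusters and at most n_samples - 1
for k in range(2, returns.shape[0]):
    labels_k = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(returns)
    scores[k] = silhouette_score(returns, labels_k)

best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("Best number of clusters by silhouette score:", best_k)
```

A higher score indicates tighter, better separated clusters. With only six assets this is a toy check; on real data the score is one input among several when choosing the number of clusters.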


Section 3. Chapter 4


Clustering Financial Instruments

1. What is the purpose of clustering in financial analysis?

2. Which scikit-learn class is used for k-means clustering?

3. How can clustering help in portfolio diversification?