Clustering Financial Instruments
Clustering is an unsupervised learning technique that groups observations by similarity without requiring labels. In finance, it is often used to group stocks or other instruments that behave alike, for example in their return profiles, volatility patterns, or risk exposures. These groups can reveal structure within the market and relationships between assets, and they can inform decisions about portfolio construction and risk management.
    import numpy as np
    from sklearn.cluster import KMeans

    # Hardcoded daily returns for 6 assets (rows: assets, columns: days)
    returns = np.array([
        [0.01, 0.012, 0.011, 0.013, 0.012],
        [0.009, 0.008, 0.01, 0.011, 0.009],
        [0.03, 0.028, 0.031, 0.027, 0.029],
        [0.031, 0.032, 0.028, 0.029, 0.03],
        [-0.01, -0.012, -0.011, -0.013, -0.012],
        [-0.009, -0.008, -0.01, -0.011, -0.009]
    ])

    # Cluster the assets into 3 groups using KMeans.
    # n_init is set explicitly because its default differs across scikit-learn versions.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
    labels = kmeans.fit_predict(returns)

    print("Cluster labels for each asset:", labels)
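The example above clusters on raw daily returns. The introduction also mentions volatility patterns, so here is a minimal sketch of clustering on summary features instead. It is an illustration under stated assumptions, not the lesson's own method: the return values are hypothetical, and the choice of two features (mean return and sample volatility) and of `StandardScaler` for rescaling is ours.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical daily returns for 6 assets (rows: assets, columns: days).
# Illustrative values only: two steady gainers, two volatile assets, two steady losers.
returns = np.array([
    [0.010, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.010, 0.011, 0.009],
    [0.050, -0.040, 0.060, -0.050, 0.045],
    [-0.040, 0.055, -0.045, 0.060, 0.035],
    [-0.010, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.010, -0.011, -0.009],
])

# One row per asset: [mean daily return, sample standard deviation of daily returns]
features = np.column_stack([
    returns.mean(axis=1),
    returns.std(axis=1, ddof=1),
])

# Standardize so that neither feature dominates the Euclidean distance
scaled = StandardScaler().fit_transform(features)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
feature_labels = kmeans.fit_predict(scaled)

print("Cluster labels from (mean, volatility) features:", feature_labels)
```

In this made-up data the two volatile assets move in opposite directions on most days, so their raw return vectors are far apart, yet they share a cluster here because their mean and volatility are nearly the same. Which representation is appropriate depends on what "similar behavior" should mean for the analysis.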
After clustering, each asset belongs to a group determined by how similar its return pattern is to those of the other assets. KMeans measures that similarity as the Euclidean distance between return vectors, so assets whose returns are of similar size on the same days end up together. Assets within the same cluster may respond similarly to market events or share underlying risk factors. In portfolio construction, clustering helps you avoid over-concentration in highly similar assets and encourages diversification by selecting from different clusters. Spreading exposure across distinct market behaviors can reduce portfolio risk and may support long-term performance.
    import pandas as pd

    # Continues from the previous example: uses `returns` and `labels` defined there
    asset_names = ['Asset_A', 'Asset_B', 'Asset_C', 'Asset_D', 'Asset_E', 'Asset_F']

    # Build a DataFrame of daily returns and attach names and cluster labels
    df = pd.DataFrame(returns, columns=[f"Day_{i+1}" for i in range(returns.shape[1])])
    df['Asset'] = asset_names
    df['Cluster'] = labels

    # Summarize cluster characteristics: mean return per day within each cluster
    cluster_summary = df.groupby('Cluster').mean(numeric_only=True)
    print("Cluster summary (average returns for each cluster):")
    print(cluster_summary)

    # List the members of each cluster
    print("\nAssets in each cluster:")
    for cluster in sorted(df['Cluster'].unique()):
        assets_in_cluster = df[df['Cluster'] == cluster]['Asset'].tolist()
        print(f"Cluster {cluster}: {assets_in_cluster}")
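To make the idea of "selecting from different clusters" concrete, the following sketch picks one representative asset per cluster. The selection rule (the member closest to its cluster center) is an assumption chosen for illustration; the lesson does not prescribe one, and the dictionary name `representatives` is hypothetical. The data is the same as in the lesson's example, repeated so the block runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans

asset_names = ['Asset_A', 'Asset_B', 'Asset_C', 'Asset_D', 'Asset_E', 'Asset_F']

# Same hardcoded daily returns as in the lesson (rows: assets, columns: days)
returns = np.array([
    [0.01, 0.012, 0.011, 0.013, 0.012],
    [0.009, 0.008, 0.01, 0.011, 0.009],
    [0.03, 0.028, 0.031, 0.027, 0.029],
    [0.031, 0.032, 0.028, 0.029, 0.03],
    [-0.01, -0.012, -0.011, -0.013, -0.012],
    [-0.009, -0.008, -0.01, -0.011, -0.009],
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(returns)

# transform() returns each asset's distance to every cluster center
distances = kmeans.transform(returns)

# Assumed selection rule: within each cluster, keep the member nearest its own center
representatives = {}
for cluster in range(kmeans.n_clusters):
    members = np.where(labels == cluster)[0]
    closest = members[np.argmin(distances[members, cluster])]
    representatives[cluster] = asset_names[closest]

print("One representative asset per cluster:", representatives)
```

With only two assets per cluster, both members sit at essentially the same distance from the center, so which one is chosen is not meaningful here. On larger universes you might instead rank cluster members by liquidity, cost, or another criterion relevant to the portfolio.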
1. What is the purpose of clustering in financial analysis?
2. Which scikit-learn class is used for KMeans clustering?
3. How can clustering help in portfolio diversification?