Learn Clustering for Anomaly Detection | Engineering Data Science Applications
Python for Engineers

Clustering for Anomaly Detection

Clustering groups similar data points so that you can uncover patterns in complex datasets without knowing their structure in advance. In engineering applications, it is especially useful for anomaly detection: identifying unusual sensor readings that could indicate faulty equipment, abnormal operating conditions, or the need for maintenance. Points that sit far from every group, or that form a tiny group of their own, are outliers, and these are often the very anomalies engineers need to find.

```python
import numpy as np
from sklearn.cluster import KMeans

# Simulated vibration sensor data from a machine (in mm/s)
# Most readings are normal, but a few are unusually high or low
vibration_data = np.array([
    [2.1], [2.3], [2.2], [2.0], [2.4], [2.3], [2.2], [2.5], [2.1], [2.2],
    [8.0],  # Possible anomaly (very high)
    [1.9], [2.0], [2.1], [2.2], [2.3], [2.2], [2.1], [2.4], [2.3],
    [0.5],  # Possible anomaly (very low)
    [2.2], [2.3], [2.1], [2.4], [2.2]
])

# Cluster into 2 groups (normal and abnormal)
# n_init=10 runs several initializations and keeps the best fit
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(vibration_data)

# Print cluster centers and labels
print("Cluster centers:", kmeans.cluster_centers_.flatten())
print("Labels:", labels)
```

After clustering the vibration sensor data, interpret the results by examining the cluster centers and the label assigned to each data point. Each cluster center is the mean vibration level of its group. In this example, one center should land near the normal operating level (a little above 2 mm/s) and the other far away, at the 8.0 mm/s reading. Points assigned to the cluster whose center lies far from the norm are suspicious. Note that two clusters do not separate every anomaly: the low 0.5 mm/s reading is closer to the normal center than to the high one, so it typically receives the normal label. Cluster labels are therefore a first pass, and a reading's distance from the normal center is a useful second check for spotting a problem with the machine.
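One way to review which readings belong to which cluster is to summarize each group by its center, size, and value range. The following is a minimal sketch on the same simulated data; the helper name `summarize_clusters` is hypothetical and not part of scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same simulated vibration readings (mm/s), as a column vector
vibration_data = np.array([
    2.1, 2.3, 2.2, 2.0, 2.4, 2.3, 2.2, 2.5, 2.1, 2.2,
    8.0,
    1.9, 2.0, 2.1, 2.2, 2.3, 2.2, 2.1, 2.4, 2.3,
    0.5,
    2.2, 2.3, 2.1, 2.4, 2.2,
]).reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(vibration_data)


def summarize_clusters(data, labels, centers):
    """Hypothetical helper: center, size, and value range per cluster."""
    summary = {}
    for k in range(len(centers)):
        members = data[labels == k].ravel()
        summary[k] = {
            "center": float(centers[k, 0]),
            "size": int(members.size),
            "min": float(members.min()),
            "max": float(members.max()),
        }
    return summary


summary = summarize_clusters(vibration_data, labels, kmeans.cluster_centers_)
for k, info in summary.items():
    print(f"Cluster {k}: center={info['center']:.2f} mm/s, "
          f"size={info['size']}, range=[{info['min']}, {info['max']}]")
```

A very small cluster, or a cluster whose range is much wider than the sensor's usual spread, is a hint that the group contains readings worth a closer look.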

```python
import numpy as np
from sklearn.cluster import KMeans

# Same vibration data as before
vibration_data = np.array([
    [2.1], [2.3], [2.2], [2.0], [2.4], [2.3], [2.2], [2.5], [2.1], [2.2],
    [8.0],
    [1.9], [2.0], [2.1], [2.2], [2.3], [2.2], [2.1], [2.4], [2.3],
    [0.5],
    [2.2], [2.3], [2.1], [2.4], [2.2]
])

# Fit KMeans
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(vibration_data)

# Treat the largest cluster as normal operation and measure every reading's
# distance to its center. Distance to a point's own center would be 0 for an
# anomaly that forms a cluster by itself (here, the 8.0 reading).
normal_cluster = np.bincount(labels).argmax()
normal_center = kmeans.cluster_centers_[normal_cluster, 0]
distances = np.abs(vibration_data.flatten() - normal_center)

# Find the indices of the farthest points (potential anomalies)
anomaly_indices = distances.argsort()[-2:]  # Top 2 farthest points

print("Potential anomalies at indices:", anomaly_indices)
print("Anomalous vibration values:", vibration_data[anomaly_indices].flatten())
```
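Taking a fixed number of farthest points assumes you already know how many anomalies the data contains. An alternative sketch is to flag every reading whose distance from the normal cluster center exceeds a tolerance. The tolerance `max_deviation = 1.0` mm/s below is an assumed value chosen for this simulated data, not a recommendation from the lesson; in practice it would come from the equipment's specifications or historical readings.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same simulated vibration readings (mm/s), as a column vector
vibration_data = np.array([
    2.1, 2.3, 2.2, 2.0, 2.4, 2.3, 2.2, 2.5, 2.1, 2.2,
    8.0,
    1.9, 2.0, 2.1, 2.2, 2.3, 2.2, 2.1, 2.4, 2.3,
    0.5,
    2.2, 2.3, 2.1, 2.4, 2.2,
]).reshape(-1, 1)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(vibration_data)

# Assumption: the largest cluster represents normal operation
normal_cluster = np.bincount(labels).argmax()
normal_center = kmeans.cluster_centers_[normal_cluster, 0]

# Assumed tolerance in mm/s (hypothetical value for this example)
max_deviation = 1.0

# Flag every reading farther than the tolerance from the normal center
distances = np.abs(vibration_data.ravel() - normal_center)
anomaly_indices = np.where(distances > max_deviation)[0]

print("Normal center (mm/s):", round(float(normal_center), 2))
print("Flagged indices:", anomaly_indices)
print("Flagged values:", vibration_data[anomaly_indices].ravel())
```

With a threshold, the number of flagged readings adapts to the data: a clean batch yields no flags, while a batch with several faults yields several.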

1. What is the purpose of clustering in engineering data analysis?

2. Which scikit-learn class is used for KMeans clustering?

3. How can clustering help identify faulty equipment?


Section 3. Chapter 2
