Learn Learning Efficiency Curves | Applied AL Concepts
Active Learning with Python

Learning Efficiency Curves

Understanding how efficiently an Active Learning (AL) system improves with more labeled data is crucial for evaluating its effectiveness. Learning curves provide a visual tool for this purpose: they plot model accuracy (or another performance metric) against the number of labeled samples acquired during AL iterations. These curves help you see how quickly your model benefits from new information, and how much data is needed to reach a desired level of performance. In AL, the goal is to achieve high accuracy with as few labeled samples as possible, so the shape and steepness of your learning curve can reveal how well your sampling strategy is working.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Seeded generator so the random draws (and the resulting curve) are reproducible
    rng = np.random.default_rng(42)

    # Simulate a pool of unlabeled data
    X, y = make_classification(n_samples=1200, n_features=20, n_informative=15, n_classes=2, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Start with a small labeled set
    initial_idx = rng.choice(len(X_train), size=20, replace=False)
    labeled_idx = list(initial_idx)
    unlabeled_idx = list(set(range(len(X_train))) - set(labeled_idx))

    accuracies = []
    labeled_set_sizes = []

    # Simulate AL iterations
    for i in range(10):
        # Retrain on the current labeled set and score on the held-out test set
        clf = RandomForestClassifier(random_state=42)
        clf.fit(X_train[labeled_idx], y_train[labeled_idx])
        y_pred = clf.predict(X_test)
        acc = accuracy_score(y_test, y_pred)
        accuracies.append(acc)
        labeled_set_sizes.append(len(labeled_idx))

        # Select 20 most uncertain samples (simulate with random selection here)
        if len(unlabeled_idx) >= 20:
            new_samples = rng.choice(unlabeled_idx, size=20, replace=False)
            labeled_idx.extend(new_samples)
            unlabeled_idx = list(set(unlabeled_idx) - set(new_samples))

    # Plot accuracy against the number of labeled samples
    plt.plot(labeled_set_sizes, accuracies, marker='o')
    plt.xlabel('Number of Labeled Samples')
    plt.ylabel('Accuracy')
    plt.title('Learning Curve: Accuracy vs. Labeled Set Size')
    plt.grid(True)
    plt.show()
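The loop above uses random selection as a stand-in for an uncertainty-based query. As a minimal sketch of what a real query step might look like, the helper below picks the pool samples whose predicted-class probability is lowest (least-confidence sampling). The function name `select_most_uncertain` and the small 300-sample dataset are assumptions made for illustration; they are not part of the lesson's code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def select_most_uncertain(clf, X_pool, pool_idx, batch_size=20):
    """Return the `batch_size` indices from `pool_idx` with the lowest top-class probability."""
    proba = clf.predict_proba(X_pool[pool_idx])
    confidence = proba.max(axis=1)   # probability of the predicted class
    order = np.argsort(confidence)   # least confident samples first
    return [pool_idx[i] for i in order[:batch_size]]


# Small illustrative pool (hypothetical sizes, chosen to keep the example fast)
X, y = make_classification(n_samples=300, n_features=20, n_informative=15,
                           n_classes=2, random_state=42)
labeled_idx = list(range(20))
unlabeled_idx = list(range(20, len(X)))

clf = RandomForestClassifier(random_state=42).fit(X[labeled_idx], y[labeled_idx])

# One query step: move the 20 least-confident samples into the labeled set
new_samples = select_most_uncertain(clf, X, unlabeled_idx, batch_size=20)
labeled_idx.extend(new_samples)
chosen = set(new_samples)
unlabeled_idx = [i for i in unlabeled_idx if i not in chosen]
```

In the simulation loop, a call like this could take the place of the random draw, so the plotted curve would reflect an actual uncertainty-sampling strategy rather than a random baseline.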
Note

A learning curve in Active Learning shows how efficiently a model improves as more labeled data is added. A steep curve means rapid accuracy gains from each new label, which is ideal. A flat curve suggests new labels add little value. Comparing the curves of different AL strategies, measured on the same test set, shows which strategy reaches a given accuracy with fewer labels.
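To make a comparison between curves concrete, it can help to reduce each curve to a number or two. The sketch below shows two possible summaries: the normalized area under the learning curve (the average accuracy over the labeling budget) and the number of labels needed to first reach a target accuracy. The helper names and the accuracy values are made up purely for illustration; they are not measured results from any strategy.

```python
import numpy as np


def normalized_area(sizes, accuracies):
    """Trapezoidal area under the curve divided by the label range (average accuracy)."""
    sizes = np.asarray(sizes, dtype=float)
    accuracies = np.asarray(accuracies, dtype=float)
    widths = sizes[1:] - sizes[:-1]
    heights = (accuracies[1:] + accuracies[:-1]) / 2
    return float(np.sum(widths * heights) / (sizes[-1] - sizes[0]))


def labels_to_reach(sizes, accuracies, target):
    """Smallest labeled-set size at which accuracy first meets `target`, or None."""
    for size, acc in zip(sizes, accuracies):
        if acc >= target:
            return size
    return None


# Illustrative values only (hypothetical, not measured)
sizes = [20, 40, 60, 80, 100]
random_acc = [0.60, 0.66, 0.70, 0.73, 0.75]
uncertainty_acc = [0.60, 0.70, 0.76, 0.79, 0.80]

for name, acc in [("random", random_acc), ("uncertainty", uncertainty_acc)]:
    print(name,
          "normalized area:", round(normalized_area(sizes, acc), 4),
          "labels to reach 0.75:", labels_to_reach(sizes, acc, 0.75))
```

A higher normalized area and a lower labels-to-target count both point to a more label-efficient strategy, provided the curves were produced on the same data split and labeling budget.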

1. What does a steeper learning curve indicate in the context of Active Learning?

2. Which metric is most relevant for comparing AL strategies?

Section 3. Chapter 2
