Dimensionality Reduction with PCA

Feature Selection vs. Feature Extraction

High-dimensional datasets often have more features than you need. You can reduce features using two main strategies: feature selection and feature extraction.

  • Feature selection means keeping only the most important original features, like picking your favorite fruits from a basket.
  • Feature extraction creates new features by combining or transforming the originals, like blending all the fruits into a smoothie.

Principal Component Analysis (PCA) is a common feature extraction method, which you will explore in detail later.

import pandas as pd
from sklearn.decomposition import PCA

# Example dataset
data = {
    'height': [150, 160, 170, 180],
    'weight': [50, 60, 70, 80],
    'age': [20, 25, 30, 35],
    'score': [85, 90, 95, 100]
}
df = pd.DataFrame(data)

# Feature selection: pick only 'height' and 'weight'
selected_features = df[['height', 'weight']]
print("Selected features (feature selection):")
print(selected_features)

# Feature extraction: combine features using PCA (placeholder, details later)
pca = PCA(n_components=2)
extracted_features = pca.fit_transform(df)
print("\nExtracted features (feature extraction, via PCA):")
print(extracted_features)
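Feature selection does not have to be done by hand, either. As a minimal sketch (assuming scikit-learn is available, and reusing the same toy data; the threshold value of 50 is an arbitrary choice for this example), you could keep only the features whose variance exceeds a cutoff:

import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Same toy dataset as above
df = pd.DataFrame({
    'height': [150, 160, 170, 180],
    'weight': [50, 60, 70, 80],
    'age': [20, 25, 30, 35],
    'score': [85, 90, 95, 100]
})

# Keep only features whose variance exceeds the threshold;
# the result still contains original features, not new ones
selector = VarianceThreshold(threshold=50)
selected = selector.fit_transform(df)
print("Columns kept:", df.columns[selector.get_support()].tolist())
print(selected)

Here 'height' and 'weight' survive because they vary more than 'age' and 'score' in this tiny dataset; the key point is that selection only drops columns, it never creates new ones.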
Note

PCA is a powerful feature extraction technique that creates new features (principal components) from your original data. The details of how PCA works will be covered in upcoming chapters.

Reducing the number of features can help you see patterns that might be hidden in higher dimensions. Using visualization, you can plot selected features to reveal clusters or trends more clearly. For instance, plotting only the most relevant features with seaborn can make relationships in your data stand out, making it easier to interpret and analyze.
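For example, here is a minimal sketch (assuming seaborn and matplotlib are installed, and reusing the same toy df from the example above) that plots just the two selected features:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same toy dataset as above
df = pd.DataFrame({
    'height': [150, 160, 170, 180],
    'weight': [50, 60, 70, 80],
    'age': [20, 25, 30, 35],
    'score': [85, 90, 95, 100]
})

# Plot only the two selected features to look for trends or clusters
sns.scatterplot(data=df, x='height', y='weight')
plt.title('Selected features: height vs. weight')
plt.show()

With only two features on the axes, any relationship between them is immediately visible, which is much harder to achieve when all dimensions are plotted at once.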

Question

Which statement best describes the difference between feature selection and feature extraction in dimensionality reduction?
