Feature Scaling | Clustering
Clustering Demystified

Feature Scaling

Feature scaling is a technique for standardizing the range of the independent variables (features) in a dataset. In machine learning, it is a data preprocessing step that brings all features onto a similar scale. This matters because many machine learning algorithms, including many clustering algorithms, compare observations using a distance measure such as Euclidean distance. If features are on different scales, those with larger numeric ranges dominate the distance calculation, which can degrade the performance of such algorithms.
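
To make this concrete, here is a minimal sketch using made-up numbers. The two features (age in years, income in dollars) and the three observations are hypothetical and are not taken from the course data. It compares Euclidean distances before and after rescaling each feature to [0, 1].

```python
import numpy as np

# Hypothetical observations: [age in years, income in dollars]
a = np.array([25.0, 50_000.0])
b = np.array([60.0, 51_000.0])  # very different age, similar income
c = np.array([26.0, 58_000.0])  # similar age, different income


def euclidean(u, v):
    """Euclidean distance between two 1-D arrays."""
    return float(np.sqrt(np.sum((u - v) ** 2)))


# Unscaled: the income column dominates, so the 35-year age gap barely counts
dist_ab = euclidean(a, b)
dist_ac = euclidean(a, c)

# Rescale each feature to [0, 1] using the column-wise minimum and maximum
points = np.vstack([a, b, c])
mins, maxs = points.min(axis=0), points.max(axis=0)
scaled = (points - mins) / (maxs - mins)

scaled_ab = euclidean(scaled[0], scaled[1])
scaled_ac = euclidean(scaled[0], scaled[2])

print(f"unscaled: d(a, b) = {dist_ab:.1f}, d(a, c) = {dist_ac:.1f}")
print(f"scaled:   d(a, b) = {scaled_ab:.3f}, d(a, c) = {scaled_ac:.3f}")
```

Before scaling, a looks far closer to b than to c purely because income is measured in larger numbers. After scaling, both features contribute on comparable terms.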

There are different ways to perform feature scaling, such as normalization, standardization, and Min-Max scaling.

  • Min-Max scaling rescales each feature to a given range, usually between 0 and 1;
  • Standardization rescales each feature so that it has a mean of 0 and a standard deviation of 1;
  • Normalization, as the term is used here, rescales each feature so that its minimum value is 0 and its maximum value is 1, which is Min-Max scaling with the range [0, 1].
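
The definitions above can be written out directly with NumPy. This is only an illustrative sketch on a small made-up array; the variable names and the target range (-1, 1) are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical single feature
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-Max scaling to [0, 1]: subtract the minimum, divide by the range
x_minmax = (x - x.min()) / (x.max() - x.min())

# Min-Max scaling to another range [lo, hi] (values chosen for illustration)
lo, hi = -1.0, 1.0
x_range = x_minmax * (hi - lo) + lo

# Standardization: subtract the mean, divide by the standard deviation.
# np.std uses the population standard deviation (ddof=0) by default.
x_standard = (x - x.mean()) / x.std()

print(x_minmax)
print(x_range)
print(x_standard.mean(), x_standard.std())
```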

Note that feature scaling should be applied only to the independent variables (the features), not to the dependent variable (the target).

Methods description

  • MinMaxScaler: a class in the sklearn.preprocessing module. It scales each feature to a specified range, by default between 0 and 1, by subtracting the feature's minimum and dividing by the difference between its maximum and minimum;
  • X.columns: assuming X is a DataFrame, X.columns returns its column labels;
  • MinMaxScaler.fit_transform(X): fits the scaler to the data and transforms the data in a single call. It computes the minimum and maximum of each feature and then scales the data accordingly. By default the result is a NumPy array rather than a DataFrame, so the column labels must be reattached.

Task

  1. Import the MinMaxScaler class from sklearn.preprocessing.
  2. Create an instance of MinMaxScaler.
  3. Create a new DataFrame with the scaled columns.
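
One possible way to carry out these three steps is sketched below. The DataFrame X and its column names are hypothetical stand-ins for the course data, which is not shown on this page.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler  # step 1: import the class

# Hypothetical data; the column names are illustrative only
X = pd.DataFrame({
    "age": [25, 40, 60],
    "income": [50_000, 54_000, 58_000],
})

# Step 2: create the scaler (default feature_range is (0, 1))
scaler = MinMaxScaler()

# Step 3: fit_transform returns a NumPy array by default, so wrap it in a
# new DataFrame and reuse the original column labels and index
X_scaled = pd.DataFrame(
    scaler.fit_transform(X),
    columns=X.columns,
    index=X.index,
)

print(X_scaled)
```

Each column of X_scaled now spans the range from 0 to 1, while X itself is left unchanged.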
