Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Ordinal Encoding | Processing Categorical Data
course content

Course Content

Data Preprocessing

Ordinal EncodingOrdinal Encoding

If one-hot encoding has transformed a categorical variable into a binary form, then ordinal encoding uses a different transformation algorithm. But let's start with what data it is used for.

Ordinal encoding is a technique to encode categorical variables into numerical values based on the order or rank of the categories. It is best used when there is a clear category ranking or order. For example, in a survey asking respondents to rate their satisfaction with a product, the options may be "Very Satisfied", "Satisfied", "Neutral", "Dissatisfied", or "Very Dissatisfied." These options can be encoded as 5, 4, 3, 2, and 1.

Ordinal encoding - is a useful method of encoding categorical data when the categories have a natural order or ranking. However, it should be used with caution, as it assumes that the distance between each category is equal, which may not always be the case. Additionally, ordinal encoding may not be suitable for algorithms that assume a linear relationship between the encoded categories, such as linear regression or neural networks.

Ordinal encoding takes into account the order in which the categorical variables are found, i.e. before using it, it is important to sort the variables from the lowest category to the highest.

Here's how to use ordinal encoding in Python:

The .fit_transform() method of the OrdinalEncoder class fits the encoder to the categorical variables and transforms them into numerical values.

Task

Read the 'controls.csv' dataset and transform the 'Education_Level' column with ordinal encoding.

Everything was clear?

Section 3. Chapter 3
toggle bottom row
course content

Course Content

Data Preprocessing

Ordinal EncodingOrdinal Encoding

If one-hot encoding has transformed a categorical variable into a binary form, then ordinal encoding uses a different transformation algorithm. But let's start with what data it is used for.

Ordinal encoding is a technique to encode categorical variables into numerical values based on the order or rank of the categories. It is best used when there is a clear category ranking or order. For example, in a survey asking respondents to rate their satisfaction with a product, the options may be "Very Satisfied", "Satisfied", "Neutral", "Dissatisfied", or "Very Dissatisfied." These options can be encoded as 5, 4, 3, 2, and 1.

Ordinal encoding - is a useful method of encoding categorical data when the categories have a natural order or ranking. However, it should be used with caution, as it assumes that the distance between each category is equal, which may not always be the case. Additionally, ordinal encoding may not be suitable for algorithms that assume a linear relationship between the encoded categories, such as linear regression or neural networks.

Ordinal encoding takes into account the order in which the categorical variables are found, i.e. before using it, it is important to sort the variables from the lowest category to the highest.

Here's how to use ordinal encoding in Python:

The .fit_transform() method of the OrdinalEncoder class fits the encoder to the categorical variables and transforms them into numerical values.

Task

Read the 'controls.csv' dataset and transform the 'Education_Level' column with ordinal encoding.

Everything was clear?

Section 3. Chapter 3
toggle bottom row
some-alt