**Categorical data** is data that represents qualitative or descriptive characteristics, and it is often non-numeric. Examples include car brands, professions, and education levels. But then, what is the difference between plain text data and categorical data? Categorical data is structured: each value belongs to one of a set of discrete categories. Text data is unstructured and requires additional preprocessing steps to extract relevant information. That is why, for example, the names of people in a dataset of user resumes are text data, not categorical data.
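A rough way to see this difference is to count how often values repeat in a column. The sketch below uses made-up resume records and a hypothetical helper, `unique_ratio`. Neither comes from a real dataset or library, and the ratio is only an informal hint, not a rule for deciding whether a column is categorical.

```python
from collections import Counter

# Hypothetical resume records, invented for illustration only.
resumes = [
    {"name": "Alice Johnson", "profession": "Engineer"},
    {"name": "Bob Smith", "profession": "Teacher"},
    {"name": "Carol Diaz", "profession": "Engineer"},
    {"name": "Dan Brown", "profession": "Doctor"},
    {"name": "Eve Adams", "profession": "Teacher"},
    {"name": "Frank Moore", "profession": "Engineer"},
]


def unique_ratio(records, column):
    """Return the share of distinct values among all values in a column."""
    values = [record[column] for record in records]
    return len(set(values)) / len(values)


# Professions repeat, so they fall into a small set of discrete categories.
profession_counts = Counter(record["profession"] for record in resumes)
profession_ratio = unique_ratio(resumes, "profession")

# Names are (almost) all different, so they behave like free text.
name_ratio = unique_ratio(resumes, "name")

print(profession_counts)
print(f"profession: {profession_ratio}, name: {name_ratio}")
```

In this toy data, every name is unique while the professions fall into three repeating groups, which matches the intuition that professions are categories and names are text.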

First of all, let's find out why we need to encode categorical data. Most machine learning algorithms require **numeric input data** to perform their computations, so categorical data must be transformed into a numerical representation before it can be used.

There are many data encoding methods, including label encoding, one-hot encoding, binary encoding, and target encoding. We will discuss the differences between them in the following chapters.
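To make two of these methods concrete, here is a minimal from-scratch sketch of label encoding and one-hot encoding in plain Python. The `brands` list is made-up sample data, and all variable names are illustrative assumptions. In practice you would more likely use a library such as pandas or scikit-learn than write this by hand.

```python
# Hypothetical sample column of car brands (illustrative data only).
brands = ["Toyota", "BMW", "Ford", "BMW", "Toyota"]

# Collect the distinct categories in a fixed (alphabetical) order.
categories = sorted(set(brands))

# Label encoding: map each category to a single integer.
label_map = {category: index for index, category in enumerate(categories)}
label_encoded = [label_map[brand] for brand in brands]

# One-hot encoding: one binary column per category,
# with a 1 in the column that matches the value and 0 elsewhere.
one_hot_encoded = [
    [1 if brand == category else 0 for category in categories]
    for brand in brands
]

print(categories)       # ['BMW', 'Ford', 'Toyota']
print(label_encoded)    # [2, 0, 1, 0, 2]
for brand, row in zip(brands, one_hot_encoded):
    print(brand, row)
```

Label encoding keeps a single column but introduces an artificial order (here BMW < Ford < Toyota), which some models may interpret as meaningful. One-hot encoding avoids that implied order at the cost of one extra column per category.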

You can see the difference between one-hot encoding and label encoding in the images below:



Methods for Encoding the Categorical Data