Types of Data | Machine Learning Concepts
ML Introduction with scikit-learn

Types of Data

Each column (feature) in a training set has a data type associated with it. These data types fall into three groups: numerical, categorical, and date and/or time.
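As a minimal sketch of the three groups, the snippet below builds a tiny table with pandas and sorts its columns by data type. The column names and values are made up for illustration, and treating every column that is neither numeric nor datetime as categorical is a simplifying assumption.

```python
import pandas as pd

# Hypothetical toy training set; names and values are invented for this example.
df = pd.DataFrame({
    "price": [100, 235, 180],                  # numerical
    "color": ["blue", "orange", "blue"],       # categorical (stored as text)
    "sold_at": pd.to_datetime(                 # date and time
        ["2023-01-15", "2023-09-02", "2023-12-30"]
    ),
})

# Group the columns by the kind of data they hold.
numerical_cols = df.select_dtypes(include="number").columns.tolist()
datetime_cols = df.select_dtypes(include="datetime").columns.tolist()
# Assumption: whatever is neither numeric nor datetime is treated as categorical.
categorical_cols = df.select_dtypes(exclude=["number", "datetime"]).columns.tolist()

print(numerical_cols, categorical_cols, datetime_cols)
```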

Unfortunately, most ML algorithms only work well with numbers, so we need a way to convert categorical and datetime data to numbers.

For date and time, you can extract features such as 'year' or 'month', depending on your task. These features are already numerical values, so they need no further conversion.
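The extraction of 'year' and 'month' might look like the sketch below. It assumes the dates are stored in a pandas datetime column; the column name sold_at and the dates themselves are hypothetical.

```python
import pandas as pd

# Hypothetical datetime column; the name and values are invented for this example.
df = pd.DataFrame({
    "sold_at": pd.to_datetime(["2023-01-15", "2023-09-02", "2023-12-30"])
})

# The .dt accessor exposes the parts of each date as plain integers.
df["year"] = df["sold_at"].dt.year
df["month"] = df["sold_at"].dt.month

# Keep only the numerical features derived from the date.
features = df.drop(columns="sold_at")
print(features)
```

Which parts of the date are worth extracting depends on the task, as noted above.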
Categorical data is a little more challenging to deal with. Let's look at the types of categorical data.

Types of categorical data

  • Ordinal data is categorical data whose categories follow a natural order.
    For example, level of education (from elementary school to Ph.D.) or ratings (from very bad to very good);
  • Nominal data is categorical data whose categories follow no natural order.
    For example, name, gender, or country of origin.

As you will see in later chapters, ordinal and nominal data are converted to numerical values in different ways, which is why we need to distinguish between them.
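As a rough preview of that difference, the sketch below encodes one ordinal and one nominal column. It assumes scikit-learn's OrdinalEncoder and OneHotEncoder as the tools, which may not be exactly what the later chapters use; the columns and values are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical data: 'grade' is ordinal, 'color' is nominal.
df = pd.DataFrame({
    "grade": ["C", "A", "B"],
    "color": ["blue", "orange", "blue"],
})

# Ordinal data: the order is stated explicitly (worst to best),
# so each category maps to a single number that preserves the order.
ordinal_encoder = OrdinalEncoder(categories=[["C", "B", "A"]])
grade_encoded = ordinal_encoder.fit_transform(df[["grade"]])

# Nominal data: no order exists, so each category gets its own 0/1 column.
one_hot_encoder = OneHotEncoder()
color_encoded = one_hot_encoder.fit_transform(df[["color"]]).toarray()

print(grade_encoded.ravel())  # one number per row
print(color_encoded)          # one column per color
```

Mapping colors to single numbers such as 0 and 1 would suggest an order that does not exist, which is why the nominal column is spread over separate columns instead.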

Note

There are better ways to convert dates to numerical values, but they are beyond the scope of this introductory course.
For example, a plain 'month' feature does not capture the fact that the 12th month is actually closer to the 1st than to the 9th.
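The month problem can be made concrete with a little arithmetic. In the sketch below, both helper functions are hypothetical and exist only to compare the distance a model sees in the raw numbers with the distance on the calendar.

```python
def naive_distance(month_a: int, month_b: int) -> int:
    """Distance between two months when they are treated as plain numbers."""
    return abs(month_a - month_b)


def calendar_distance(month_a: int, month_b: int) -> int:
    """Distance between two months on a 12-month cycle (December is next to January)."""
    diff = abs(month_a - month_b)
    return min(diff, 12 - diff)


# As plain numbers, December looks far from January and close to September.
print(naive_distance(12, 1), naive_distance(12, 9))        # 11 3
# On the calendar, December is right next to January.
print(calendar_distance(12, 1), calendar_distance(12, 9))  # 1 3
```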


Match each feature with its data type.

price (100, 235) –
color (blue, orange) –
academic grades (A, B, C, and so on) –


Section 1. Chapter 4