Data Types
Let's look at the types of data a DataFrame may contain.
Numerical
Numerical data is represented by int or float values. In a DataFrame, it should be stored with the int64 or float64 data type. Use data.info() to check the data type of each column.
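As a minimal sketch of this check, the snippet below builds a small DataFrame and inspects its column types. The column names and values are made up for illustration and are not taken from the course dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset
data = pd.DataFrame({
    "Age": [22, 38, 26],
    "Fare": [7.25, 71.28, 7.92],
    "Name": ["Braund", "Cumings", "Heikkinen"],
})

# Prints each column with its non-null count and data type
data.info()

# The same type information, available as a pandas Series
dtypes = data.dtypes
print(dtypes)
```

Here Age is reported as int64 and Fare as float64, while the text column Name gets a non-numerical type.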
Note that some columns may contain numerical values but be stored with another data type (object or str). You have to convert them to int64 or float64; we'll explore how to do that later.
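As a brief preview, one common way to do such a conversion is pandas' to_numeric function. This is only a sketch under assumptions: the columns Price and Rating are hypothetical, and the course may use a different approach later.

```python
import pandas as pd

# Hypothetical columns whose numbers were read in as text
data = pd.DataFrame({
    "Price": ["10", "25", "7"],
    "Rating": ["4.5", "n/a", "3.0"],
})

# Clean integer strings are parsed into int64
data["Price"] = pd.to_numeric(data["Price"])

# errors="coerce" turns unparseable entries such as "n/a" into NaN,
# so the resulting column is float64
data["Rating"] = pd.to_numeric(data["Rating"], errors="coerce")

print(data.dtypes)
```

Using errors="coerce" is a design choice: it keeps the conversion from failing on dirty values, at the cost of introducing missing values that have to be handled afterwards.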
Categorical
Categorical data has no inherent numerical representation; each value is one item from a fixed list of groups or categories. For example, the column Sex has the values Male or Female, and the column Season has the values Spring, Summer, Fall, and Winter. Such data requires special conversion and preprocessing. It is typically stored with the object, bool, or str data types.
Fortunately, the titanic dataset already stores its numerical data as int64 and float64.
Task
Let's divide the columns into numerical and categorical. Create num_cols as a NumPy array containing the names of the columns with int and float types. Let cat_cols be all the other columns, that is, every feature not in num_cols.
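One possible way to approach this split is sketched below with pandas' select_dtypes. The DataFrame here is a hypothetical stand-in, not the actual titanic data, and the lesson's own solution may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataset; real column names may differ
data = pd.DataFrame({
    "Age": [22, 38, 26],
    "Fare": [7.25, 71.28, 7.92],
    "Sex": ["male", "female", "female"],
    "Survived": [False, True, True],
})

# Names of the columns stored as int64 or float64, as a NumPy array
num_cols = data.select_dtypes(include=["int64", "float64"]).columns.to_numpy()

# Every remaining column is treated as categorical
cat_cols = data.columns[~data.columns.isin(num_cols)].to_numpy()

print(num_cols)
print(cat_cols)
```

Note that the bool column Survived ends up in cat_cols, which matches the convention above of treating bool as a categorical type.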