Learn Data Types | Data Exploration
Preprocessing Data

Course Contents

1. Data Exploration
2. Data Cleaning
3. Data Validation
4. Normalization & Standardization
5. Data Encoding

Data Types

Let's talk about the types of data that a dataframe may contain.

Numerical

Numerical data is represented by int or float values. In a dataframe, it should be stored with the int64 or float64 data type. Use data.info() to check the data type of each column.
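
As a minimal sketch of that check, the snippet below builds a small dataframe and inspects its column types. The column names and values are hypothetical stand-ins, not the actual titanic data.

```python
import pandas as pd

# Hypothetical stand-in for a few columns of the course dataset
data = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0],
    "Pclass": [3, 1, 3],
    "Sex": ["male", "female", "female"],
})

data.info()          # prints column names, non-null counts, and data types
print(data.dtypes)   # only the data type of each column
```

Here Age is stored as float64 and Pclass as int64, while Sex is stored with a non-numerical type.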

Note that some columns in the dataframe may contain numerical values but be stored using some other data type (object or str). You have to convert them to int64 or float64, and we'll explore how to do that later.
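
For a first impression of what such a column looks like, here is a hedged sketch. The Fare column and its values are made up for illustration, and pd.to_numeric is only one possible way to convert; the course may use a different approach later.

```python
import pandas as pd

# Hypothetical column: numbers that were read in as text (object dtype)
data = pd.DataFrame({
    "Fare": pd.Series(["7.25", "71.28", "n/a"], dtype="object"),
})
print(pd.api.types.is_numeric_dtype(data["Fare"]))  # False

# errors="coerce" turns entries that cannot be parsed into NaN
# instead of raising an error
data["Fare"] = pd.to_numeric(data["Fare"], errors="coerce")
print(data["Fare"].dtype)  # float64
```

The unparseable entry "n/a" becomes a missing value, which is why the result is float64 rather than int64.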

Categorical

Categorical data has no inherent numerical representation: each value is an item from a list of groups or categories. For example, the column Sex has the values Male or Female, and the column Season has the values Spring, Summer, Fall, and Winter. Such data requires special conversion and preprocessing. It is typically stored with the data types object, bool, or str.
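
The sketch below shows one way to spot such columns and look at their categories. The dataframe is a hypothetical example built from the Sex and Season columns mentioned above, with an Age column added for contrast.

```python
import pandas as pd

# Hypothetical example data
data = pd.DataFrame({
    "Sex": ["Male", "Female", "Female", "Male"],
    "Season": ["Spring", "Summer", "Fall", "Winter"],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

# Columns whose data type is not numerical are candidates for categorical handling
cat_candidates = data.select_dtypes(exclude="number").columns.tolist()
print(cat_candidates)

# The distinct categories of a column and how often each occurs
print(data["Sex"].value_counts())
print(data["Season"].nunique())
```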

Fortunately, the titanic dataset already stores its numerical data as int64 and float64.

Task

Let's divide the columns into numerical and categorical. Create num_cols as a numpy array containing the columns of types int and float. Let cat_cols be all the other features, that is, every column not in num_cols.
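
One possible way to approach this task is sketched below. It is not the course's reference solution, and the small dataframe is a hypothetical stand-in; in the exercise, data would be the titanic dataframe.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic dataframe
data = pd.DataFrame({
    "Survived": [0, 1, 1],
    "Age": [22.0, 38.0, 26.0],
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "S"],
})

# Numerical columns: names of the int64 and float64 columns, as a numpy array
num_cols = data.select_dtypes(include=["int64", "float64"]).columns.to_numpy()

# Categorical columns: every remaining column, kept in the original order
num_set = set(num_cols)
cat_cols = np.array([col for col in data.columns if col not in num_set])

print(num_cols)
print(cat_cols)
```

Selecting by data type rather than listing column names by hand keeps the split correct if columns are added or removed.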

Section 1. Chapter 3