Learn Data Types | Brief Introduction
Data Preprocessing
1. Brief Introduction
2. Processing Quantitative Data
3. Processing Categorical Data
4. Time Series Data Processing
5. Feature Engineering
6. Moving on to Tasks

Data Types

The main tool we will use to manipulate data is pandas. We can start right away by loading the data:

import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')
print(df.head())

A dataset can contain several data types: numeric (integers and floating-point numbers), strings (str), and datetime. To see the data type of every column, use the DataFrame's .dtypes attribute:

import pandas as pd

df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/datasets/penguins.csv')
print(df.dtypes)

Suppose a column holds numeric values stored as strings and you want to convert it to a numeric type. The .astype() method performs this conversion.
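As a minimal sketch of the string-to-numeric case, the snippet below builds a small hypothetical DataFrame (the values are invented for illustration and are not taken from penguins.csv) and converts one column with .astype():

```python
import pandas as pd

# Hypothetical data: numeric values stored as strings (not from penguins.csv)
df = pd.DataFrame({"body_mass_g": ["3750", "3800", "3250"]})
print(df.dtypes)  # the column is reported as a text type

# Convert the string column to floating-point numbers
df["body_mass_g"] = df["body_mass_g"].astype(float)
print(df.dtypes)  # the column is now a float type
```

If a column might contain text that cannot be parsed as a number, .astype(float) raises an error; pd.to_numeric(df["body_mass_g"], errors="coerce") is one alternative that turns such entries into missing values instead.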

Task

Read the penguins.csv dataset and change the data type in the body_mass_g column from float to int.
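The same kind of float-to-int conversion can be sketched on a small stand-in DataFrame. The values below are hypothetical and only reuse the body_mass_g column name; this is an illustration of .astype(int), not the course's reference solution.

```python
import pandas as pd

# Hypothetical stand-in for the body_mass_g column (values invented)
df = pd.DataFrame({"body_mass_g": [3750.0, 3800.5, 3250.9]})

# astype(int) drops the fractional part rather than rounding
df["body_mass_g"] = df["body_mass_g"].astype(int)
print(df["body_mass_g"].tolist())
print(df.dtypes)
```

Note that .astype(int) raises an error when the column contains missing values (NaN), so a real dataset may need those rows filled or dropped first.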

