Data Preprocessing
Data Types
The main tool we will use to manipulate data is pandas. We can start right away by loading the data:
![](https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/processed_images/img_1/4.png)
A dataset can contain many different data types, for example numeric (integers, floating point numbers), strings (str), and datetime. To find out the data type of each column, use the .dtypes property of the DataFrame:
![](https://codefinity-content-media.s3.eu-west-1.amazonaws.com/9c23bf60-276c-4989-a9d7-3091716b4507/processed_images/img_1/2.png)
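Because the code in this lesson is shown only as images, here is a minimal sketch of loading data and inspecting column types. The column names and values are made up for illustration (loosely modeled on the penguins dataset used in the task), and an in-memory string stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk
csv_text = """species,body_mass_g,flipper_length_mm,year
Adelie,3750.0,181,2007
Gentoo,5000.5,217,2008
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# .dtypes on a DataFrame lists the data type of every column
print(df.dtypes)

# .dtype on a single column (a Series) gives that column's type
print(df["body_mass_g"].dtype)
```

With a real file, the same call would take a path such as `pd.read_csv("penguins.csv")`.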
Suppose a column holds numeric values stored as strings and you want to convert it to a numeric type. To do this, use the .astype() method.
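As a small illustration of the conversion described above, the sketch below builds a DataFrame whose numeric values are stored as strings and converts the column with .astype(). The column name and values are hypothetical, chosen only for the example.

```python
import pandas as pd

# Hypothetical column: numbers stored as strings
df = pd.DataFrame({"flipper_length_mm": ["181", "186", "195"]})

# .astype() returns a new Series, so assign the result back to the column
df["flipper_length_mm"] = df["flipper_length_mm"].astype("int64")

print(df.dtypes)
```

Note that .astype() raises an error if a value cannot be converted, for example a non-numeric string or a missing value when casting to an integer type, so such values need to be handled first.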
Task
Read the penguins.csv dataset and change the data type of the body_mass_g column from float to int.
Don't modify the initial code; only replace the gaps ___ with the correct code.
Once you've completed this task, click the button below the code to check your solution.