Manipulating Strings | Preprocessing Data: Part I
Data Manipulation using pandas


Manipulating Strings

To replace a specific character in a string, use the .replace() method: pass the substring you want to replace as the first argument and the substring that should take its place as the second. For instance:

# Initial value with comma
string = "35600,0"
print(string)
# Replacing comma with dot
print(string.replace(',', '.'))

As you can see, the comma was replaced by a dot. But how do you perform the replacement for a whole column? One option is to run a loop and replace every element individually, but that is unnecessarily cumbersome.

pandas lets you apply string methods to a whole column in a single step. To do so, use the .str accessor on the column and then call the method you need. For instance, we can replace the - symbols in the address column with underscore (_) characters.

# Importing the library
import pandas as pd

# Reading the file
df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data.csv')
# Replacing - characters by _
print(df.address.str.replace('-', '_'))
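To compare the loop with the .str accessor without downloading the course file, here is a minimal sketch. The address values are made up for illustration and are not taken from the dataset; the regex=False argument is an addition that asks pandas to treat - as a literal character.

```python
import pandas as pd

# Hypothetical sample data; the real course file's contents are not shown here
df = pd.DataFrame({"address": ["12-Main-St", "7-Oak-Ave", "3-Elm-Rd"]})

# Loop-based approach: call str.replace() on every element by hand
looped = [value.replace("-", "_") for value in df["address"]]

# .str accessor approach: one call applied to the whole column
# regex=False treats '-' as a literal character, not a pattern
replaced = df["address"].str.replace("-", "_", regex=False)

print(replaced.tolist())
```

Both approaches give the same values; the accessor version returns a pandas Series, so it can be assigned straight back to the column.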

Finally, you need to convert the modified values to a new type. pandas lets you do this in a single step as well, with the .astype() method. Its argument is the type you want to convert to (int, float, str, etc.).
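The replacement and the type conversion can be chained, as in the sketch below. The price column and its values are hypothetical, chosen to mirror the "35600,0" string from the first example; they are not taken from the course dataset.

```python
import pandas as pd

# Hypothetical column of numbers stored as text with a decimal comma
df = pd.DataFrame({"price": ["35600,0", "1200,5", "89,99"]})

# Replace the comma with a dot, then convert the cleaned strings to float
df["price"] = df["price"].str.replace(",", ".", regex=False).astype(float)

# The column now holds numbers instead of strings
print(df["price"].dtype)
```

After the conversion the column supports numeric operations such as df["price"].sum(), which would not behave as arithmetic on the original strings.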


Section 1. Chapter 4