Replacing Specific Elements
The next step is to replace the dots. This task is a bit harder than the previous one, because only specific elements need to be replaced.
First, let's recall how to select specific rows and columns based on a condition. This is done with the .loc[] property. Its first argument is either row labels (row numbers, for the default index) or a condition; the second is one or more column names. For instance, let's get the rows of the 'morgh' column that contain only the dot character (.).
    # Importing the library
    import pandas as pd

    # Reading the file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data1.csv')

    # Output only dot values within the 'morgh' column
    print(df.loc[df.morgh == '.', 'morgh'])
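To see both forms of the first argument side by side without downloading the file, here is a minimal sketch on a small made-up DataFrame. The name toy, its values, and the 'price' column are assumptions for illustration; only the 'morgh' column name comes from the lesson.

```python
import pandas as pd

# Hypothetical toy data standing in for the course file
toy = pd.DataFrame({
    "morgh": ["1", ".", "0", "."],
    "price": [10, 20, 30, 40],
})

# Boolean condition: True for rows whose 'morgh' value is exactly '.'
is_dot = toy["morgh"] == "."

# Rows chosen by the condition, a single column chosen by name
dots_only = toy.loc[is_dot, "morgh"]

# Rows chosen by index label, several columns chosen by a list of names
first_two = toy.loc[[0, 1], ["morgh", "price"]]

print(dots_only)
print(first_two)
```

Passing a single column name returns a Series, while passing a list of names returns a DataFrame.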
Now that we can access the necessary rows, we can replace their values by assignment. We are going to replace all the dots with NA values (nan from NumPy) and then convert the resulting column to the float type (NaN is itself a floating-point value, so a column that contains it cannot use the plain int type).
    # Importing libraries
    import pandas as pd
    import numpy as np

    # Reading the file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data1.csv')

    # Perform a replacement
    df.loc[df.morgh == '.', 'morgh'] = np.nan

    # Converting
    df.morgh = df.morgh.astype(float)
    print(df.morgh)
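The same two steps can be checked on a small made-up column, which makes it easy to confirm that the dots became missing values and that the type changed. The toy data is an assumption for illustration. The last lines show pd.to_numeric with errors="coerce", a different technique from the one in the lesson: it may give a similar result in one call, but it turns every non-numeric string into NaN, not only dots.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: numbers stored as text, with '.' marking gaps
toy = pd.DataFrame({"morgh": ["1", ".", "0", "."]})

# Step 1: replace only the cells that hold a dot
toy.loc[toy["morgh"] == ".", "morgh"] = np.nan

# Step 2: convert the remaining text values to floats
toy["morgh"] = toy["morgh"].astype(float)

print(toy["morgh"].dtype)
print(toy["morgh"].isna().sum())

# Alternative (not the lesson's method): coerce anything non-numeric to NaN
raw = pd.Series(["1", ".", "0", "."], name="morgh")
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.isna().sum())
```

The explicit .loc assignment is the safer choice when only one known placeholder should be treated as missing, because any other unexpected text will then raise an error during the conversion instead of silently disappearing.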
As you can see, the column now has the float type, which means you can apply numerical methods to it (e.g., mean, min, max).
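As a brief illustration of those numerical methods, the sketch below uses a made-up float Series with missing values (the numbers are assumptions, not taken from the course file). By default pandas skips NaN in these calculations.

```python
import numpy as np
import pandas as pd

# Hypothetical float column with two missing values
values = pd.Series([1.0, np.nan, 0.0, np.nan, 1.0], name="morgh")

# NaN entries are skipped by default (skipna=True)
print(values.mean())
print(values.min(), values.max())

# count() reports how many non-missing values were used
print(values.count())

# With skipna=False a single NaN makes the result NaN
print(values.mean(skipna=False))
```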