Replacing Specific Elements
The next step is to replace the dots. This task is a bit harder than the previous one, since you will replace only specific elements rather than a whole column.
First, let's recall how to select specific rows and columns based on a condition. This is done with the .loc[] indexer. The first argument is either row labels or a Boolean condition; the second is the column name or names. For instance, let's get the rows whose value in the 'morgh' column is a single dot character ('.').
    # Importing the library
    import pandas as pd

    # Reading the file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data1.csv')

    # Output only dot values within the 'morgh' column
    print(df.loc[df.morgh == '.', 'morgh'])
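To see the two arguments of .loc[] in isolation, here is a minimal sketch on a small in-memory DataFrame. The values and the 'faminc' column are made up for illustration and are not taken from the course file; only the 'morgh' column name and the dot placeholder come from the lesson.

```python
import pandas as pd

# Hypothetical toy data standing in for the course file; '.' marks a missing value
df = pd.DataFrame({
    'morgh': ['145', '.', '225', '.'],
    'faminc': [70, 50, 40, 30],  # illustrative column, assumed for this sketch
})

# Boolean condition: True for rows where 'morgh' is exactly a dot
mask = df['morgh'] == '.'

# First argument selects rows, second selects the column
dots = df.loc[mask, 'morgh']
print(dots)

# Passing a list of column names returns a DataFrame instead of a Series
subset = df.loc[mask, ['morgh', 'faminc']]
print(subset)
```

Storing the condition in a variable such as mask is optional, but it keeps the selection readable when the same condition is reused for the replacement later.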
Now that we can access the necessary rows, we can replace their values by assignment. We are going to replace all the dots with missing values (nan from NumPy) and then convert the resulting column to the float type. NaN is itself a floating-point value, so a column that contains it cannot use the regular int type.
    # Importing libraries
    import pandas as pd
    import numpy as np

    # Reading the file
    df = pd.read_csv('https://codefinity-content-media.s3.eu-west-1.amazonaws.com/f2947b09-5f0d-4ad9-992f-ec0b87cd4b3f/data1.csv')

    # Perform a replacement
    df.loc[df.morgh == '.', 'morgh'] = np.nan

    # Converting
    df.morgh = df.morgh.astype(float)
    print(df.morgh)
As you can see, the column now has the float type, which means you can apply numerical methods to it, such as mean, min, and max.
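As a rough illustration of what the conversion makes possible, the sketch below repeats the replacement on a small made-up column and then calls a few numerical methods. The data is hypothetical; only the 'morgh' name and the dot placeholder come from the lesson.

```python
import numpy as np
import pandas as pd

# Hypothetical toy column; '.' marks a missing value
df = pd.DataFrame({'morgh': ['145', '.', '225', '.']})

# Replace the dots with NaN, then convert the column to float
df.loc[df['morgh'] == '.', 'morgh'] = np.nan
df['morgh'] = df['morgh'].astype(float)

# The column is numeric now, so aggregations are available
print(df['morgh'].dtype)
print(df['morgh'].mean())
print(df['morgh'].min(), df['morgh'].max())

# Count how many values are missing after the replacement
print(df['morgh'].isna().sum())
```

By default these aggregations skip NaN values, so the mean here is computed from the two remaining numbers only.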