Pandas First Steps
Adding a New Column
We've learned how to create a DataFrame. Now let's explore what we can do with it. First, we'll create a compact DataFrame consisting of 3 columns and 7 rows.
import pandas as pd

countries_data = {
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga'],
}
countries = pd.DataFrame(countries_data)
print(countries)
You can expand the DataFrame by adding new columns. The most common way to do it is as follows:
dataframe['name_of_new_column'] = [value_1, value_2, value_3]
- dataframe is the name of the existing DataFrame to which we'll add the new column;
- name_of_new_column is the name you're giving to the new column;
- value_1, value_2, value_3 are the values that will populate the new column.
Note
The name of the new column should be enclosed in quotation marks and wrapped in square brackets, such as ['NewColumnName']. The values assigned to the new column should also be within square brackets, for example, data['NewColumnName'] = [value1, value2, value3]. If the values are numeric, they can be written without quotes, like [1, 2, 3]. If the values are strings, each one should be enclosed in quotes, like ['A', 'B', 'C'].
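The note above can be sketched with a small example. The three-row DataFrame and the column names 'score' and 'grade' are hypothetical and chosen only for illustration. The sketch also shows a related point: the list needs one value per row, and pandas raises a ValueError when the lengths differ.

```python
import pandas as pd

# Hypothetical three-row DataFrame, used only for this illustration.
data = pd.DataFrame({'city': ['Riga', 'Manila', 'Valletta']})

# Numeric values are written without quotes.
data['score'] = [1, 2, 3]

# String values are each enclosed in quotes.
data['grade'] = ['A', 'B', 'C']

print(data)

# The list must hold exactly one value per row; a shorter list is rejected.
try:
    data['too_short'] = [10, 20]
except ValueError as error:
    print('Could not add column:', error)
```

Because the failed assignment raises before anything is stored, the DataFrame keeps only the 'city', 'score', and 'grade' columns.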
Now, we'll add a 'population' column to our pre-existing countries DataFrame.
import pandas as pd

countries_data = {
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga'],
}
countries = pd.DataFrame(countries_data)
countries['population'] = [61399000, 75967000, 39244, 380200, 10380491, 5496000, 2424200]
print(countries)
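One way to confirm that the column was added is to inspect the column names and the shape of the DataFrame. This is a minimal sketch that reuses the lesson's data; the checks themselves are an addition, not part of the original example.

```python
import pandas as pd

countries = pd.DataFrame({
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga'],
})
countries['population'] = [61399000, 75967000, 39244, 380200, 10380491, 5496000, 2424200]

# The new column is appended after the existing ones.
print(countries.columns.tolist())  # ['country', 'continent', 'capital', 'population']

# Seven rows, now with four columns instead of three.
print(countries.shape)  # (7, 4)
```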
You can also use dot notation (e.g., df.column) to access existing columns, but it cannot be used to create new ones. Always use square brackets (e.g., df['column']) to create a column. The following attempt uses dot notation:
import pandas as pd

countries_data = {
    'country': ['Thailand', 'Philippines', 'Monaco', 'Malta', 'Sweden', 'Paraguay', 'Latvia'],
    'continent': ['Asia', 'Asia', 'Europe', 'Europe', 'Europe', 'South America', 'Europe'],
    'capital': ['Bangkok', 'Manila', 'Monaco', 'Valletta', 'Stockholm', 'Asuncion', 'Riga'],
}
countries = pd.DataFrame(countries_data)
# Dot notation on a name that is not yet a column does NOT create the column.
countries.population = [61399000, 75967000, 39244, 380200, 10380491, 5496000, 2424200]
print(countries)
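As expected, the 'population' column was not created, since Pandas doesn't allow columns to be created using this approach. The assignment only sets an ordinary Python attribute on the DataFrame object, and Pandas emits a UserWarning, so the printed DataFrame still has its original three columns.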
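The difference between the two notations can be checked directly. The sketch below uses a shortened, two-row version of the countries data (an assumption made to keep the example brief) and records any warnings instead of printing them.

```python
import warnings

import pandas as pd

# Shortened version of the lesson's data, used only for this illustration.
countries = pd.DataFrame({
    'country': ['Monaco', 'Malta'],
    'capital': ['Monaco', 'Valletta'],
})

# Dot notation reads an existing column just like square brackets do.
same_values = countries.capital.equals(countries['capital'])
print(same_values)

# Dot notation with a new name: record warnings rather than showing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    countries.population = [39244, 380200]

created_by_dot = 'population' in countries.columns
print(created_by_dot)  # False: no column was added
print(len(caught))     # number of warnings recorded during the assignment

# Square brackets create the column.
countries['population'] = [39244, 380200]
created_by_brackets = 'population' in countries.columns
print(created_by_brackets)  # True
```

Checking membership in countries.columns is a more reliable test than reading countries.population, because the attribute set by the failed attempt still exists on the object.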