Learn DataFrames | Pandas
Unveiling the Power of Data Manipulation with Pandas

DataFrames

Let's start with the basics. What exactly is a DataFrame?

To recap, a pandas DataFrame is a two-dimensional, size-mutable, tabular data structure with labeled rows and columns. It is similar to a spreadsheet, an SQL table, or the data.frame in R. A DataFrame consists of a collection of Series, each of which is a one-dimensional labeled array.

You can think of a DataFrame as a group of Series objects that share the same row index, where each Series forms one named column. For example:

import pandas as pd

# Create a DataFrame from a dictionary
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Print the DataFrame
print(df)
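As a quick check of the idea that each column is a Series sharing the DataFrame's row index, the sketch below rebuilds the same df and inspects it. The variable name col_a is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Selecting a single column returns a Series
col_a = df['A']
print(isinstance(col_a, pd.Series))    # True

# The column carries the same row index as the DataFrame itself
print(col_a.index.equals(df.index))    # True

# Shape is (rows, columns); columns and index hold the labels
print(df.shape)                        # (3, 3)
print(list(df.columns))                # ['A', 'B', 'C']
print(list(df.index))                  # [0, 1, 2]
```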

The DataFrame created above has exactly three columns and three rows. In the printed output, the first number in each row is that row's index label. What if we need to access a cell in a specific position?

In pandas, loc and iloc are two indexers for accessing rows and columns of a DataFrame. Both are attributes of the DataFrame object, used with square brackets rather than called like methods, and they let you read and modify the data.

The main difference between loc and iloc is that loc selects by label, while iloc selects by integer position, counting from 0.

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Access the value at row position 1, column position 0 (prints 2)
print(df.iloc[1, 0])

# Access the value at row label 1, column label 'B' (prints 5)
print(df.loc[1, 'B'])
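With the default index 0, 1, 2, labels and positions happen to coincide, so the difference between loc and iloc is easy to miss. The sketch below assumes hypothetical row labels 'x', 'y', 'z' (not part of the lesson's data) to make the distinction visible.

```python
import pandas as pd

# Hypothetical row labels 'x', 'y', 'z' replace the default 0, 1, 2
df = pd.DataFrame(
    {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]},
    index=['x', 'y', 'z'],
)

by_label = df.loc['y', 'B']      # row labelled 'y', column labelled 'B'
by_position = df.iloc[1, 1]      # second row, second column
print(by_label, by_position)     # 5 5

# Slicing differs too: loc includes the end label, iloc excludes the end position
label_slice = df.loc['x':'y']
position_slice = df.iloc[0:1]
print(len(label_slice))          # 2
print(len(position_slice))       # 1
```

Here df.loc[1, 'B'] would raise a KeyError, because 1 is no longer a row label.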

There are many other ways to create a DataFrame, such as from a list of dictionaries, from a NumPy array, or by loading data from a file.
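A minimal sketch of these three alternatives follows. The variable names and the sample values are illustrative, and io.StringIO stands in for a real file so the example runs without anything on disk.

```python
import io

import numpy as np
import pandas as pd

# From a list of dictionaries: each dictionary becomes one row
df_records = pd.DataFrame([{'A': 1, 'B': 4}, {'A': 2, 'B': 5}])

# From a NumPy array: column names are supplied explicitly
df_array = pd.DataFrame(np.array([[1, 4], [2, 5]]), columns=['A', 'B'])

# From CSV text: io.StringIO stands in for a file path here
csv_text = "A,B\n1,4\n2,5\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# All three hold the same values under the same column names
for frame in (df_records, df_array, df_csv):
    print(frame.values.tolist(), list(frame.columns))
```

With a real file, pd.read_csv would typically be given the file's path instead of the StringIO object.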

That is enough theory for now. Let's practice these concepts!

Task


  1. Import the pandas library with the pd alias.
  2. Create a dictionary named data with the lists [1, 2, 3, 4, 5], [4, 5, 6, 7, 8], and [7, 8, 9, 10, 11] as the values for the keys A, B, and C.
  3. Create a new DataFrame from the data dictionary and assign it to a variable named df.
  4. Print the data type of df.
  5. Print the df DataFrame.
  6. Access the element at row position 2 and column position 2 using iloc, and print it.

Solution

# Import the pandas library
import pandas as pd

# Create a dictionary of lists
data = {'A': [1, 2, 3, 4, 5], 'B': [4, 5, 6, 7, 8], 'C': [7, 8, 9, 10, 11]}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)

# Print the data type of df
print(type(df))

# Print the DataFrame
print(df)

# Access and print the element at row position 2 and column position 2
print(df.iloc[2, 2])


Section 1. Chapter 2