Learn Introduction to pandas for Automation | Data Manipulation and Analysis for Automation


When you work as an automation engineer, you often need to process and analyze large amounts of structured data, such as tables of sensor readings, device logs, or batch test results. The pandas library is a Python tool designed for exactly this kind of work. It lets you load, manipulate, analyze, and summarize data, which makes it a staple for anyone automating data-driven workflows.

Pandas excels at handling tabular data, meaning data organized into rows and columns, similar to spreadsheets or SQL tables. Its primary data structure, the DataFrame, lets you store and operate on labeled data efficiently. This is especially useful in automation, where you might want to process logs, summarize device performance, or transform sensor outputs with minimal manual effort. With pandas, you can filter, group, aggregate, and plot your data in a few lines of code, which saves time and reduces errors compared to manual methods.


import pandas as pd

# Create a DataFrame with device measurements
data = {
    "device_id": ["A1", "A2", "A3", "A4"],
    "temperature": [72.5, 75.0, 71.2, 73.8],
    "pressure": [101.2, 100.8, 101.5, 100.9]
}

df = pd.DataFrame(data)
print(df)

Running this code prints a DataFrame, the core data structure in pandas. A DataFrame is like a table: each column has a name (such as "device_id", "temperature", or "pressure"), and each row represents a measurement from a device. You can access data by column, by row, or by conditions you specify.
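As a minimal sketch of row and condition-based access, the snippet below rebuilds the same sample DataFrame so it runs on its own. The 73.0 threshold and the variable names are illustrative assumptions, not part of the lesson.

```python
import pandas as pd

# Same sample data as the lesson's example
df = pd.DataFrame({
    "device_id": ["A1", "A2", "A3", "A4"],
    "temperature": [72.5, 75.0, 71.2, 73.8],
    "pressure": [101.2, 100.8, 101.5, 100.9],
})

# Access a single row by its integer position
first_row = df.iloc[0]
print(first_row["device_id"])  # A1

# Keep only the rows that satisfy a condition
# (73.0 is an arbitrary threshold chosen for illustration)
hot = df[df["temperature"] > 73.0]
print(hot)

# Combine a condition with a column selection using .loc
hot_ids = df.loc[df["temperature"] > 73.0, "device_id"].tolist()
print(hot_ids)  # ['A2', 'A4']
```

Here `iloc` selects by position, while `loc` takes a boolean mask (or row labels) together with an optional column name.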

Columns in a DataFrame are similar to keys in a dictionary, and you can select them using square brackets. For example, df["temperature"] gives you all the temperature readings. This structure makes it easy to perform operations on specific columns, such as calculating averages or finding maximum values, without writing explicit loops.


# Select the "temperature" column and calculate its mean value
mean_temp = df["temperature"].mean()
print("Average temperature:", mean_temp)
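The same pattern extends to other column operations, such as finding a maximum. The sketch below is self-contained and reuses the lesson's sample data; the variable names (`max_temp`, `hottest_device`, `summary`) are illustrative assumptions.

```python
import pandas as pd

# Same sample data as the lesson's example
df = pd.DataFrame({
    "device_id": ["A1", "A2", "A3", "A4"],
    "temperature": [72.5, 75.0, 71.2, 73.8],
    "pressure": [101.2, 100.8, 101.5, 100.9],
})

# Highest temperature reading in the column
max_temp = df["temperature"].max()
print("Maximum temperature:", max_temp)  # 75.0

# idxmax() returns the row label of the maximum,
# which .loc then uses to look up the matching device
hottest_device = df.loc[df["temperature"].idxmax(), "device_id"]
print("Hottest device:", hottest_device)  # A2

# Compute several statistics for several columns in one call
summary = df[["temperature", "pressure"]].agg(["mean", "max"])
print(summary)
```

`agg` returns a small DataFrame with one row per statistic and one column per measurement, which can be convenient when a report needs several summaries at once.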