Python for QA Engineers

Using Pandas for Test Data Manipulation

Pandas is a Python library for working with structured data, especially tables like those found in spreadsheets or databases. In QA engineering, you often need to analyze test results, summarize outcomes, or filter specific test cases from large datasets. Pandas provides the DataFrame, a two-dimensional, labeled data structure that is well suited to organizing, manipulating, and analyzing tabular test data. With DataFrames, you can quickly load, inspect, and transform your test results to answer key QA questions.

```python
import pandas as pd

# Sample test case data as a list of dictionaries
test_cases = [
    {"id": 1, "status": "PASS", "duration": 2.5},
    {"id": 2, "status": "FAIL", "duration": 3.1},
    {"id": 3, "status": "PASS", "duration": 1.8},
    {"id": 4, "status": "FAIL", "duration": 2.9},
    {"id": 5, "status": "PASS", "duration": 2.2}
]

# Create a DataFrame from the list of dictionaries
df = pd.DataFrame(test_cases)
print(df)
```
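Before transforming test results, it usually helps to inspect what was loaded. The following is a minimal sketch of that inspection step, reusing the same sample data; the variable names `n_rows`, `n_cols`, and `status_counts` are illustrative choices, not part of the lesson.

```python
import pandas as pd

# Same sample data as in the lesson, repeated so this sketch runs on its own
test_cases = [
    {"id": 1, "status": "PASS", "duration": 2.5},
    {"id": 2, "status": "FAIL", "duration": 3.1},
    {"id": 3, "status": "PASS", "duration": 1.8},
    {"id": 4, "status": "FAIL", "duration": 2.9},
    {"id": 5, "status": "PASS", "duration": 2.2},
]
df = pd.DataFrame(test_cases)

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print("Rows:", n_rows, "Columns:", n_cols)

# dtypes shows the inferred type of each column
print(df.dtypes)

# head(n) returns the first n rows, handy for a quick look at large result sets
print(df.head(3))

# value_counts() tallies how many tests ended in each status
status_counts = df["status"].value_counts()
print(status_counts)
```

Checking `dtypes` early is useful because a duration column read from a file may arrive as text rather than numbers, which would make later averages fail.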

Once your test data is in a pandas DataFrame, you can use built-in methods to filter, sort, and summarize information. Filtering allows you to focus on tests with a certain status, such as only the failed ones. Sorting helps you find the fastest or slowest test cases by duration. Summarizing, such as calculating averages, reveals trends like the typical duration of passing tests. These operations are essential for QA engineers who need to quickly identify issues or monitor test suite performance.

```python
# Continues from the previous example, where df was created

# Filter the DataFrame to show only failed tests
failed_tests = df[df['status'] == 'FAIL']
print("Failed tests:")
print(failed_tests)

# Calculate the average duration of passing tests
passing_tests = df[df['status'] == 'PASS']
average_duration = passing_tests['duration'].mean()
print("Average duration of passing tests:", average_duration)
```
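The text above also mentions sorting, which the example does not show. One possible way to sort by duration and to summarize each status in a single step is sketched below, using `sort_values` and `groupby`; the names `slowest_first` and `summary` are illustrative, and the sample data is repeated so the block runs on its own.

```python
import pandas as pd

# Same sample data as in the lesson
test_cases = [
    {"id": 1, "status": "PASS", "duration": 2.5},
    {"id": 2, "status": "FAIL", "duration": 3.1},
    {"id": 3, "status": "PASS", "duration": 1.8},
    {"id": 4, "status": "FAIL", "duration": 2.9},
    {"id": 5, "status": "PASS", "duration": 2.2},
]
df = pd.DataFrame(test_cases)

# Sort by duration, slowest test first; sort_values returns a new DataFrame
slowest_first = df.sort_values("duration", ascending=False)
print("Slowest tests first:")
print(slowest_first)

# Group by status and compute several duration statistics per group
summary = df.groupby("status")["duration"].agg(["count", "mean", "max"])
print("Duration summary per status:")
print(summary)
```

Grouping avoids writing a separate filter for each status, which becomes more convenient once results include additional values such as skipped or errored tests.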

1. What is a pandas DataFrame?

2. How can you filter a DataFrame to show only failed tests?

3. Fill in the blank: df[df['status'] == 'FAIL'] returns _____


Section 2. Chapter 2
