Summary  
This chapter covers creating and manipulating tabular data in pandas DataFrames, including generating summary statistics with describe() and filtering rows based on conditional queries.

General domain of usage  
Journalism

Tabular data is at the heart of many impactful journalism stories. As a journalist, you often encounter structured tables when working with government data, Freedom of Information Act (FOIA) releases, public salary disclosures, or datasets from research organizations. Tables are a popular format because they organize information into rows and columns, making it easier to compare, analyze, and spot patterns or anomalies that could lead to newsworthy stories.

```python
import pandas as pd

# Create a DataFrame of public official salaries
data = {
    "Name": ["Alex Kim", "Jordan Lee", "Morgan Patel", "Taylor Smith", "Casey Jones"],
    "Position": ["Mayor", "City Clerk", "Fire Chief", "Police Chief", "Treasurer"],
    "Salary": [120000, 75000, 98000, 105000, 87000]
}

salaries_df = pd.DataFrame(data)

# Display summary statistics for the Salary column
salary_stats = salaries_df["Salary"].describe()
print("Salary Summary Statistics:")
print(salary_stats)
```

In the code above, you use the `pandas` library to create a DataFrame from a dictionary containing the names, job positions, and salaries of public officials. The `pd.DataFrame()` function turns the dictionary into a structured table in which each key becomes a column. Once your data is in a DataFrame, you can call the `.describe()` method on the `Salary` column to generate summary statistics: the count, mean, standard deviation, minimum, and maximum, as well as the 25th, 50th (median), and 75th percentiles. For journalists, these statistics are a quick way to spot **outliers**, such as an unusually high salary, and to identify **typical values**, like the average pay for city officials. This rapid overview helps you decide where to dig deeper in your reporting.
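Because `.describe()` returns a pandas Series, each statistic can be looked up by its label and reused in further checks. The sketch below reuses the same illustrative salary figures and applies the 1.5 × IQR rule of thumb to flag possible outliers. That rule is a common convention brought in here for illustration, not something this chapter prescribes, and the variable names are hypothetical.

```python
import pandas as pd

# Same illustrative salary figures as the example above
salaries = pd.Series([120000, 75000, 98000, 105000, 87000], name="Salary")

stats = salaries.describe()

# describe() returns a Series, so each statistic is available by its label
mean_salary = stats["mean"]
median_salary = stats["50%"]

# Assumption: use the 1.5 x IQR rule of thumb to define "unusual" values
iqr = stats["75%"] - stats["25%"]
lower_fence = stats["25%"] - 1.5 * iqr
upper_fence = stats["75%"] + 1.5 * iqr

# Keep only the salaries that fall outside the fences
possible_outliers = salaries[(salaries < lower_fence) | (salaries > upper_fence)]

print(f"Mean salary: {mean_salary:,.0f}")
print(f"Median salary: {median_salary:,.0f}")
print(f"Possible outliers: {len(possible_outliers)}")
```

With only five salaries, none falls outside the fences, so nothing is flagged. On a larger real-world dataset the same check can surface records worth a closer look, though a flagged value is a lead to verify and not a finding in itself.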

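```python
# Filter officials with salaries above $90,000
high_earners = salaries_df[salaries_df["Salary"] > 90000]
print("Officials earning above $90,000:")
print(high_earners)
```

Filtering narrows the table to the rows that meet a condition. The expression `salaries_df["Salary"] > 90000` produces a `True` or `False` value for each row, and placing it inside the square brackets keeps only the rows marked `True`.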
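Real reporting questions often involve more than one condition. As a minimal sketch, the example below rebuilds the same illustrative table, combines two conditions with `&`, sorts the result, and shows the equivalent `.query()` form of the single-condition filter. The thresholds and variable names are chosen only for illustration.

```python
import pandas as pd

# Rebuild the illustrative salary table so this example runs on its own
salaries_df = pd.DataFrame({
    "Name": ["Alex Kim", "Jordan Lee", "Morgan Patel", "Taylor Smith", "Casey Jones"],
    "Position": ["Mayor", "City Clerk", "Fire Chief", "Police Chief", "Treasurer"],
    "Salary": [120000, 75000, 98000, 105000, 87000],
})

# Wrap each condition in parentheses and combine with & (and) or | (or)
mid_range = salaries_df[
    (salaries_df["Salary"] > 80000) & (salaries_df["Salary"] < 110000)
]

# Sort the filtered rows from highest to lowest salary
ranked = mid_range.sort_values("Salary", ascending=False)
print(ranked)

# The same "above $90,000" filter written as a query string
high_earners = salaries_df.query("Salary > 90000")
print(high_earners)
```

The parentheses around each condition matter: `&` binds more tightly than `>` in Python, so leaving them out raises an error or gives an unintended result.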

Working with Tabular Data

1. What function in pandas is used to create a DataFrame from a dictionary?

2. How can filtering data help journalists find newsworthy stories?

3. Fill in the blank: To display the first 10 rows of a DataFrame, use _____