Notice: This page requires JavaScript to function properly.
Please enable JavaScript in your browser settings or update your browser.
Learn Introduction to Data in Journalism | Data Collection and Cleaning for Journalists
Python for Journalists and Media

bookIntroduction to Data in Journalism

Data-driven journalism is transforming how newsrooms investigate, report, and present stories. By leveraging data, you can uncover trends, highlight disparities, and provide evidence-based insights that go beyond anecdotes or isolated events. Impactful reporting such as uncovering government spending irregularities, analyzing election results, or tracking the spread of diseases often relies on analyzing large datasets. Major news outlets have used data to reveal systemic issues in criminal justice, health care, and climate change by presenting interactive graphics and in-depth analysis.

Python has become an essential tool in the modern newsroom. Its versatility allows you to automate data collection, clean messy information, and analyze patterns efficiently. With Python, even those without a computer science background can quickly learn to handle data, making it possible to break complex stories and support investigative reporting with solid evidence.

123456789101112131415161718
# Define a list of dictionaries representing news headlines headlines = [ {"title": "Election Results Announced in Major Cities"}, {"title": "Local Schools Receive New Funding"}, {"title": "Wildfire Threatens Rural Communities"}, {"title": "Election Debate Draws Record Viewership"}, {"title": "Scientists Discover New Species"}, {"title": "City Council Approves Affordable Housing Plan"}, {"title": "Election Officials Prepare for High Turnout"}, {"title": "Sports Team Wins Championship"}, {"title": "Community Rallies for Flood Relief"}, {"title": "Election Polls Show Tight Race"} ] # Print the first five headlines for headline in headlines[:5]: print(headline["title"])
copy

By working with in-memory data structures like lists and dictionaries, you can quickly access, filter, and review large amounts of information. In the example above, the headlines list allows you to store multiple news items and instantly retrieve or process specific entries. This ability to handle data efficiently is especially valuable when deadlines are tight and stories are developing rapidly.

123456789
# Count the number of headlines containing the word 'election' keyword = "election" count = 0 for headline in headlines: if keyword.lower() in headline["title"].lower(): count += 1 print(f"Number of headlines containing '{keyword}':", count)
copy

1. What is one benefit of using Python for data-driven journalism?

2. Which Python library is commonly used for working with tabular data in memory?

3. Why is it important for journalists to be able to process large datasets efficiently?

question mark

What is one benefit of using Python for data-driven journalism?

Select the correct answer

question mark

Which Python library is commonly used for working with tabular data in memory?

Select the correct answer

question mark

Why is it important for journalists to be able to process large datasets efficiently?

Select the correct answer

Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 1. ChapterΒ 1

Ask AI

expand

Ask AI

ChatGPT

Ask anything or try one of the suggested questions to begin our chat

Suggested prompts:

Can you explain how the code counts headlines with the word 'election'?

What other keywords could I search for in the headlines?

How can I modify the code to display the actual headlines containing 'election'?

bookIntroduction to Data in Journalism

Swipe to show menu

Data-driven journalism is transforming how newsrooms investigate, report, and present stories. By leveraging data, you can uncover trends, highlight disparities, and provide evidence-based insights that go beyond anecdotes or isolated events. Impactful reporting such as uncovering government spending irregularities, analyzing election results, or tracking the spread of diseases often relies on analyzing large datasets. Major news outlets have used data to reveal systemic issues in criminal justice, health care, and climate change by presenting interactive graphics and in-depth analysis.

Python has become an essential tool in the modern newsroom. Its versatility allows you to automate data collection, clean messy information, and analyze patterns efficiently. With Python, even those without a computer science background can quickly learn to handle data, making it possible to break complex stories and support investigative reporting with solid evidence.

123456789101112131415161718
# Define a list of dictionaries representing news headlines headlines = [ {"title": "Election Results Announced in Major Cities"}, {"title": "Local Schools Receive New Funding"}, {"title": "Wildfire Threatens Rural Communities"}, {"title": "Election Debate Draws Record Viewership"}, {"title": "Scientists Discover New Species"}, {"title": "City Council Approves Affordable Housing Plan"}, {"title": "Election Officials Prepare for High Turnout"}, {"title": "Sports Team Wins Championship"}, {"title": "Community Rallies for Flood Relief"}, {"title": "Election Polls Show Tight Race"} ] # Print the first five headlines for headline in headlines[:5]: print(headline["title"])
copy

By working with in-memory data structures like lists and dictionaries, you can quickly access, filter, and review large amounts of information. In the example above, the headlines list allows you to store multiple news items and instantly retrieve or process specific entries. This ability to handle data efficiently is especially valuable when deadlines are tight and stories are developing rapidly.

123456789
# Count the number of headlines containing the word 'election' keyword = "election" count = 0 for headline in headlines: if keyword.lower() in headline["title"].lower(): count += 1 print(f"Number of headlines containing '{keyword}':", count)
copy

1. What is one benefit of using Python for data-driven journalism?

2. Which Python library is commonly used for working with tabular data in memory?

3. Why is it important for journalists to be able to process large datasets efficiently?

question mark

What is one benefit of using Python for data-driven journalism?

Select the correct answer

question mark

Which Python library is commonly used for working with tabular data in memory?

Select the correct answer

question mark

Why is it important for journalists to be able to process large datasets efficiently?

Select the correct answer

Everything was clear?

How can we improve it?

Thanks for your feedback!

SectionΒ 1. ChapterΒ 1
some-alt