Introduction to Content Analysis | Automation and Content Analysis in the Newsroom

Content analysis is a method for examining large volumes of text (such as news articles, reports, or social media posts) to uncover patterns, trends, or underlying themes. For journalists, it is useful for understanding how stories are framed, tracking the prevalence of certain topics, and identifying shifts in public discourse. Common use cases include measuring the frequency of keywords or phrases, detecting sentiment, and comparing coverage across different outlets. With Python, you can automate these tasks, making it possible to analyze hundreds or thousands of articles quickly and consistently. This approach helps you move beyond anecdotal impressions and base your reporting on measurable evidence.


# Count the frequency of keywords in a collection of news articles

articles = [
    "The mayor announced a new housing policy today.",
    "City council debates new housing measures.",
    "Housing shortage continues to affect thousands.",
    "New housing policy aims to solve the crisis.",
    "Critics say housing policy does not go far enough."
]

keywords = ["housing", "policy", "mayor", "council", "crisis"]

# Convert all articles to lowercase for consistent matching
articles_lower = [article.lower() for article in articles]

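# Count keyword frequencies across all articles.
# Note: str.count() matches substrings, so "policy" would also match "policymakers".
keyword_counts = {}
for keyword in keywords:
    count = sum(article.count(keyword) for article in articles_lower)
    keyword_counts[keyword] = count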

print("Keyword frequencies:")
for keyword, count in keyword_counts.items():
    print(f"{keyword}: {count}")

By analyzing how often certain keywords appear in your articles, you gain insight into which topics dominate your coverage. In the sample above, "housing" appears five times and "policy" three times, while "mayor", "council", and "crisis" each appear once, which suggests these stories center on housing issues rather than political figures. This type of keyword analysis can also help reveal potential bias or recurring themes in reporting. If some terms are overrepresented or others are rarely mentioned, it might indicate an imbalance in how stories are framed, although raw counts are a starting point for closer reading rather than proof of bias. Python makes it easy to perform this analysis, allowing you to quickly spot patterns that might influence your editorial decisions.
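The counting approach described above can be refined in a couple of ways. The following is a minimal sketch, not a definitive method: it assumes you want whole-word matches (so that "policy" does not also match inside "policymakers") and that the share of articles mentioning a term is a useful complement to the raw total. The helper name `keyword_stats` is hypothetical and introduced only for illustration.

```python
import re

articles = [
    "The mayor announced a new housing policy today.",
    "City council debates new housing measures.",
    "Housing shortage continues to affect thousands.",
    "New housing policy aims to solve the crisis.",
    "Critics say housing policy does not go far enough.",
]
keywords = ["housing", "policy", "mayor", "council", "crisis"]


def keyword_stats(texts, terms):
    """Return {term: (total whole-word matches, share of texts mentioning it)}.

    Hypothetical helper for illustration only.
    """
    stats = {}
    for term in terms:
        # \b anchors the match at word boundaries; re.escape guards against
        # keywords that contain regex metacharacters.
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        per_text = [len(pattern.findall(text)) for text in texts]
        total = sum(per_text)
        share = sum(1 for n in per_text if n > 0) / len(texts) if texts else 0.0
        stats[term] = (total, share)
    return stats


stats = keyword_stats(articles, keywords)
for term, (total, share) in stats.items():
    print(f"{term}: {total} mentions, in {share:.0%} of articles")
```

Reporting the share of articles alongside the total helps distinguish a term that is spread across the coverage from one that is repeated many times in a single story.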


# Identify the most common words in a set of news headlines

from collections import Counter

headlines = [
    "Mayor launches new housing initiative",
    "Housing crisis deepens in the city",
    "Council approves housing policy",
    "New policy aims to address housing shortage",
    "Critics question effectiveness of housing plan"
]

# Split headlines into words and convert to lowercase
all_words = []
for headline in headlines:
    words = headline.lower().split()
    all_words.extend(words)

# Count word frequencies
word_counts = Counter(all_words)

# Get the five most common words
most_common = word_counts.most_common(5)

print("Most common words in headlines:")
for word, count in most_common:
    print(f"{word}: {count}")
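Splitting on whitespace treats function words such as "the" and "of" like any other word, and it would keep punctuation attached to words in longer texts. As a hedged sketch of one common refinement, the version below extracts alphabetic tokens with a regular expression and drops a small stop-word list before counting. The `STOP_WORDS` set and the `content_words` helper are assumptions made for illustration; a real project would likely use a longer, curated list.

```python
import re
from collections import Counter

headlines = [
    "Mayor launches new housing initiative",
    "Housing crisis deepens in the city",
    "Council approves housing policy",
    "New policy aims to address housing shortage",
    "Critics question effectiveness of housing plan",
]

# Illustrative stop-word list (an assumption, deliberately short)
STOP_WORDS = {"a", "an", "the", "in", "of", "to", "and", "for", "on"}


def content_words(text):
    """Lowercase the text, extract alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


word_counts = Counter()
for headline in headlines:
    word_counts.update(content_words(headline))

print("Most common content words in headlines:")
for word, count in word_counts.most_common(3):
    print(f"{word}: {count}")
```

With the stop words removed, the ranking reflects topical vocabulary rather than grammar, which makes the top of the list easier to interpret.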

1. What is content analysis in journalism?

2. How can keyword frequency help journalists?

3. Fill in the blank: To count occurrences of a word in a string, use _ _ _.


Section 3. Chapter 2
