Challenge: Threaded Web Scraper | Advanced Patterns and Best Practices
Python Multithreading and Multiprocessing

Challenge: Threaded Web Scraper

Imagine you need to collect information from a list of web pages, such as the latest news headlines or product prices from several sites. Fetching the pages sequentially, one after another, is slow, especially when some pages take a long time to respond. Threads let you fetch several pages at the same time: while one thread waits on the network, the others keep working, so you can retrieve data from several sites concurrently and process each result as soon as it arrives.
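
To make the effect of overlapping waiting times concrete, here is a minimal sketch that replaces real network calls with time.sleep. The function simulated_fetch, the site names, and the delays are all hypothetical and only stand in for slow responses; they are not part of the challenge code.

```python
import threading
import time


def simulated_fetch(name, delay, results):
    """Pretend to download a page: sleep stands in for network latency."""
    time.sleep(delay)
    results[name] = f"fetched {name}"


# Hypothetical sites and response times in seconds.
delays = {"site-a": 0.3, "site-b": 0.2, "site-c": 0.1}

# Sequential: total time is roughly the sum of all delays.
start = time.perf_counter()
sequential_results = {}
for name, delay in delays.items():
    simulated_fetch(name, delay, sequential_results)
sequential_time = time.perf_counter() - start

# Threaded: the waits overlap, so total time is roughly the longest delay.
start = time.perf_counter()
threaded_results = {}
threads = [
    threading.Thread(target=simulated_fetch, args=(name, delay, threaded_results))
    for name, delay in delays.items()
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Both runs produce the same results; only the elapsed time differs, because the threaded version spends its waiting time in all three simulated requests at once.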

Task

Your goal is to implement a threaded web scraper that fetches the content of multiple URLs in parallel and processes the results.

  • Complete the function fetch_url_content(url) so that it downloads and returns the first 100 characters of the response content from the provided url.
  • In the main() function, use the threading module to start a new thread for each URL in the urls list. Each thread calls fetch_url_content(url) and prints the result in the format: Content from {url}: {snippet}, where {url} is the URL and {snippet} is the first 100 characters returned by fetch_url_content.
  • Ensure the main thread waits for all threads to finish before exiting.
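
The three steps above can be sketched as follows. This is one possible approach under stated assumptions, not the course's reference solution: it uses the standard-library urllib.request because the task does not name an HTTP library, and the URLS list, the worker helper, the timeout parameter, the error string, and the print lock are all illustrative additions.

```python
import threading
import urllib.request

# Hypothetical example URLs; replace them with the pages you need.
URLS = [
    "https://example.com",
    "https://example.org",
]

# Assumption: a lock keeps lines printed by different threads from interleaving.
print_lock = threading.Lock()


def fetch_url_content(url, timeout=10):
    """Return the first 100 characters of the response body for url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            text = response.read().decode(charset, errors="replace")
    except (OSError, ValueError, LookupError) as exc:
        # Assumption: report the failure as the snippet instead of crashing the thread.
        return f"<error: {exc}>"
    return text[:100]


def worker(url):
    """Thread target: fetch one URL and print its snippet."""
    snippet = fetch_url_content(url)
    with print_lock:
        print(f"Content from {url}: {snippet}")


def main(urls=URLS):
    threads = []
    for url in urls:
        thread = threading.Thread(target=worker, args=(url,))
        thread.start()
        threads.append(thread)
    # Wait for every thread to finish before the main thread exits.
    for thread in threads:
        thread.join()
```

Calling main() prints one line per URL. The lines appear in completion order, which can differ from the order of the list because faster responses finish first.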

Section 4. Chapter 4