Section 4. Chapter 3
Challenge: Async Web Scraper with Rate Limiting
You are building a production-ready async data pipeline that fetches posts and their comments from jsonplaceholder.typicode.com. The pipeline must handle concurrency, apply rate limiting, enforce timeouts, and gracefully recover from individual failures.
The following endpoints are available:
- `https://jsonplaceholder.typicode.com/posts/{post_id}` – returns a post object with `title` and `userId` fields;
- `https://jsonplaceholder.typicode.com/comments?postId={post_id}` – returns a list of comments for a given post.
Task
- Define an async function `fetch_post(client, semaphore, post_id)` that:
  - uses the provided `semaphore` to limit concurrency;
  - fetches the post with a `4.0` second timeout using `asyncio.wait_for()`;
  - returns a dict with keys `post_id`, `title`, and `user_id`;
  - returns `None` on any exception.
- Define an async function `fetch_comment_count(client, semaphore, post_id)` that:
  - uses the provided `semaphore` to limit concurrency;
  - fetches the comments list with a `4.0` second timeout;
  - returns the number of comments as an integer;
  - returns `0` on any exception.
- Define an async function `build_report(client, semaphore, post_id)` that:
  - calls `fetch_post()` and `fetch_comment_count()` concurrently using `asyncio.gather()`;
  - returns `None` if `fetch_post()` returned `None`;
  - otherwise returns a formatted string: `"[User {user_id}] {title} – {comment_count} comments"`.
- Define an async function `main()` that:
  - creates an `asyncio.Semaphore` with a limit of `5`;
  - builds reports for post IDs `1` through `10` concurrently using `asyncio.gather()`;
  - prints each non-`None` result on a separate line.
- Run `main()` using `asyncio.run()`.
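Before touching the network, it can help to see what the semaphore and timeout requirements amount to in isolation. The sketch below is only an illustration: `limited_call`, `demo`, and the `stats` dict are hypothetical names, `asyncio.sleep()` stands in for an HTTP request, and the timeout is shortened from `4.0` seconds so the demo finishes quickly.

```python
import asyncio


async def limited_call(semaphore, delay, stats, timeout):
    """Simulate one request: True on success, False if it times out."""
    async with semaphore:  # at most N callers get past this line at once
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        try:
            # asyncio.sleep() stands in for the real HTTP call (assumption).
            await asyncio.wait_for(asyncio.sleep(delay), timeout=timeout)
            return True
        except asyncio.TimeoutError:
            return False  # one slow call does not break the others
        finally:
            stats["active"] -= 1


async def demo():
    semaphore = asyncio.Semaphore(5)
    stats = {"active": 0, "peak": 0}
    # 19 fast calls plus one that is slower than the timeout.
    delays = [0.01] * 19 + [1.0]
    results = await asyncio.gather(
        *(limited_call(semaphore, d, stats, timeout=0.3) for d in delays)
    )
    return results, stats["peak"]
```

Running `asyncio.run(demo())` returns the per-call outcomes and the highest number of calls that were in flight at the same moment, which should never exceed the semaphore limit.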
Solution
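One possible solution is sketched below. It is not the course's reference answer. It assumes that `client` is an `httpx.AsyncClient` (the task does not name the HTTP library), so any object with a compatible async `get()` method would work as well. The constants `BASE_URL` and `TIMEOUT_SECONDS` are names introduced here for readability.

```python
import asyncio

BASE_URL = "https://jsonplaceholder.typicode.com"
TIMEOUT_SECONDS = 4.0


async def fetch_post(client, semaphore, post_id):
    """Return a dict describing the post, or None on any failure."""
    try:
        async with semaphore:  # hold a slot only while the request runs
            response = await asyncio.wait_for(
                client.get(f"{BASE_URL}/posts/{post_id}"),
                timeout=TIMEOUT_SECONDS,
            )
        response.raise_for_status()
        data = response.json()
        return {"post_id": post_id, "title": data["title"], "user_id": data["userId"]}
    except Exception:  # timeouts, HTTP errors, malformed payloads
        return None


async def fetch_comment_count(client, semaphore, post_id):
    """Return the number of comments for the post, or 0 on any failure."""
    try:
        async with semaphore:
            response = await asyncio.wait_for(
                client.get(f"{BASE_URL}/comments", params={"postId": post_id}),
                timeout=TIMEOUT_SECONDS,
            )
        response.raise_for_status()
        return len(response.json())
    except Exception:
        return 0


async def build_report(client, semaphore, post_id):
    """Fetch the post and its comment count concurrently and format one line."""
    post, comment_count = await asyncio.gather(
        fetch_post(client, semaphore, post_id),
        fetch_comment_count(client, semaphore, post_id),
    )
    if post is None:
        return None
    return f"[User {post['user_id']}] {post['title']} – {comment_count} comments"


async def main():
    # Assumption: httpx is the HTTP client. It is imported here so the
    # helper functions above stay usable with any compatible client object.
    import httpx

    semaphore = asyncio.Semaphore(5)
    async with httpx.AsyncClient() as client:
        reports = await asyncio.gather(
            *(build_report(client, semaphore, post_id) for post_id in range(1, 11))
        )
    for report in reports:
        if report is not None:
            print(report)
```

To run the pipeline, call `asyncio.run(main())`; this needs `httpx` installed and network access. The semaphore is acquired inside the two fetch functions rather than in `build_report()`, so a report never holds a slot while waiting for its own requests, and the timeout covers only the request itself, not the time spent waiting for a free slot.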