How can I use AWS Lambda to scrape pages without running into issues?

Asked By CuriousCoder2021 On

I'm building a web app where users can monitor specific URLs and get notifications via email whenever the content on those pages changes. I have some experience with AWS Lambda, and I'm planning to set up a workflow where I:

1. Store a list of URLs on a server.
2. Use a Lambda function triggered every 10 minutes to fetch this list.
3. Scrape the content from each page.
4. Send the scraped data back to my server for processing and notifying the users of any changes.

I believe this setup could work, but I'm concerned about potential problems, especially if the number of monitored pages or users increases. I'd love to hear any advice about my architecture and workflow. Does this method sound feasible? What should I consider?
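
To make this concrete, here's a rough sketch of the scheduled handler I have in mind. The endpoint paths and the idea of hashing each page body so the server can diff it against the previous run are just my assumptions at this point:

import hashlib
import json
import urllib.request

# Placeholder endpoints on my server; the real API doesn't exist yet.
URL_LIST_ENDPOINT = "https://example.com/api/urls"
RESULTS_ENDPOINT = "https://example.com/api/results"

def handler(event, context):
    # Step 2: fetch the list of URLs to monitor from my server.
    with urllib.request.urlopen(URL_LIST_ENDPOINT, timeout=10) as resp:
        urls = json.loads(resp.read())

    results = []
    for url in urls:
        # Step 3: scrape each page; hash the body so the server can
        # compare it with the previous run and detect changes.
        try:
            with urllib.request.urlopen(url, timeout=10) as page:
                body = page.read()
            results.append({"url": url, "hash": hashlib.sha256(body).hexdigest()})
        except Exception as exc:
            results.append({"url": url, "error": str(exc)})

    # Step 4: send everything back to the server, which decides whom to notify.
    req = urllib.request.Request(
        RESULTS_ENDPOINT,
        data=json.dumps(results).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10)
    return {"processed": len(results)}

The 10-minute trigger itself would just be an EventBridge schedule rule pointed at this function, so no code should be needed for that part.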

10 Answers

Answered By ResourcefulRex22 On

Have you checked out this project: https://github.com/dgtlmoon/changedetection.io? It might be worth looking into, especially if they have webhook capabilities that could fit your needs.

CuriousCoder2021 -

Thanks for the suggestion! I’ll look into it. Webhooks could be a great addition for my app.

Answered By CloudGuru_101 On

Consider a better architecture. You could use Step Functions to orchestrate the process: one Lambda fetches the page list, then fans out to worker Lambdas that each scrape a page and write the result to your database. Splitting the work per page scales much better than one function looping over every URL, and a failure on one site doesn't take down the whole run.
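
A minimal sketch of that fan-out, leaving Step Functions aside and simply having a dispatcher Lambda invoke a worker Lambda per URL asynchronously (the worker function name and the URL source here are made up):

import json
import boto3

lambda_client = boto3.client("lambda")

# Hypothetical worker function that scrapes a single page and stores the result.
WORKER_FUNCTION = "scrape-single-page"

def dispatcher_handler(event, context):
    urls = fetch_url_list()  # however you store them: DynamoDB, S3, or your server's API
    for url in urls:
        # InvocationType="Event" fires the worker asynchronously, so one
        # slow or blocked site doesn't hold up the rest of the run.
        lambda_client.invoke(
            FunctionName=WORKER_FUNCTION,
            InvocationType="Event",
            Payload=json.dumps({"url": url}),
        )
    return {"dispatched": len(urls)}

def fetch_url_list():
    # Placeholder; replace with your real data source.
    return ["https://example.com/page-a", "https://example.com/page-b"]

With Step Functions proper, a Map state over the URL list gives you the same fan-out plus built-in retries and a concurrency cap, which matters once the list gets long.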

Answered By ScrappyDev89 On

Be careful about where your requests originate. Websites will see them coming from AWS IP ranges, and many run anti-bot measures that block traffic from known cloud providers outright. You'll want a plan for handling that.
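
One common mitigation, assuming you have access to some proxy service (the proxy URL and headers below are placeholders), is to route requests through the proxy and send browser-like headers, e.g. with the requests library:

import requests

# Placeholder proxy; a real setup would use a rotating or residential
# proxy service instead of hitting sites straight from AWS IP space.
PROXIES = {
    "http": "http://user:pass@proxy.example.com:8080",
    "https": "http://user:pass@proxy.example.com:8080",
}

# Browser-like headers; many anti-bot checks reject the default
# python-requests User-Agent outright.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}

def fetch_page(url):
    response = requests.get(url, headers=HEADERS, proxies=PROXIES, timeout=15)
    response.raise_for_status()
    return response.text

None of this guarantees access to sites with aggressive bot detection, and note that requests isn't in the default Lambda runtime, so it has to be bundled with your deployment package or added as a layer.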
