How Can I Use AWS Lambda to Scrape URLs Periodically?

Asked By CuriousCoder92 On

I'm developing a web app that allows users to monitor certain URLs for changes and sends notifications via email when content updates occur. My idea is to set up an AWS Lambda function to check these pages every 10 minutes. Here's the workflow I have in mind:

1. The Lambda function fetches a list of URLs from a server.
2. It scrapes the content from those URLs.
3. The scraped data gets sent back to the server, which handles identifying changes and notifying users.
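
The three steps above can be sketched roughly as follows. This is only a minimal outline, not a definitive implementation: the two endpoint URLs are hypothetical placeholders for your own server, the URL list is assumed to come back as a JSON array of strings, and sending a SHA-256 hash instead of the full page body is an assumption made to keep the payload small. The 10-minute trigger would live outside the code, for example in an EventBridge schedule such as `rate(10 minutes)`.

```python
import hashlib
import json
import urllib.request

# Hypothetical endpoints on your own server; replace with real ones.
URL_LIST_ENDPOINT = "https://example.com/api/monitored-urls"
RESULTS_ENDPOINT = "https://example.com/api/scrape-results"


def http_get(url, timeout=10):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def http_post_json(url, payload, timeout=10):
    """POST a JSON payload and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def scrape_all(urls, fetch=http_get):
    """Fetch each URL; record a content hash, or the error if the fetch fails."""
    results = []
    for url in urls:
        try:
            body = fetch(url)
            results.append({"url": url, "sha256": hashlib.sha256(body).hexdigest()})
        except Exception as exc:  # one bad page should not abort the whole run
            results.append({"url": url, "error": str(exc)})
    return results


def lambda_handler(event, context):
    urls = json.loads(http_get(URL_LIST_ENDPOINT))       # step 1: get the URL list
    results = scrape_all(urls)                           # step 2: scrape each page
    status = http_post_json(RESULTS_ENDPOINT, results)   # step 3: report back
    return {"scraped": len(results), "server_status": status}
```

Because `scrape_all` takes the fetch function as a parameter, it can be exercised locally with a fake fetcher before anything is deployed.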

I'm a bit concerned about potential issues that could arise if the number of monitored pages or users grows. Does this plan seem feasible? What should I consider for scaling and performance?

5 Answers

Answered By LambdaLover On

Scaling shouldn't be a big deal with Lambda, since it's designed for this type of task. The more likely problem is target websites blocking your IP for sending too many requests. Balance your scraping frequency against that risk, and monitor how many requests you make to each site as you scale up.
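
One way to keep the request rate in check is to enforce a minimum gap between requests to the same host. Below is a small sketch of that idea; the `HostThrottle` class and its 5-second default are made up for illustration. It only tracks requests within a single invocation, so concurrent Lambda instances would need shared state (not shown) to coordinate.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_interval=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the most recent request

    def wait(self, url):
        """Sleep until `url`'s host may be requested; return the seconds slept."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last_request[host] = now + delay
        return delay
```

Calling `throttle.wait(url)` right before each fetch spaces out hits to one site while leaving requests to different hosts unaffected.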

Answered By TechieTom On

You might run into issues here because the sites you scrape will see requests coming from AWS IP ranges. Some websites block these ranges as part of their bot protection, so keep that in mind.

Answered By WebWatcher On

Check out the GitHub project called changedetection.io. It seems like it could be a good fit if you're looking for something with webhook support!

CuriousCoder92 -

Thanks! This looks interesting. It could work for me if they offer webhooks of some sort.

Answered By ScrapingGuru On

Consider structuring your app with AWS Step Functions. You could have one Lambda function fetch the page list and send it to a queue, which then triggers additional Lambda workers to handle the scraping and data storage. Separating the dispatcher from the workers keeps each function simple and lets the scraping run in parallel as the list grows.
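
Here is a rough sketch of the queue-based fan-out, assuming the queue is Amazon SQS (the answer doesn't name one). The function names and the 25-URLs-per-message grouping are arbitrary choices for illustration. The dispatcher groups URLs into messages and sends them in batches of 10, which is the limit for SQS `SendMessageBatch`. A Step Functions Map state could fill the same role instead of a queue.

```python
import json


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_sqs_batches(urls, urls_per_message=25):
    """Group URLs into messages, then messages into batches of 10 entries."""
    messages = [
        {"Id": str(i), "MessageBody": json.dumps({"urls": group})}
        for i, group in enumerate(chunked(urls, urls_per_message))
    ]
    return list(chunked(messages, 10))


def dispatch(urls, queue_url):
    """Dispatcher Lambda body: enqueue work for the scraper workers."""
    import boto3  # imported here so the helpers above run without it

    sqs = boto3.client("sqs")
    for entries in build_sqs_batches(urls):
        # A fuller version would inspect the response's "Failed" list and retry.
        sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)


def worker_handler(event, context):
    """Worker Lambda triggered by SQS: each record carries one group of URLs."""
    urls = []
    for record in event["Records"]:
        urls.extend(json.loads(record["body"])["urls"])
    return urls  # a real worker would scrape and store these pages
```

With this split, the number of concurrent workers follows the queue depth, and a concurrency limit on the worker function can cap how hard any burst hits the target sites.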
