What’s the best way to scrape data from over 2000 websites efficiently?

Asked By CuriousCoder92 On

Hey everyone! I'm looking for some guidance on a project I have in mind. I want to create a website filled with data that updates automatically every month (and eventually weekly or even daily) from more than 2000 different websites. My goal is to let users filter the data by subject and category.

I don't want to share all the details because I'm concerned someone might take the idea and monetize it, whereas I want it to stay accessible to everyone. I have connections in the field who will help me.

Now, the challenge I'm facing is figuring out how to scrape data on such a large scale. Scraping a single website isn't the issue, but gathering data from many sites is harder, especially since some sites require extra clicks to reach the information and others publish it as PDFs or images. I'm looking for methods to extract a varying amount of data per site, anywhere from 4 to 300 items, along with titles and text.

Is it feasible to implement this on an existing WordPress website built with the free version of Elementor? I'm aware that most scraping tools carry a substantial monthly cost, and while I can cover some initial costs, I hope this could ultimately become a project run under a foundation's banner. Thanks for taking the time to read this!

2 Answers

Answered By WebWizard99 On

Scraping data from over 2000 websites is definitely doable! You can start with a simple Python script to handle the initial scrape. The real challenge is not the scraping itself but extracting the data from all those diverse sources and integrating it into one consistent format.
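
To illustrate the integration point, here is a minimal sketch of one possible approach: keep a small set of extraction rules per site and map every page onto the same record shape, so the filtering by subject and category works on uniform data. The site names, tag choices, categories, and the `Record` fields are all made up for the example, and it uses only Python's standard library. Real sites will usually need more robust selectors than "first matching tag".

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Record:
    """Common shape every site's data is normalized into (hypothetical fields)."""
    source: str
    title: str
    text: str
    category: str


class FirstTagText(HTMLParser):
    """Collect the text inside the first occurrence of each wanted tag.

    Deliberately simplistic: it does not handle wanted tags nested in each other.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # tag we are currently collecting text for
        self.found = {}       # tag name -> collected text

    def handle_starttag(self, tag, attrs):
        if tag in self.wanted and tag not in self.found:
            self.current = tag
            self.found[tag] = ""

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.found[self.current] += data


# Hypothetical per-site rules; in practice there would be one entry per source.
SITE_RULES = {
    "site-a.example": {"title_tag": "h1", "text_tag": "p", "category": "grants"},
    "site-b.example": {"title_tag": "h2", "text_tag": "div", "category": "events"},
}


def extract(source, html):
    """Apply the rules for `source` to an already downloaded HTML string."""
    rule = SITE_RULES[source]
    parser = FirstTagText([rule["title_tag"], rule["text_tag"]])
    parser.feed(html)
    return Record(
        source=source,
        title=parser.found.get(rule["title_tag"], "").strip(),
        text=parser.found.get(rule["text_tag"], "").strip(),
        category=rule["category"],
    )


if __name__ == "__main__":
    page = "<html><h1> Open call </h1><p>Apply by June.</p></html>"
    print(extract("site-a.example", page))
```

The download step is left out on purpose: whatever fetches the pages, the per-site rules and the shared record are the part that has to scale to 2000 sources.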

Answered By DataDiva21 On

Don't forget to respect ethical scraping practices! Make sure to review the data laws relevant to your location and the websites you're targeting. It’s important to stay on the right side of legality when you're gathering information.
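
One common courtesy check, sketched below, is to read each site's robots.txt before fetching and to honor any crawl delay it states. This is only a convention and does not replace reviewing the applicable laws or a site's terms of use. The user agent name `example-bot` and the sample rules are assumptions for the example; the parsing is done by `urllib.robotparser` from Python's standard library.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt):
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(policy, url, user_agent="example-bot"):
    """Return True if the parsed rules allow `user_agent` to fetch `url`."""
    return policy.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sample rules for a hypothetical site.
    sample = "User-agent: *\nDisallow: /private/\nCrawl-delay: 5\n"
    policy = build_policy(sample)
    print(may_fetch(policy, "https://site-a.example/public/page"))
    print(may_fetch(policy, "https://site-a.example/private/report"))
    print(policy.crawl_delay("example-bot"))  # seconds to wait between requests
```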
