What’s the best way to speed up my data scraping code?

Asked By CuriousCat456 On

Hey everyone! My web scraping code is pretty slow and takes ages because I'm gathering a large amount of data. I'm new to the cloud, so I'm looking for advice on which service or instance type would get it running in a reasonable time. I've already tried a t2.xlarge instance, but it still takes too long. Any suggestions?

3 Answers

Answered By DataDynamo23 On

The first step is figuring out what is actually causing the bottleneck. If your code processes URLs one by one, it spends most of its time waiting on server responses, so fetching pages in parallel should speed things up considerably. A scraping library for your programming language might also handle this for you!
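To make the parallel fetching idea concrete, here is a minimal sketch in Python using only the standard library. The question doesn't say which language the scraper is written in, so Python is an assumption, and the names `fetch`, `fetch_all`, `fetch_one`, and the default of 16 workers are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=16):
    """Fetch many URLs concurrently.

    Returns a dict mapping each URL to its body, or to the exception
    raised while fetching it. `fetch_one` is injectable so the download
    step can be swapped out (for example, in a test).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front so the requests overlap in time.
        futures = {url: pool.submit(fetch_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # one bad page should not stop the rest
                results[url] = exc
    return results
```

Calling `fetch_all(list_of_urls)` would then replace a one-by-one loop. Threads are enough here because the work is mostly waiting on the network; it's worth keeping `max_workers` modest so you don't hammer the site you're scraping.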

Answered By CodeWhisperer777 On

Upgrading from a t2.xlarge, which is fairly small and dated, to something massive like a c6i.48xlarge could help, but consider the jump in cost. There are plenty of intermediate sizes to try before going straight to a 48xlarge! It’s also important to make sure your code actually uses the resources you already have, for example through multi-threading or async requests. Have you looked into that?
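For the async route, a rough sketch with Python's `asyncio` might look like the following. Python is again an assumption, and `fetch_page` is a hypothetical stand-in that only simulates a request with a short sleep; in real code it would be replaced by a call to whatever async HTTP client you use. The `limit` of 20 is arbitrary.

```python
import asyncio


async def fetch_page(url):
    """Hypothetical stand-in for a real async HTTP request."""
    await asyncio.sleep(0.1)  # simulates waiting on the server
    return f"<html>{url}</html>"


async def scrape(urls, fetch_one=fetch_page, limit=20):
    """Run fetch_one over urls with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        # The semaphore caps concurrency so the target site is not flooded.
        async with semaphore:
            return await fetch_one(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

You would run it with `asyncio.run(scrape(list_of_urls))`. Since a single thread can keep many requests waiting at once, this kind of change often helps more than a bigger instance does.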

Answered By TechGuru99 On

It sounds like you don't yet know what's causing the slowdown in your code. Simply throwing bigger hardware at it is usually a bad strategy. You'll want to find out whether CPU, memory, storage, or network is the limiting factor. Profile your code first, then decide on your next move so you don't waste money on AWS resources you don't need.
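One quick first check, sketched here in Python as an assumption about the scraper's language, is to compare wall-clock time with CPU time for a single run. If the CPU was idle for most of the elapsed time, the code is waiting on I/O (network or disk), and more cores won't help. The helper names and the 0.5 threshold are made up for illustration, and this check says nothing about memory pressure.

```python
import time


def measure(func, *args, **kwargs):
    """Run func once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def classify(wall, cpu, threshold=0.5):
    """Rough label: 'cpu-bound' if the CPU was busy for most of the run.

    The threshold is an arbitrary cut-off, not an established rule.
    """
    if wall <= 0:
        return "unknown"
    return "cpu-bound" if cpu / wall >= threshold else "io-bound"
```

For example, `_, wall, cpu = measure(scrape_one_page, url)` followed by `classify(wall, cpu)`, where `scrape_one_page` is whatever function does the work for one page. For a finer breakdown, the standard library's `cProfile` module shows which functions the time goes to.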
