I'm a complete beginner trying to work with a 150GB labeled dataset on my MacBook for a fault detection project. My current workflow involves downloading the entire dataset locally, which makes my machine lag, run out of memory, and crash. I know processing this data locally isn't feasible, and I'm aware of cloud platforms like AWS and GCP, but I have no idea where to begin. Here are a few questions:
1. What's the first step to get my dataset onto the cloud? Should I start by uploading it to something like AWS S3?
2. Once the data is in the cloud, how do I actually run a Jupyter Notebook on it? Do I need to rent a virtual machine like an EC2 instance to connect to my data?
3. Is there a common workflow that most beginners use for projects like this?
4. How can I avoid racking up huge bills while using cloud services? What common mistakes should I be wary of?
5. What should be the very first thing I do today? Should I sign up for an AWS Free Tier account, or are there any beginner tutorials you recommend? Any advice, no matter how small, would be a huge help. Thanks!
1 Answer
First off, try to understand your data and figure out how to process it in smaller, manageable chunks. Simply moving everything to the cloud won't magically solve your problem; it may even cost you more in the long run. Adjust your processing strategy before you move anything to the cloud.
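
To make that concrete, here's a minimal sketch of chunked processing. It assumes, purely for illustration, that your labels live in a CSV file (a hypothetical `labels.csv` with a `label` column); pandas' `chunksize` option streams the file in pieces instead of loading all 150GB at once:

```python
import pandas as pd

CSV_PATH = "labels.csv"   # hypothetical file -- point this at your own data
CHUNK_ROWS = 100_000      # tune so a single chunk fits comfortably in RAM

label_counts = {}

# read_csv with chunksize returns an iterator of DataFrames,
# so peak memory is bounded by the chunk size, not the file size.
for chunk in pd.read_csv(CSV_PATH, chunksize=CHUNK_ROWS):
    # Per-chunk work goes here; tallying label frequencies is just an example.
    for label, count in chunk["label"].value_counts().items():
        label_counts[label] = label_counts.get(label, 0) + count

print(label_counts)
```

The same pattern works with any streaming reader: whatever per-chunk computation you do, only one chunk is ever in memory at a time.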

I've taken a look at my dataset, but could you share some more strategies for chunking it? I really appreciate the insight!