Hey everyone! I'm working with a series of local hard drives holding about 200TB of data, but I only need to access around 1GB of it at a time for model training on an EC2 instance. Storing all that data on AWS would not only be expensive (around $2K a month) but would also raise privacy and confidentiality concerns, so I'm looking for a way to keep the data local and just use the EC2 instance to process small batches. I know there will be latency when loading each batch from local storage to the EC2 instance and then clearing it out, but I'm willing to accept that trade-off. Is there a way to make this work, or are there better alternatives that avoid hefty S3 storage fees for data I won't need constantly? Thanks in advance!
5 Answers
You might want to look at EC2 instances that come with NVMe instance store (ephemeral) volumes. That gives you the fast local I/O you need: stage each batch onto the instance store, run your compute against it, then delete it. Just remember it's ephemeral storage, so anything on it disappears when the instance stops or terminates; plan to re-stage rather than rely on it persisting.
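A rough sketch of that stage/train/clean-up loop, assuming an instance-store volume mounted at /mnt/nvme and rsync over SSH back to your local box (the mount point, host name, and batch names are all placeholders, not anything from your setup):

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical staging loop: every path and name here is a placeholder.
SCRATCH = Path("/mnt/nvme/scratch")              # assumed instance-store mount
REMOTE = "user@local-workstation:/data/batches"  # assumed rsync-reachable source

def stage_batch(batch_name: str) -> Path:
    """Pull one ~1GB batch onto the ephemeral NVMe volume."""
    dest = SCRATCH / batch_name
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(["rsync", "-a", f"{REMOTE}/{batch_name}/", f"{dest}/"], check=True)
    return dest

def cleanup(batch_dir: Path) -> None:
    """The instance store is wiped on stop anyway, but free space between batches."""
    shutil.rmtree(batch_dir, ignore_errors=True)

for name in ["batch_0001", "batch_0002"]:        # placeholder batch names
    path = stage_batch(name)
    # train_on(path)                             # your training step goes here
    cleanup(path)
```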
If privacy is the worry, consider staging onto an encrypted EBS volume. You could pull one chunk of data onto it, start processing, and pull the next chunk in parallel while the current one is being computed. Encryption at rest at least keeps the staged data protected while it sits on AWS.
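Here's a minimal sketch of that overlap (one chunk computing while the next one transfers). Again, the mount point, remote host, batch names, and the rsync-based pull are placeholder assumptions; swap in whatever transfer tool you actually use:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import shutil
import subprocess

# Assumed setup: an encrypted EBS volume mounted at /mnt/encrypted and batches
# reachable over SSH from the instance.
STAGE = Path("/mnt/encrypted/stage")
REMOTE = "user@local-workstation:/data/batches"

def upload(batch: str) -> Path:
    """Pull one chunk onto the encrypted volume."""
    dest = STAGE / batch
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(["rsync", "-a", f"{REMOTE}/{batch}/", f"{dest}/"], check=True)
    return dest

def process(path: Path) -> None:
    # train_on(path)                         # your training step goes here
    shutil.rmtree(path, ignore_errors=True)  # clear the chunk once it's consumed

batches = ["batch_0001", "batch_0002", "batch_0003"]  # placeholder names
with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(upload, batches[0])
    for nxt in batches[1:]:
        ready = pending.result()            # wait until this chunk has landed
        pending = pool.submit(upload, nxt)  # start pulling the next chunk...
        process(ready)                      # ...while this one is being computed
    process(pending.result())               # last chunk
```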
Honestly, it sounds like you're stuck uploading data to AWS in some form. Storage Gateway is mostly meant for moving data into S3, not for giving EC2 direct access to your local storage. My thinking is you'd have to pick the data you need, upload it, and accept that routing it through a VPN may only add latency.
That was my hunch. Still hoping for a workaround, though!
Another thought: put your drives behind a NAS and connect it to your VPC over a VPN or through a Transit Gateway. It could save you a lot of trouble: the data stays local, but the EC2 instance can still reach it like any other network share.
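Assuming the NAS export ends up mounted on the instance (say over NFS at /mnt/nas; that path and the batch name are made up), the read path on the EC2 side could be as simple as:

```python
import shutil
from pathlib import Path

# Assumed layout: NAS export mounted at /mnt/nas over the VPN/Transit Gateway link,
# with a local scratch directory for the batch currently being trained on.
NAS_MOUNT = Path("/mnt/nas/datasets")
SCRATCH = Path("/tmp/current_batch")

def pull_batch(batch_name: str) -> Path:
    """Copy one ~1GB batch off the NAS so training reads local disk, not the WAN."""
    if SCRATCH.exists():
        shutil.rmtree(SCRATCH)                         # drop the previous batch
    shutil.copytree(NAS_MOUNT / batch_name, SCRATCH)
    return SCRATCH

batch_dir = pull_batch("batch_0001")                   # placeholder batch name
# train_on(batch_dir)                                  # your training step goes here
```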
Are you planning to feed the EC2 instance 1GB chunks and process one batch at a time? If so, yes, you'd be uploading those 1GB increments over and over, but data transferred into AWS isn't charged, so even if all 200TB eventually flows through the instance over time, the ingress itself is free. You only pay for what you actually keep stored on AWS, so you could push data straight to the instance, let EC2 do the heavy lifting, and never touch S3 at all.
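A quick back-of-envelope to make that concrete (the batch rate is a made-up assumption; the only pricing fact relied on here is that inbound transfer to AWS is free):

```python
# Rough monthly ingress if you keep cycling 1GB batches through the instance.
# The batches-per-hour figure is an assumption for illustration only.
batch_gb = 1
batches_per_hour = 4            # assumed training throughput
hours_per_month = 730

monthly_ingress_gb = batch_gb * batches_per_hour * hours_per_month
print(f"~{monthly_ingress_gb} GB uploaded into EC2 per month")

# Inbound data transfer to AWS is free, so that ingress costs $0.
# The real bill is the EC2 instance hours plus whatever scratch EBS volume you attach.
print("ingress cost: $0 (AWS doesn't charge for inbound transfer)")
```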
That makes sense. Yes, I'll be processing the batches sequentially on the same instance. Just trying to figure out the most cost-effective way to manage all this data!
That sounds like a solid option! I'd love to avoid S3 entirely if possible, though. I guess I just need to stage enough data at once to make it efficient!