How Can I Optimize S3 Costs with EFS and File Processing?

Asked By TechWhiz247

Hey everyone! I'm facing a challenge with my system: thousands of devices produce around 12 million XML data files daily, mostly small ones under 128KB. The files land in an S3 bucket, and listing the bucket every two hours to extract device serial numbers and timestamps has pushed the request costs to about $600 per month.

I'm considering writing the files to Amazon EFS as temporary storage, combining them into larger files every hour, and then moving the combined files back to S3, which would cut the object count and the listing costs. Is there a way to trigger an event on EFS when files are created or modified, similar to S3 event notifications that invoke Lambdas? For context, the files are organized by device and date. Any suggestions or solutions would be greatly appreciated!

4 Answers

Answered By DataPro732

We’ve built a layer over S3 for handling uploads in bulk. It's tailored for devices with usage patterns like yours, and I’d be happy to discuss the details if you want to connect!

Answered By GadgetGuru21

Are the files deleted after processing, or do you keep them long-term? Since they're small, have you thought about pushing them through an SQS FIFO queue for processing instead of listing the bucket? If you need long-term retention, transition them to Glacier after processing. Also, remember to set up an S3 gateway endpoint in your VPC so you aren't paying for internet data transfer!
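
In case the FIFO route is useful, here is a minimal sketch of the enqueue side, assuming each upload pushes a reference to the new object onto the queue with the device serial as the message group. The queue URL, bucket, and field names are placeholders, not anything from your setup:

```python
# Hypothetical sketch: enqueue each uploaded object's key on an SQS FIFO queue,
# grouped per device so messages for one device are processed in order.
import json
import boto3

sqs = boto3.client("sqs")

# Placeholder value -- substitute your own FIFO queue URL.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/device-files.fifo"

def enqueue_object(bucket: str, key: str, device_id: str) -> None:
    """Send one S3 object reference to the FIFO queue for later processing."""
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"bucket": bucket, "key": key}),
        MessageGroupId=device_id,       # keeps per-device ordering
        MessageDeduplicationId=key,     # dedupes retries of the same object
    )
```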

TechWhiz247 -

I keep them for about two years in S3 Glacier. Most are small but there are bigger files too. SQS wouldn’t fit my setup unfortunately.

Answered By EcoSaver99

If your devices can push files directly to S3, consider structuring the object keys by device and time. Then enable S3 event notifications to trigger a downstream process that batches and combines the files. If you do go the local/EFS staging route instead and the hosts run Linux, `inotify` can trigger a process on file creation.
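
As a rough illustration (not a drop-in solution), a Lambda handler along these lines could pull the serial and timestamp straight from the object key on every `ObjectCreated` event, so nothing ever has to list the bucket. The key layout assumed here (`serial/YYYY/MM/DD/HHMMSS.xml`) is made up; adjust the parsing to your actual structure:

```python
# Rough sketch of a Lambda handler fired by S3 "ObjectCreated" notifications.
# Assumes a hypothetical key layout of  device-serial/YYYY/MM/DD/HHMMSS.xml
import urllib.parse

def handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        # Pull the device serial and timestamp straight from the key,
        # so no ListObjects calls are needed at all.
        serial, year, month, day, filename = key.split("/")
        timestamp = f"{year}-{month}-{day}T{filename.removesuffix('.xml')}"

        print(f"{bucket}: device {serial} reported at {timestamp}")
        # ...hand off to a batching/aggregation step here...
```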

TechWhiz247 -

But won’t `inotify` be incompatible with EFS?

Answered By FileMasterX

We handle a lot of small files in a similar way: they land on EFS with metadata stored in DynamoDB, and we aggregate them into larger, compressed objects before pushing them to S3. Keeping each combined object over 128KB helps with S3 costs, since it lets the objects transition to lower-cost storage classes.
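
Purely as a sketch of the aggregation step (not our exact code), something like this could bundle a staging directory of small XML files into one compressed object before uploading it. The bucket, key, and directory are placeholders, and recording per-file metadata in DynamoDB would be a separate step:

```python
# Illustrative sketch only: bundle a directory of small XML files into one
# gzipped tar object and upload it, so the resulting S3 object is comfortably
# above the 128KB small-object threshold.
import io
import tarfile
from pathlib import Path

import boto3

s3 = boto3.client("s3")

def bundle_and_upload(source_dir: str, bucket: str, dest_key: str) -> None:
    """Tar + gzip every .xml file under source_dir and upload it as one object."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for path in sorted(Path(source_dir).glob("*.xml")):
            archive.add(path, arcname=path.name)
    buf.seek(0)
    s3.put_object(Bucket=bucket, Key=dest_key, Body=buf.getvalue())
```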
