What’s the Best Way to Periodically Fetch Data from S3 in an ECS Java Service?

Asked By CuriousCoder99 On

I'm running a Java service on ECS (using Fargate) and need some advice on the best approach to periodically fetch a list of strings from an S3 object. This file contains around 100k strings and is updated a few times every hour. I'd like to pull this file at regular intervals, load it into memory within my ECS container, and utilize it for a read-only lookup to check if certain strings exist in the list.

I've already thought about a couple of methods:
- Setting up a scheduled task that downloads the file from S3 and reloads it into a synchronized Set (e.g. Collections.synchronizedSet); a rough sketch of this approach follows the question.
- Using a Caffeine or Guava cache with auto-refresh (e.g. refreshAfterWrite) keyed by the objectId.

I have a few questions:
- What are some better alternatives for reloading the data?
- How can I structure the file for quicker and more reliable loading?
I'm really interested in hearing from anyone who's been in a similar situation or has tips on handling this efficiently.
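Here is a minimal sketch of the scheduled-reload idea, assuming the AWS SDK for Java v2 and hypothetical bucket/key names and refresh interval. Instead of mutating a synchronized set, it builds a fresh immutable set and swaps it in atomically, so lookups never see a half-loaded list:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

public class S3LookupSet {

    // Hypothetical bucket, key, and interval - replace with your own values.
    private static final String BUCKET = "my-bucket";
    private static final String KEY = "strings.txt";

    private final S3Client s3 = S3Client.create();
    private final AtomicReference<Set<String>> snapshot = new AtomicReference<>(Set.of());

    public void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Load immediately, then reload every 10 minutes.
        scheduler.scheduleAtFixedRate(this::reload, 0, 10, TimeUnit.MINUTES);
    }

    private void reload() {
        GetObjectRequest request = GetObjectRequest.builder().bucket(BUCKET).key(KEY).build();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(s3.getObject(request), StandardCharsets.UTF_8))) {
            // One string per line; build a new set and publish it in a single atomic swap.
            snapshot.set(reader.lines().collect(Collectors.toUnmodifiableSet()));
        } catch (Exception e) {
            // On failure, keep serving the previous snapshot.
            e.printStackTrace();
        }
    }

    public boolean contains(String value) {
        return snapshot.get().contains(value);
    }
}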

3 Answers

Answered By TechWizard42 On

One way to tackle this is to load the file into memory when your Java application starts. Then, to pick up changes, configure S3 event notifications on the bucket and have your application listen for them through an SQS queue (optionally fanned out via SNS). Your Java app gets alerted when the file changes and can fetch the updated object from S3 right away.
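To illustrate the event-driven side, here is a rough sketch of long-polling an SQS queue that receives the S3 event notifications. The queue URL is hypothetical, the reload callback is whatever you already use to fetch the file, and it assumes the bucket has been configured to publish ObjectCreated events to that queue:

import java.util.List;

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.DeleteMessageRequest;
import software.amazon.awssdk.services.sqs.model.Message;
import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;

public class S3ChangeListener {

    // Hypothetical queue URL - the queue subscribed to the bucket's event notifications.
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/s3-file-updates";

    private final SqsClient sqs = SqsClient.create();

    public void listen(Runnable reloadFromS3) {
        while (true) {
            // Long-poll for up to 20 seconds to keep the number of empty receives low.
            ReceiveMessageRequest receive = ReceiveMessageRequest.builder()
                    .queueUrl(QUEUE_URL)
                    .waitTimeSeconds(20)
                    .maxNumberOfMessages(10)
                    .build();
            List<Message> messages = sqs.receiveMessage(receive).messages();
            if (!messages.isEmpty()) {
                // At least one event arrived, so the file changed: refetch it once.
                reloadFromS3.run();
            }
            for (Message message : messages) {
                sqs.deleteMessage(DeleteMessageRequest.builder()
                        .queueUrl(QUEUE_URL)
                        .receiptHandle(message.receiptHandle())
                        .build());
            }
        }
    }
}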

Answered By CacheKing99 On

Have you thought about using Redis for this? You could update your Redis cache with the contents of the file, then query it like any regular Map from your Java app. This scales well and is efficient, though keep in mind that ElastiCache for Redis comes with some cost.
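A rough sketch of that idea with Jedis (the endpoint and key names are hypothetical): load the strings into a Redis set under a staging key, then RENAME it over the live key so lookups never see a half-loaded set, and check membership with SISMEMBER:

import java.util.List;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class RedisLookupSet {

    // Hypothetical key names and ElastiCache endpoint.
    private static final String LIVE_KEY = "lookup:strings";
    private static final String STAGING_KEY = "lookup:strings:staging";

    private final Jedis jedis = new Jedis("my-redis-endpoint", 6379);

    public void reload(List<String> strings) {
        jedis.del(STAGING_KEY);
        // Pipeline the ~100k SADDs so the load takes a handful of round trips, not 100k.
        Pipeline pipeline = jedis.pipelined();
        for (String s : strings) {
            pipeline.sadd(STAGING_KEY, s);
        }
        pipeline.sync();
        // Atomically replace the old set with the freshly loaded one.
        jedis.rename(STAGING_KEY, LIVE_KEY);
    }

    public boolean contains(String value) {
        return jedis.sismember(LIVE_KEY, value);
    }
}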

OP_CuriousCoder99 -

Yeah, I thought about this too! It just seems to lack the auto-refresh or reload functionality that I want.

Answered By DataDynamo77 On

You might also want to consider Athena if your file is well-formatted. It offers on-demand querying, which can be handy depending on what you're doing with the data. Since you have around 100k entries, and if you control the data source, consider breaking the file into smaller chunks; that makes change detection easier.
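For reference, a rough sketch of an on-demand lookup through the Athena SDK. It assumes a Glue/Athena table (here called my_strings with a single value column, purely hypothetical) has already been defined over the S3 file, along with a database name and an S3 location for query results:

import software.amazon.awssdk.services.athena.AthenaClient;
import software.amazon.awssdk.services.athena.model.GetQueryExecutionRequest;
import software.amazon.awssdk.services.athena.model.GetQueryResultsRequest;
import software.amazon.awssdk.services.athena.model.QueryExecutionContext;
import software.amazon.awssdk.services.athena.model.QueryExecutionState;
import software.amazon.awssdk.services.athena.model.ResultConfiguration;
import software.amazon.awssdk.services.athena.model.StartQueryExecutionRequest;

public class AthenaLookup {

    private final AthenaClient athena = AthenaClient.create();

    // Hypothetical database, table, and results location.
    public boolean contains(String value) throws InterruptedException {
        String query = "SELECT 1 FROM my_strings WHERE value = '"
                + value.replace("'", "''") + "' LIMIT 1";

        String executionId = athena.startQueryExecution(StartQueryExecutionRequest.builder()
                .queryString(query)
                .queryExecutionContext(QueryExecutionContext.builder().database("my_database").build())
                .resultConfiguration(ResultConfiguration.builder()
                        .outputLocation("s3://my-bucket/athena-results/")
                        .build())
                .build()).queryExecutionId();

        // Poll until the query finishes; Athena queries typically take a few seconds.
        while (true) {
            QueryExecutionState state = athena.getQueryExecution(GetQueryExecutionRequest.builder()
                    .queryExecutionId(executionId).build())
                    .queryExecution().status().state();
            if (state == QueryExecutionState.SUCCEEDED) {
                break;
            }
            if (state == QueryExecutionState.FAILED || state == QueryExecutionState.CANCELLED) {
                throw new IllegalStateException("Athena query did not succeed: " + state);
            }
            Thread.sleep(500);
        }

        // The first result row holds the column headers, so more than one row means a match.
        return athena.getQueryResults(GetQueryResultsRequest.builder()
                .queryExecutionId(executionId).build())
                .resultSet().rows().size() > 1;
    }
}

Note the latency trade-off: each lookup is a separate Athena query, so this fits occasional checks far better than the high-volume in-memory lookups described in the question.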
