I'm working with an API that has a limit of 4 requests per second and allows for 8 concurrent requests. This isn't enough for my needs, so I'm focusing on caching the results. Here's my current approach:
1. I check a cache table keyed by the exact query (case-insensitive and whitespace-trimmed).
2. If there's a cache hit, I return the results.
3. If not, I make the API call, store the results in the cache, and then return them.
However, I acknowledge that cached results could become stale, so I refresh the data if the cache hit is older than a specific number of days. I'm curious if there are any improvements I can make to my caching strategy or if I'm on the right track. Should I base caching on unique queries? What do you think?
5 Answers
You should definitely consider whether the API has any legal guidelines for how long you can cache data since that can affect your strategy. But if they recommend caching, you're likely in good shape!
What you're doing is a solid approach known as hybrid caching. Some libraries can help automate the process for you, but if your setup is straightforward, you might not need one.
It's really just a database lookup; I think you can manage it without a library pretty easily.
What library are you using? I built my own too!
Just keep in mind that if you exceed the API's request limits, especially if you have many unique requests at once, you might hit a bottleneck. Consider implementing a request queue to manage outbound rates better.
When caching, remember to create a cache key based on the inputs to your API call, whether they’re in the URL, GET parameters, or POST data. Also, set an expiration time for cached results to keep your data fresh.
It's a bit unclear what this API does, but depending on its complexity, your current architecture might still lead to rate limits. Make sure to validate your query parameters and possibly implement an async cleanup for old cache entries if you're using a database. Consider using an LRU eviction policy if you're caching in memory.

That's a good point, but the API explicitly suggests caching their results.