I'm working on creating a rate limiter that isn't just based on the number of tasks, but rather on the cost associated with each task. I've noticed that all the common rate limiting strategies, such as leaky bucket, token bucket, and sliding window, typically define limits like '100 tasks per second.' However, many tasks aren't equal; for instance, one task might send 10 bytes over the network, while another sends 50 bytes. Therefore, it makes more sense for my use case to gauge limits in terms of total cost or weight of the executed tasks within a given time frame.
Specifically, I'm looking for a rate limiter that:
1. Throttles based on total cost instead of the number of tasks
2. Provides strict sliding window guarantees
3. Is compatible with both standard and asynchronous functions in Python
Has anyone here developed or used something like this? I'd love to hear your thoughts or experiences with similar utilities, as I am all set to implement my own for both learning purposes and practical uses.
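For concreteness, here's a minimal sketch of the kind of limiter I have in mind. All the names (`CostRateLimiter`, `acquire`, `acquire_async`) are my own placeholders, and it assumes each task's cost is known before it runs:

```python
import asyncio
import threading
import time
from collections import deque

class CostRateLimiter:
    """Sliding-window limiter keyed on total cost, not task count.

    Admits a task of a given cost only when the summed cost of tasks
    started in the last `period` seconds leaves room for it.
    """

    def __init__(self, max_cost: float, period: float):
        self.max_cost = max_cost
        self.period = period
        self._events = deque()   # (timestamp, cost) pairs, oldest first
        self._lock = threading.Lock()

    def _try_admit(self, cost: float) -> float:
        """Record the task and return 0 if it fits now; otherwise
        return the seconds to wait before retrying."""
        if cost > self.max_cost:
            raise ValueError("cost exceeds window capacity")
        now = time.monotonic()
        # Drop events that have slid out of the window.
        while self._events and self._events[0][0] <= now - self.period:
            self._events.popleft()
        used = sum(c for _, c in self._events)
        if used + cost <= self.max_cost:
            self._events.append((now, cost))
            return 0.0
        # Find when enough old events expire to admit this cost.
        freed = 0.0
        for ts, c in self._events:
            freed += c
            if used - freed + cost <= self.max_cost:
                return (ts + self.period) - now
        return (self._events[-1][0] + self.period) - now

    def acquire(self, cost: float = 1.0) -> None:
        """Block until the task is admitted (plain functions)."""
        while True:
            with self._lock:
                delay = self._try_admit(cost)
            if delay <= 0:
                return
            time.sleep(delay)

    async def acquire_async(self, cost: float = 1.0) -> None:
        """Same, but yields to the event loop while waiting."""
        while True:
            with self._lock:
                delay = self._try_admit(cost)
            if delay <= 0:
                return
            await asyncio.sleep(delay)
```

The sliding window here is strict in the sense that an event only stops counting exactly `period` seconds after it was admitted, at the price of storing every event in the window rather than bucketed counters.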
3 Answers
When you think about how strict your rate limiter needs to be, consider whether a task's cost can actually be estimated before it runs. Do you want to adjust dynamically once the real cost is known after a task finishes? Also keep in mind that optimizing at too granular a level can create more headaches than it's worth. Each use case is unique, so there's no one-size-fits-all solution when it comes to rate limiting.
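If you do want to correct estimates after the fact, the bookkeeping can be as simple as charging the estimate up front and rewriting the record later. This is just a hedged sketch of that pattern; the `ReconcilingWindow` name and its `reserve`/`reconcile` API are hypothetical:

```python
import time
from collections import deque

class ReconcilingWindow:
    """Reserve-then-reconcile cost tracking for a sliding window:
    charge an estimated cost when a task starts, then overwrite the
    estimate once the task's actual cost is known."""

    def __init__(self, period: float):
        self.period = period
        self._events = deque()  # mutable [timestamp, cost] pairs

    def reserve(self, estimated_cost: float):
        """Record an estimate; returns a handle for later reconciling."""
        event = [time.monotonic(), estimated_cost]
        self._events.append(event)
        return event

    def reconcile(self, event, actual_cost: float) -> None:
        """Replace the estimate with the measured cost, in place."""
        event[1] = actual_cost

    def used(self) -> float:
        """Total cost currently counted against the window."""
        cutoff = time.monotonic() - self.period
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()
        return sum(c for _, c in self._events)
```

Whether reconciling downward should immediately unblock waiting tasks is exactly the kind of granularity decision I'd weigh against the extra complexity.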
The type of rate limiting really depends on what you're looking to manage. For example, in one project, I had to limit how often I queried multiple databases because too many simultaneous requests were hindering performance. Instead of querying the databases directly, I set up a caching system that updated values at scheduled intervals. My strategy involved gathering all requests in a batch and randomizing their execution to prevent overwhelming any single database. This method worked well for spreading loads over time without extending wait times too much on the monitoring end.
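A rough sketch of that batch-and-randomize idea, with illustrative names rather than code from my actual project: shuffle the batch, then space the calls evenly across the interval so no single backend sees a burst.

```python
import random
import time

def run_batch_spread(tasks, interval: float, rng=random):
    """Execute a batch of zero-argument callables in random order,
    spaced evenly over `interval` seconds to spread the load."""
    order = list(tasks)
    rng.shuffle(order)                     # randomize execution order
    spacing = interval / max(len(order), 1)
    results = []
    for task in order:
        results.append(task())
        time.sleep(spacing)                # spread calls over the interval
    return results
```

The randomization matters when several of these batches run concurrently: it keeps them from all hitting the same database first.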
Have you checked out throttled-py? It might help you build what you're envisioning since it offers a degree of flexibility for cost-based limits.

That's an interesting approach! I'm aiming for a more generic solution that could apply to various contexts, like API calls or message transmissions, where the cost for each task varies. Traditional limiters usually focus on the number of tasks without accounting for resource consumption, so I'm looking to implement limits based on how much a function actually affects the system.
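To sketch what I mean by "generic": a hypothetical decorator that computes a call's cost from its arguments and hands it to whatever `acquire(cost)` callable the chosen limiter exposes (all names here are placeholders, not an existing API):

```python
import asyncio
import functools

def cost_limited(acquire, cost_fn):
    """Wrap a function so each call is charged a cost derived from
    its arguments. Supports both plain and async functions."""
    def decorate(fn):
        if asyncio.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                acquire(cost_fn(*args, **kwargs))  # charge before running
                return await fn(*args, **kwargs)
            return async_wrapper

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            acquire(cost_fn(*args, **kwargs))      # charge before running
            return fn(*args, **kwargs)
        return wrapper
    return decorate

# Example: charge by payload size, recording charges in a list
# (a real limiter's acquire method would go here instead).
spent = []

@cost_limited(spent.append, cost_fn=len)
def send(payload):
    return payload.upper()
```

This keeps the "how much does this call cost" question next to the function itself, while the limiter stays oblivious to what the cost represents: bytes, rows, or API credits.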