I'm working on a web app that uses AI, and I've been struggling with how to manage user access so that no one can exceed their token limit or budget. I have access to LLM instances through providers like Azure and LiteLLM, but I don't want to hand the same AI API key to all my users. How can I allocate individual keys or set specific budgets for each authenticated user? Do any frameworks like Vercel AI SDK or Pydantic AI offer built-in solutions for this? I'd also love recommendations for communities where I can ask for more advice on implementing AI securely in production apps.
4 Answers
I'd vote for implementing this yourself! A custom solution can efficiently monitor user requests and enforce quotas. You would manage a system that tracks how many tokens each user consumes and restricts usage once their budget is hit.
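As a rough illustration of that idea, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: `TokenBudget`, `BudgetExceeded`, and `guarded_call` are hypothetical names, and `call_llm` stands in for whatever function wraps your provider and returns the response text plus the tokens it consumed. A real app would persist the counters rather than keep them in a dict.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


class BudgetExceeded(Exception):
    """Raised when a user has no token budget left."""


@dataclass
class TokenBudget:
    # Same limit for every user here; per-user limits are an easy extension.
    limit: int
    used: Dict[str, int] = field(default_factory=dict)

    def remaining(self, user_id: str) -> int:
        return max(self.limit - self.used.get(user_id, 0), 0)

    def record(self, user_id: str, tokens: int) -> None:
        self.used[user_id] = self.used.get(user_id, 0) + tokens


def guarded_call(
    budget: TokenBudget,
    user_id: str,
    prompt: str,
    call_llm: Callable[[str], Tuple[str, int]],  # hypothetical provider wrapper
) -> str:
    # Refuse before spending anything if the budget is already used up.
    if budget.remaining(user_id) <= 0:
        raise BudgetExceeded(f"user {user_id} has no tokens left")
    text, tokens = call_llm(prompt)
    # Record what the provider reports it actually consumed.
    budget.record(user_id, tokens)
    return text
```

Note that the check happens before the call and the accounting after it, so a single request can overshoot the limit by its own size. If that matters, you could also estimate the request's cost up front and reject it when the estimate does not fit.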
You might not need AI-specific methods here; general access limitation techniques can work just as well. A solid approach is to create a middleware layer that tracks each user's usage of the LLM services and throttles them once they exceed their limits.
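To make the middleware idea concrete, here is one possible sketch using a sliding time window, which is a generic throttling technique and not something specific to any framework. The names `SlidingWindowThrottle` and `throttled` are made up for this example, and `call_llm` is assumed to be your own function that returns the response text and the token count.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowThrottle:
    """Allows each user at most `max_tokens` per `window_seconds`."""

    def __init__(
        self,
        max_tokens: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._clock = clock
        # user_id -> queue of (timestamp, tokens) for recent requests
        self._events: Dict[str, Deque[Tuple[float, int]]] = defaultdict(deque)

    def _used(self, user_id: str) -> int:
        now = self._clock()
        events = self._events[user_id]
        # Drop entries that have aged out of the window.
        while events and now - events[0][0] >= self.window_seconds:
            events.popleft()
        return sum(tokens for _, tokens in events)

    def allow(self, user_id: str) -> bool:
        return self._used(user_id) < self.max_tokens

    def record(self, user_id: str, tokens: int) -> None:
        self._events[user_id].append((self._clock(), tokens))


def throttled(
    call_llm: Callable[[str], Tuple[str, int]],  # hypothetical provider wrapper
    throttle: SlidingWindowThrottle,
) -> Callable[[str, str], str]:
    """Wraps the LLM call so every request passes through the throttle."""

    def wrapper(user_id: str, prompt: str) -> str:
        if not throttle.allow(user_id):
            raise PermissionError(f"user {user_id} is over the limit, retry later")
        text, tokens = call_llm(prompt)
        throttle.record(user_id, tokens)
        return text

    return wrapper
```

In a web framework you would typically hook the same check into the request pipeline and return an HTTP 429 instead of raising `PermissionError`. The injectable `clock` is only there to make the class easy to test.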
Consider using a database, like PostgreSQL. You can log each user's token usage and check their budget before processing requests. This way, you only need one API key for everyone while still monitoring their individual usage.
Building a solution with a database like SQLite sounds good too! You can link each user's ID to their usage records. Sure, some might try to game the system with multiple accounts, but it’s a start.
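For what it's worth, a bare-bones version of the database approach might look like the sketch below, using Python's built-in `sqlite3` module. The table names, column names, and helper functions are all invented for this example; the same two-table pattern should carry over to PostgreSQL with minor SQL changes.

```python
import sqlite3

# Hypothetical schema: one row per user budget, one row per LLM request.
SCHEMA = """
CREATE TABLE IF NOT EXISTS budgets (
    user_id     TEXT PRIMARY KEY,
    token_limit INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS usage_log (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id    TEXT NOT NULL,
    tokens     INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def set_budget(conn: sqlite3.Connection, user_id: str, token_limit: int) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO budgets (user_id, token_limit) VALUES (?, ?)",
            (user_id, token_limit),
        )


def tokens_used(conn: sqlite3.Connection, user_id: str) -> int:
    row = conn.execute(
        "SELECT COALESCE(SUM(tokens), 0) FROM usage_log WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row[0]


def has_budget(conn: sqlite3.Connection, user_id: str) -> bool:
    """Call this before forwarding a request to the LLM."""
    row = conn.execute(
        "SELECT token_limit FROM budgets WHERE user_id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return False  # unknown users get no budget
    return tokens_used(conn, user_id) < row[0]


def log_usage(conn: sqlite3.Connection, user_id: str, tokens: int) -> None:
    """Call this after the LLM responds, with the tokens it reported."""
    with conn:
        conn.execute(
            "INSERT INTO usage_log (user_id, tokens) VALUES (?, ?)",
            (user_id, tokens),
        )
```

Keeping a log row per request (rather than a single running total) costs a bit more storage but lets you report usage per day or per month later. Under heavy concurrency the separate check and insert can race, so a production version would probably want to do both inside one transaction.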