Lately, our development team has been using Azure AI Foundry to deploy machine learning models, specifically GPT models. Unfortunately, some of these deployments have unintentionally run large jobs, resulting in a hefty bill. I'm looking for advice on how we can implement governance around AI Foundry, particularly for managing and measuring costs effectively. Any tips or solutions?
4 Answers
Setting budgets and alerts is crucial. You can use Azure Cost Management to create a budget with alert thresholds, so you're notified before you overspend. This article covers how to set it up: https://learn.microsoft.com/en-us/azure/cost-management-billing/costs/cost-mgt-alerts-monitor-usage-spending#budget-alerts. It's a straightforward way to keep track of your spend.
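If you want to script this rather than click through the portal, here is a minimal sketch of creating a monthly budget with an 80% alert via the Consumption Budgets REST API. It assumes the azure-identity and requests packages, permission to write budgets on the subscription, and placeholder values for the subscription ID, budget name, amount, contact email, and API version (check the current Budgets API version before using it).

```python
# Sketch: create a monthly cost budget with an alert at 80% of the amount.
# Subscription ID, budget name, amount, email, and api-version are placeholders.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<your-subscription-id>"
SCOPE = f"/subscriptions/{SUBSCRIPTION_ID}"
BUDGET_NAME = "ai-foundry-monthly-budget"

# Acquire an ARM token for the management plane.
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

budget = {
    "properties": {
        "category": "Cost",
        "amount": 500,                       # monthly cap in the billing currency (placeholder)
        "timeGrain": "Monthly",
        "timePeriod": {"startDate": "2024-06-01T00:00:00Z"},  # must be the 1st of a month
        "notifications": {
            "Actual_GreaterThan_80_Percent": {
                "enabled": True,
                "operator": "GreaterThan",
                "threshold": 80,             # notify at 80% of the budget
                "contactEmails": ["finops@example.com"],
            }
        },
    }
}

resp = requests.put(
    f"https://management.azure.com{SCOPE}/providers/Microsoft.Consumption/budgets/{BUDGET_NAME}",
    params={"api-version": "2023-05-01"},    # verify against the current API version
    headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    json=budget,
)
resp.raise_for_status()
print("Budget created:", resp.json()["name"])
```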
It's also important to know what type of models your developers are deploying. If they're using something from the model catalog, figure out whether it's a serverless API deployment or managed compute. Managed compute can rack up costs quickly because it runs on dedicated GPU VMs that bill for every hour they're provisioned, whether or not they're busy. Serverless deployments instead bill per token consumed, so costs scale with usage.
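For the serverless case, it helps to put rough numbers on what a job will cost before you run it. Below is a simple back-of-the-envelope estimate in Python; the per-1K-token prices are placeholders, so substitute the rates from your price sheet for the specific model and region.

```python
# Rough cost estimate for a pay-per-token (serverless) deployment.
# The prices below are assumed placeholder rates, not real list prices.
INPUT_PRICE_PER_1K = 0.005    # $ per 1K prompt tokens (placeholder)
OUTPUT_PRICE_PER_1K = 0.015   # $ per 1K completion tokens (placeholder)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge for a job given its prompt and completion token counts."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# Example: a batch job sending 2M prompt tokens and generating 500K completion tokens.
print(f"Estimated cost: ${estimate_cost(2_000_000, 500_000):,.2f}")  # ~$17.50 at these rates
```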
There are also ways to limit token usage per deployment. Azure OpenAI deployments have a tokens-per-minute (TPM) allocation that you choose when you create or update the deployment, so lowering it caps how quickly a runaway job can burn through tokens. Implementing those limits can help control costs significantly, so make sure to explore that option.
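As one possible way to do this in code, here is a sketch of setting a deliberately low capacity on a deployment using the azure-mgmt-cognitiveservices package. The resource group, account, model name/version, and capacity value are placeholders, and the exact model class names may differ between SDK versions, so verify them against the version you have installed.

```python
# Sketch: cap a deployment's tokens-per-minute quota via the SKU capacity.
# All resource names and the capacity value are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import (
    Deployment, DeploymentProperties, DeploymentModel, Sku,
)

SUBSCRIPTION_ID = "<your-subscription-id>"
client = CognitiveServicesManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

deployment = Deployment(
    properties=DeploymentProperties(
        model=DeploymentModel(format="OpenAI", name="gpt-4o", version="2024-08-06"),
    ),
    # Capacity is expressed in units of 1K TPM, so 10 => roughly 10K tokens per minute.
    sku=Sku(name="Standard", capacity=10),
)

poller = client.deployments.begin_create_or_update(
    resource_group_name="rg-ai-foundry",
    account_name="my-aoai-account",
    deployment_name="gpt-4o-capped",
    deployment=deployment,
)
print("Deployment updated:", poller.result().name)
```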
I’m currently trying to deploy AI models too but running into some challenges. Definitely taking the suggestions on budgets and alerts into account—great insights!
Also, consider wiring the budget alert to an action group that triggers an Automation runbook to lock down the subscription until you can address the issue. It's a good safety net.
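A rough sketch of what that runbook body could look like is below: it applies a ReadOnly management lock at subscription scope so no new resources can be created or scaled up until someone investigates (note it won't stop charges from compute that is already running). It assumes the azure-mgmt-resource package, a runbook identity with rights to create locks, and placeholder names for the subscription, lock, and notes.

```python
# Sketch of a runbook that locks the subscription after a budget alert fires.
# Subscription ID, lock name, and notes are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ManagementLockClient
from azure.mgmt.resource.locks.models import ManagementLockObject

SUBSCRIPTION_ID = "<your-subscription-id>"
client = ManagementLockClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# ReadOnly blocks writes (new deployments, scale-ups) across the subscription.
lock = client.management_locks.create_or_update_at_subscription_level(
    lock_name="budget-breach-lockdown",
    parameters=ManagementLockObject(
        level="ReadOnly",
        notes="Applied automatically after the AI Foundry budget alert fired.",
    ),
)
print("Lock applied:", lock.name)
```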