Advice Needed for Setting Up Production-Ready Azure Foundry Deployments

Asked By TechieExplorer47 On

I'm looking for insights from anyone experienced with Azure OpenAI and Microsoft Foundry, since our development work is moving in that direction. We currently have several development deployments of Azure OpenAI resources, along with supporting infrastructure such as private endpoints and an API Management (API-M) gateway. The setup works, but it is limited to Dev licensing and spread across multiple resource groups. Users call the gateway API with subscription keys, and API-M reaches Azure OpenAI through private endpoints.

Now we're aiming for a robust production setup and would love to hear from anyone who has tackled similar challenges. We have audit requirements, which makes the API Management gateway essential for recording prompts and responses. Our users also want more flexibility than OpenAI models alone, hence our interest in Foundry, along with features like Azure Blob Storage and Azure AI Search.

With ExpressRoute operational, we plan to switch API-M to private endpoints soon. What's the best approach? Should we centralize API-M with several Foundry instances behind it, or stick with individual deployments? How do others handle authentication and cost chargeback when resources are centralized? Given the hefty price tag of API-M, I'm unsure whether to centralize or run multiple individual instances, which would increase costs significantly. Any shared experiences or advice would be greatly appreciated!

1 Answer

Answered By CloudWizard99 On

Implementing Foundry can be quite complex because an enterprise-level platform needs a lot of supporting resources. Start from your organization's size and the regulatory requirements you must meet. With Foundry v3, the current best practice is to assign one Foundry instance to each business unit, so each unit deploys models within its own guardrails. In some cases you can reduce the need for API-M by controlling access to your endpoints through Foundry with role-based access control (RBAC). Just be aware that sharing endpoints across multiple applications may require a dedicated team to manage token quotas and operations effectively.
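To make the token-quota point concrete, here is a minimal sketch of the bookkeeping a shared endpoint needs: a fixed-window tokens-per-minute budget tracked per application. The names `TokenBudget`, `try_consume`, and the application IDs are hypothetical and are not part of any Foundry or API Management API; a real deployment would more likely rely on the platform's own quota features.

```python
from collections import defaultdict


class TokenBudget:
    """Fixed-window tokens-per-minute budget, tracked per application.

    Hypothetical helper for illustration only; not an Azure API.
    """

    def __init__(self, tokens_per_minute: int) -> None:
        self.tokens_per_minute = tokens_per_minute
        # Maps app_id -> (minute window index, tokens used in that window).
        self._usage = defaultdict(lambda: (-1, 0))

    def try_consume(self, app_id: str, tokens: int, now_seconds: float) -> bool:
        """Reserve tokens for app_id; return False if the budget would be exceeded."""
        window = int(now_seconds // 60)
        used_window, used = self._usage[app_id]
        if used_window != window:
            used = 0  # a new minute started, so the count resets
        if used + tokens > self.tokens_per_minute:
            return False
        self._usage[app_id] = (window, used + tokens)
        return True
```

Each application gets its own counter, so one noisy caller cannot exhaust the budget of another. Passing the clock value in explicitly keeps the sketch easy to test.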

APIHelper55 -

The main requirement is to log every prompt along with its response, and right now API-M is the best option for that. Just a heads up, I reached out to you through chat!
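Once the gateway logs each call, the same records can feed the chargeback question raised above. Below is a rough sketch that assumes every logged call carries a caller identifier and token counts. The `GatewayLogRecord` fields, the model name, and the prices are all made up for illustration and do not reflect the actual API-M log schema or Azure pricing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayLogRecord:
    """One logged gateway call (hypothetical fields, not the API-M log schema)."""
    subscription_id: str    # identifies the calling team or application
    model: str
    prompt_tokens: int
    completion_tokens: int


def chargeback(records, prices_per_1k):
    """Sum the cost per subscription.

    prices_per_1k maps a model name to (input price, output price)
    per 1,000 tokens. The prices are placeholders supplied by the caller.
    """
    totals = defaultdict(float)
    for record in records:
        input_price, output_price = prices_per_1k[record.model]
        totals[record.subscription_id] += (
            record.prompt_tokens / 1000 * input_price
            + record.completion_tokens / 1000 * output_price
        )
    return dict(totals)
```

Grouping by subscription key works with a centralized gateway because every team already authenticates with its own key, so no extra tagging is needed to split the bill.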
