Hey folks! I'm working with two FastAPI apps that call gRPC machine learning services for tasks like layout analysis and table detection, and I need to scale those services efficiently. My main question is whether to use client-side load balancing or server-side load balancing with NGINX. I'm especially concerned about how NGINX might affect performance for GPU-based ML inference over gRPC. My main worries: will I lose the benefits of HTTP/2 multiplexing, will the extra hop add noticeable latency (I expect it to be minor compared to the 2-5 second processing times), and how can I give priority to clients with time-sensitive requests? NGINX is operationally simpler for me, but I want to make sure I'm not giving up performance. Has anyone here run gRPC behind NGINX? Is the complexity of client-side load balancing worth it in this scenario?
4 Answers
That brings up another valid point: how do you manage prioritization among clients if you're handling load balancing on the client side? It’s definitely a tricky situation.
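One way to think about this, wherever the balancing ends up, is to put a small priority gate in front of the gRPC call inside each FastAPI app. The sketch below is only an illustration: `PriorityGate`, `max_in_flight`, and the request names are hypothetical, the `asyncio.sleep` stands in for the real gRPC call, and cancellation of waiting callers is not handled. Note that it only orders requests within one process, so it does not coordinate priority across separate clients.

```python
import asyncio
import heapq
import itertools


class PriorityGate:
    """Allow at most `max_in_flight` concurrent calls.

    When all slots are busy, waiting callers are released in order of
    their priority number (lower number = more urgent), FIFO within a tier.
    """

    def __init__(self, max_in_flight: int):
        self._free = max_in_flight
        self._waiters = []              # heap of (priority, seq, future)
        self._seq = itertools.count()   # tie-breaker so futures are never compared

    async def acquire(self, priority: int) -> None:
        if self._free > 0:
            self._free -= 1
            return
        fut = asyncio.get_running_loop().create_future()
        heapq.heappush(self._waiters, (priority, next(self._seq), fut))
        await fut  # resumed by release(), which hands the slot over directly

    def release(self) -> None:
        if self._waiters:
            _, _, fut = heapq.heappop(self._waiters)
            fut.set_result(None)
        else:
            self._free += 1


async def demo():
    gate = PriorityGate(max_in_flight=1)
    order = []

    async def call(name: str, priority: int) -> None:
        await gate.acquire(priority)
        try:
            await asyncio.sleep(0.01)   # stand-in for the real gRPC inference call
            order.append(name)
        finally:
            gate.release()

    await asyncio.gather(
        call("batch-1", 5),   # arrives first and takes the only slot
        call("batch-2", 5),   # queues behind it
        call("urgent", 0),    # queues last but is released before batch-2
    )
    return order
```

Running `asyncio.run(demo())` shows the urgent request overtaking the queued batch request while the one already in flight is left to finish.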
It really comes down to whether you trust your clients to manage load balancing effectively. Client-side balancing can bring its own set of risks if clients aren't reliable.
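For a sense of what each client would have to carry, here is a minimal sketch of the channel options that ask a gRPC Python client to round-robin across the backends it resolves. It assumes the `grpcio` package and a DNS name that returns one address per backend; the helper name and the hostname in the usage comment are made up for illustration.

```python
import json


def round_robin_channel_options():
    """Build gRPC channel options that select the round_robin LB policy."""
    service_config = {"loadBalancingConfig": [{"round_robin": {}}]}
    return [("grpc.service_config", json.dumps(service_config))]


# Usage sketch (requires grpcio; the target below is hypothetical):
#
#   import grpc
#   channel = grpc.aio.insecure_channel(
#       "dns:///layout-svc.internal:50051",
#       options=round_robin_channel_options(),
#   )
```

Every client has to ship this configuration and resolve the backend list correctly, which is the trust question raised above.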
If performance is your priority, running benchmarks is essential rather than just seeking opinions online. While 'GPU inference' can mean different things, typically most of the processing is on the GPU, not the load balancer, so if you have multiple GPUs and a decent load balancer, the latter is unlikely to become a bottleneck.
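A benchmark along those lines does not need much. The sketch below times any zero-argument callable and reports a few latency figures in milliseconds; `measure_latency` is a hypothetical helper, and the idea is to pass it one callable that hits a backend directly and one that goes through NGINX, then compare the numbers.

```python
import statistics
import time


def measure_latency(call, n=50):
    """Invoke `call` n times and return latency stats in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "p50": statistics.median(samples),
        # nearest-rank style index into the sorted samples
        "p95": samples[int(0.95 * (len(samples) - 1))],
        "max": samples[-1],
    }


# Usage sketch: replace the lambdas with real calls, e.g. a stub method
# invoked on a direct channel versus one invoked on a channel to NGINX.
#
#   direct = measure_latency(lambda: direct_stub_call())
#   proxied = measure_latency(lambda: proxied_stub_call())
```

With 2-5 second inference times, run it under realistic concurrency as well, since queueing on the GPU workers is likely to matter more than the proxy itself.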
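I’m curious about the concerns over HTTP/2 multiplexing. Adding an extra hop shouldn’t diminish those benefits much: NGINX terminates the client’s HTTP/2 connection and proxies each call to a backend, so calls are still multiplexed on the client side and are balanced per request. As for latency, even a pessimistic estimate of 10-15 ms for a hop in the same region seems reasonable next to 2-5 seconds of processing, and in practice it is often lower.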
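To put the 10-15 ms figure next to the 2-5 second processing times from the question, here is the arithmetic as a quick sketch (the function name is just for illustration):

```python
def overhead_fraction(extra_ms: float, inference_s: float) -> float:
    """Extra hop latency as a fraction of the inference time."""
    return extra_ms / (inference_s * 1000.0)


# Largest relative overhead: slowest hop estimate, fastest inference.
worst = overhead_fraction(15, 2)
# Smallest relative overhead: fastest hop estimate, slowest inference.
best = overhead_fraction(10, 5)

print(f"worst case: {worst:.2%}, best case: {best:.2%}")
```

That works out to well under 1% of the request time in either case.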
