I'm in charge of the infrastructure design for a globally scaled SaaS solution that relies heavily on virtual machines in a standard n-tier setup (web/app/sql). We generate OS and application images using CI/CD pipelines and deploy them with Terraform. Our workload follows a consistent pattern: busy during regional business hours and quiet during off-peak hours. Currently, we're operating around 200 VMs (Standard D16as v5) in six global regions.
I thought Azure VM Scale Sets would be a perfect fit since they can dynamically scale, allowing for, say, 10 VMs at 2 AM and ramping up to 200 VMs at 8 AM. However, I've run into significant issues:
1. VM Scale Sets require available capacity in your subscription, which isn't usually a problem.
2. There must also be available capacity in the region, which is a major issue: if there isn't, the scale-out fails without notification.
3. The solution seems to be purchasing Azure Capacity Reservations, but they come at full price and don't provide saving options, leading to high costs.
In busy regions like East US 2, using VM Scale Sets without these reservations feels like a disaster waiting to happen. The promise of only paying for what you need, when you need it, doesn't seem to hold up because of these capacity issues. Am I missing something? Are VM Scale Sets really not made for this kind of workload?
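To put a rough number on what is at stake, here is a minimal sketch that compares VM-hours per day for the elastic pattern against keeping the peak count running (or reserved) around the clock. The 10 and 200 VM counts come from the example above; the 10 peak hours per day is purely an assumption for illustration, and no pricing is modeled.

```python
def daily_vm_hours(peak_vms: int, off_peak_vms: int, peak_hours: int) -> int:
    """VM-hours per day for a simple two-level (peak / off-peak) load profile."""
    return peak_vms * peak_hours + off_peak_vms * (24 - peak_hours)


# ASSUMPTION: 10 peak hours per day; adjust to the real business-hours window.
PEAK_HOURS = 10

elastic = daily_vm_hours(peak_vms=200, off_peak_vms=10, peak_hours=PEAK_HOURS)
always_on = daily_vm_hours(peak_vms=200, off_peak_vms=200, peak_hours=PEAK_HOURS)

# Fraction of VM-hours that elastic scaling would avoid paying for.
savings_fraction = 1 - elastic / always_on

print(f"elastic: {elastic} VM-hours/day, always-on: {always_on} VM-hours/day")
print(f"VM-hours avoided by scaling: {savings_fraction:.1%}")
```

Under that assumed profile, an around-the-clock reservation at full price gives up more than half of the VM-hours that scaling was supposed to save.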
4 Answers
Have you tried using the flexible size model? It allows for mixing different types of VMs, which might help alleviate some of your capacity problems. You could also look into Azure Batch or Kubernetes; they might require more initial setup, but they keep trying to scale until they reach the desired count.
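As a rough illustration of the "mix sizes and keep retrying" idea, here is a sketch of the control loop only. It does not call any Azure API: `try_allocate` is a hypothetical callable you would implement yourself (for example, by requesting instances of one size and returning how many were actually provisioned), and the extra size names in the usage example are just plausible alternatives, not a recommendation.

```python
import time
from typing import Callable, Dict, Sequence


def scale_out_with_fallback(
    target: int,
    sizes: Sequence[str],
    try_allocate: Callable[[str, int], int],  # hypothetical: returns count provisioned
    max_rounds: int = 5,
    base_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Try each VM size in priority order, retrying with backoff until target is met."""
    allocated: Dict[str, int] = {}
    remaining = target
    for round_no in range(max_rounds):
        for size in sizes:
            if remaining == 0:
                break
            # Never count more than we asked for, even if the callable over-reports.
            got = min(try_allocate(size, remaining), remaining)
            if got > 0:
                allocated[size] = allocated.get(size, 0) + got
                remaining -= got
        if remaining == 0:
            return allocated
        if round_no < max_rounds - 1:
            # Exponential backoff before trying the whole size list again.
            sleep(base_delay_s * 2 ** round_no)
    raise RuntimeError(f"still short by {remaining} VMs after {max_rounds} rounds")


# Usage with a fake allocator that simulates limited regional capacity per size.
fake_capacity = {"Standard_D16as_v5": 120, "Standard_D16s_v5": 50}


def fake_allocate(size: str, count: int) -> int:
    got = min(fake_capacity.get(size, 0), count)
    fake_capacity[size] = fake_capacity.get(size, 0) - got
    return got


result = scale_out_with_fallback(
    150, ["Standard_D16as_v5", "Standard_D16s_v5"], fake_allocate, sleep=lambda s: None
)
print(result)
```

The point of the raise at the end is that a shortfall becomes a loud failure you can alert on, rather than a scale-out that quietly stops short.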
VM Scale Sets aren’t broken, but they certainly operate differently than the marketing suggests. The actual constraint is regional capacity. If you’re continually trying to scale out to 200 D16as v5 instances, it might be time to rethink whether elastic compute is appropriate for your workload. Happy to help you map out the decision tree between VMSS, reservations, and workload partitioning!
Moving to a nearby region can sometimes help if you’re facing systematic capacity issues. Have you thought about migrating part of your workload while ensuring your latency remains within acceptable limits through traffic shaping?
VM Scale Sets are great for IaaS with cyclical demands, but the real issue lies with Microsoft’s resource availability. Have you considered orchestrating the purchase of Capacity Reservations just before you need them? You could set this up with a small Azure Function or automation task.
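A minimal sketch of the timing logic such a function could run on a timer is below. It only decides when to create or release a reservation; the actual create and release calls are left out, and the 30-minute lead time and the example peak window are assumptions. Keep in mind that creating the reservation can itself fail if the region has no capacity at that moment, so the "create" step would still need retries and alerting.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

Window = Tuple[datetime, datetime]


def reservation_window(
    peak_start: datetime, peak_end: datetime, lead: timedelta = timedelta(minutes=30)
) -> Window:
    """Hold the reservation from `lead` before the peak until the peak ends."""
    return peak_start - lead, peak_end


def due_action(now: datetime, window: Window, reserved: bool) -> str:
    """Return 'create', 'release', or 'none' for the current time and state."""
    create_at, release_at = window
    if create_at <= now < release_at and not reserved:
        return "create"
    if now >= release_at and reserved:
        return "release"
    return "none"


# ASSUMPTION: peak runs 13:00 to 23:00 UTC (roughly 8 AM to 6 PM US Eastern in winter).
peak_start = datetime(2024, 1, 8, 13, 0, tzinfo=timezone.utc)
peak_end = datetime(2024, 1, 8, 23, 0, tzinfo=timezone.utc)
window = reservation_window(peak_start, peak_end)

print(due_action(datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc), window, reserved=False))
print(due_action(datetime(2024, 1, 8, 12, 45, tzinfo=timezone.utc), window, reserved=False))
print(due_action(datetime(2024, 1, 8, 23, 5, tzinfo=timezone.utc), window, reserved=True))
```

Passing the current reservation state in explicitly keeps the decision idempotent, so the timer can fire every few minutes without creating duplicates.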
