I'm in a bit of a bind here and could really use some advice. Our company just started using AWS for R&D, particularly to experiment with open-source VLMs like PaddleOCR-VL for document understanding and key-value extraction. We plan to set up on-prem GPU servers, but that's still a few months away.
When I created our AWS account about two days ago, I immediately tried to launch g6.2xlarge instances (both spot and on-demand), but I hit a wall due to service quota limits, which are set to 0 by default. I requested quota increases in two regions—Frankfurt and Spain—for 8 vCPUs each. Unfortunately, all those requests were denied with some generic message about needing to ramp up gradually to avoid unexpected bills. After appealing, I spoke with customer support but haven't heard back, which has added to my frustration.
I find it hard to believe that I can't access a GPU server, especially for a proof of concept that won't take long at all. I just need to download a model from HuggingFace and process some documents—it's not like I'm trying to train a complex model!
So now I'm wondering what my options are. Should I just spin up random instances for a few months to show that we can handle the billing? I've thought about checking out Azure and GCP, but they seem to have similar issues. Has anyone else faced this problem? Any suggestions would be greatly appreciated!
2 Answers
Have you considered using a different instance type like g6.xlarge, or perhaps an older generation in the G family? Sometimes those limits are a bit more lenient. It might be worth looking into that as a workaround!
You might want to check out the Stockholm region (eu-north-1) for better chances with your service quota increases. Also, if you’re part of an organization, try reaching out to your account manager. They usually have more leverage in getting these requests approved. But since you mentioned you just opened the account, you may not have one yet.
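If it helps, here is a rough Python sketch for checking what the G-instance quota currently is in each region before filing another request. It assumes boto3 is installed and credentials are configured. The quota code `L-DB2E81BA` is what I believe corresponds to "Running On-Demand G and VT instances", so please verify it in the Service Quotas console first. The helper names (`vcpus_needed`, `current_quota`, `print_quotas`) are just made up for this example.

```python
# Sketch only: compare the on-demand G-instance vCPU quota across regions.
# Assumption: L-DB2E81BA is the "Running On-Demand G and VT instances" quota;
# confirm the code in the Service Quotas console before relying on it.
REGIONS = {
    "Frankfurt": "eu-central-1",
    "Spain": "eu-south-2",
    "Stockholm": "eu-north-1",
}
G_ON_DEMAND_QUOTA = "L-DB2E81BA"

# vCPU counts for the instance sizes mentioned in this thread.
VCPUS = {"g6.xlarge": 4, "g6.2xlarge": 8}


def vcpus_needed(instance_type, count=1):
    """Quota is counted in vCPUs, so this is the value to ask for."""
    return VCPUS[instance_type] * count


def current_quota(region):
    """Return the G-instance vCPU quota for one region."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("service-quotas", region_name=region)
    try:
        resp = client.get_service_quota(
            ServiceCode="ec2", QuotaCode=G_ON_DEMAND_QUOTA
        )
    except client.exceptions.NoSuchResourceException:
        # No account-specific value is available; fall back to the AWS default.
        resp = client.get_aws_default_service_quota(
            ServiceCode="ec2", QuotaCode=G_ON_DEMAND_QUOTA
        )
    return resp["Quota"]["Value"]


def print_quotas(instance_type="g6.2xlarge"):
    """Print the quota per region next to what one instance would need."""
    need = vcpus_needed(instance_type)
    for name, region in REGIONS.items():
        print(f"{name} ({region}): quota={current_quota(region)}, need={need}")
```

Calling `print_quotas()` shows at a glance whether any region already allows the 8 vCPUs a single g6.2xlarge needs, or only the 4 that a g6.xlarge needs. If you do file a new request, boto3's `request_service_quota_increase` takes the same `ServiceCode` and `QuotaCode` plus a `DesiredValue`.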

Yeah, I just opened the account so an account manager is probably a long shot right now.