What are some cost-effective alternatives to SageMaker Real-Time Inference for deploying an open-source VLM on AWS?

Asked By TechTurtle99 On

I'm interested in deploying an OCR model from Hugging Face (specifically, the rednote-hilab/dots.ocr model). I have experience with SageMaker real-time endpoints, but I've found them quite expensive. I'm looking for cheaper alternatives that also minimize cold start times, so any suggestions for deploying on AWS that fit these criteria would be greatly appreciated!

1 Answer

Answered By CloudyDreamer88 On

Serving models can get pricey. If you're looking to cut costs, you might want to check out Bedrock, since it's more serverless-oriented and may work out cheaper at low usage. Just keep in mind that it could still be costly per token. If that doesn't suit your budget, options outside of AWS like DigitalOcean, Lambda Labs, or RunPod might be worth exploring.
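To get a feel for when an always-on endpoint beats a pay-per-use option, here is a rough break-even sketch in Python. The function names and every rate in it are hypothetical placeholders, not real AWS, Bedrock, or third-party prices, so plug in the current numbers from the pricing pages before drawing any conclusion.

```python
# Rough break-even sketch: always-on endpoint vs. pay-per-use serving.
# All rates below are made-up placeholders, NOT real provider prices.

HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_monthly_cost(hourly_rate_usd: float) -> float:
    """Cost of keeping one instance running for the whole month."""
    return hourly_rate_usd * HOURS_PER_MONTH


def pay_per_use_monthly_cost(
    requests_per_month: int,
    seconds_per_request: float,
    rate_per_second_usd: float,
) -> float:
    """Cost when you are billed only for the compute time each request uses."""
    return requests_per_month * seconds_per_request * rate_per_second_usd


def break_even_requests(
    hourly_rate_usd: float,
    seconds_per_request: float,
    rate_per_second_usd: float,
) -> float:
    """Monthly request count at which both options cost the same."""
    cost_per_request = seconds_per_request * rate_per_second_usd
    return always_on_monthly_cost(hourly_rate_usd) / cost_per_request


if __name__ == "__main__":
    # Placeholder inputs: replace with real prices and measured latency.
    hourly = 1.00       # USD per hour for the always-on instance (assumed)
    per_second = 0.001  # USD per billed second of pay-per-use (assumed)
    latency = 2.0       # seconds of billed compute per OCR request (assumed)

    print("Always-on per month:", always_on_monthly_cost(hourly))
    print("Break-even requests per month:",
          break_even_requests(hourly, latency, per_second))
```

Below the break-even volume the pay-per-use option is cheaper, and above it the always-on endpoint wins. This ignores cold starts, which you'd have to weigh separately since pay-per-use setups usually trade cost for startup latency.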

DigitalNomad42 -

I agree, but I've heard Bedrock has some limitations on model compatibility. It's always a bit tricky to find something that meets both budget and functionality when you're tied to AWS.
