I'm currently using RunPod to serve AI models, but their serverless option has been too unreliable for my production needs. I know AWS doesn't provide serverless GPU computing out of the box, so I'm wondering if I can set up a solution where a Lambda function triggers an EC2 or Spot instance to run a FastAPI server for inference, then automatically shuts down the instance after I get the response. I need this to work for multiple users at the same time. My plan is to use Boto3 for this setup. Is this a workable solution, or is there a better approach I should consider?
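For reference, here's roughly the flow I had in mind with Boto3 (the AMI, instance type, and /predict endpoint are just placeholders for a pre-baked image that starts the FastAPI server on boot; I haven't tested this end to end):

```python
import json
import time
import urllib.request

import boto3

ec2 = boto3.client("ec2")

def handler(event, context):
    # Launch a GPU Spot instance from a pre-baked AMI that starts the
    # FastAPI server on boot (AMI ID and instance type are placeholders).
    run = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",
        InstanceType="g4dn.xlarge",
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={"MarketType": "spot"},
    )
    instance_id = run["Instances"][0]["InstanceId"]

    try:
        # Wait for the instance to be running, then grab its public IP.
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        desc = ec2.describe_instances(InstanceIds=[instance_id])
        ip = desc["Reservations"][0]["Instances"][0]["PublicIpAddress"]

        # Crude wait for the FastAPI server to come up; presumably I'd
        # poll a /health endpoint instead of sleeping a fixed amount.
        time.sleep(90)

        req = urllib.request.Request(
            f"http://{ip}:8000/predict",
            data=json.dumps(event).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=300) as resp:
            return json.loads(resp.read())
    finally:
        # Always terminate so the GPU isn't left running after the response.
        ec2.terminate_instances(InstanceIds=[instance_id])
```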
4 Answers
I've had customers ask for a RunPod-like experience on AWS, but without always-on servers it's tough. GPU availability is a huge concern: you can't guarantee on-demand capacity, and Spot instances make it even trickier. To improve responsiveness, consider a pub-sub architecture where the front-end posts a message to a queue and a worker picks it up, runs inference, and sends the result back. I've been using EKS with HPA and Karpenter for similar workloads: HPA scales the workers based on queue metrics, and Karpenter provisions whatever GPU capacity is actually available. This might help you avoid capacity issues.
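The worker side of that pub-sub pattern could look something like this sketch (plain SQS for illustration; queue URLs and the model call are placeholders, and the EKS/HPA/Karpenter pieces are configured separately):

```python
import json

import boto3

sqs = boto3.client("sqs")

# Placeholder queue URLs; the front-end writes requests to one and
# reads results from the other, correlated by a request_id it supplies.
REQUEST_QUEUE = "https://sqs.us-east-1.amazonaws.com/123456789012/inference-requests"
RESULT_QUEUE = "https://sqs.us-east-1.amazonaws.com/123456789012/inference-results"

def run_inference(payload):
    # Placeholder for the actual model call served by the worker.
    return {"prediction": "..."}

def main():
    # Each worker pod runs this loop; HPA scales the number of pods
    # based on queue depth, and Karpenter provisions the nodes.
    while True:
        resp = sqs.receive_message(
            QueueUrl=REQUEST_QUEUE,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,  # long polling to avoid busy-waiting
        )
        for msg in resp.get("Messages", []):
            payload = json.loads(msg["Body"])
            result = run_inference(payload)
            sqs.send_message(
                QueueUrl=RESULT_QUEUE,
                MessageBody=json.dumps(
                    {"request_id": payload.get("request_id"), "result": result}
                ),
            )
            # Delete only after the result is posted, so failures get retried.
            sqs.delete_message(
                QueueUrl=REQUEST_QUEUE, ReceiptHandle=msg["ReceiptHandle"]
            )

if __name__ == "__main__":
    main()
```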
Starting an EC2 instance can take a few minutes, and that's before the model is even loaded, so users expecting quick responses are going to feel that wait. If interactive latency matters, you might want to reconsider the timing aspect of this design.
If you’re using Spot instances, are you really going serverless? Serverless typically means not managing virtual machines. Even though you might only be handling the EC2 instances temporarily, it's still more management than a true serverless setup. But hey, it’s just semantics, right?
You could have your API server send a message to SQS, then use EventBridge to trigger ECS tasks. ECS can use GPUs as well (with the EC2 launch type), and that way you only spin up infrastructure when it's needed. Good luck with it!
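A run_task call for that kind of GPU task might look roughly like this; the cluster, task definition, and container names are placeholders, and the GPU requirement itself would be declared in the task definition's container resourceRequirements:

```python
import boto3

ecs = boto3.client("ecs")

def launch_inference_task(payload_s3_uri):
    # Run one GPU inference task on an EC2-backed ECS cluster.
    # The task definition's container is assumed to declare
    # {"type": "GPU", "value": "1"} under resourceRequirements so ECS
    # places it on a GPU-capable container instance.
    return ecs.run_task(
        cluster="inference-cluster",
        taskDefinition="gpu-inference:1",
        launchType="EC2",
        overrides={
            "containerOverrides": [
                {
                    "name": "inference",
                    # Tell the container where to find the request payload.
                    "environment": [
                        {"name": "PAYLOAD_URI", "value": payload_s3_uri},
                    ],
                }
            ]
        },
    )
```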
Starting, executing one call, and shutting down an EC2 might not be considered full management, haha.