Can I use EC2 Spot Instances with Lambda for Serverless GPU Computing?

Asked By Techie234 On

I'm currently using RunPod to serve AI models, but their serverless option has been too unreliable for my production needs. I know AWS doesn't provide serverless GPU computing out of the box, so I'm wondering if I can set up a solution where a Lambda function launches an EC2 On-Demand or Spot instance running a FastAPI server for inference, then automatically terminates the instance once I get the response. I need this to work for multiple concurrent users. My plan is to use Boto3 for this setup. Is this a workable solution, or is there a better approach I should consider?
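Here's roughly what I had in mind for the Lambda side, just as a sketch with Boto3. The AMI ID, instance type, and the /predict route are placeholders for whatever image and FastAPI app I'd actually bake in, and each concurrent invocation would launch its own instance:

```python
# Hypothetical Lambda handler: launch a Spot GPU instance, call the FastAPI
# server it hosts, then terminate the instance. All IDs below are placeholders.
import json
import urllib.request

import boto3

ec2 = boto3.client("ec2")

AMI_ID = "ami-0123456789abcdef0"   # placeholder: AMI with the model + FastAPI baked in
INSTANCE_TYPE = "g4dn.xlarge"      # placeholder GPU instance type


def handler(event, context):
    # Request a single Spot instance; this can fail outright if no capacity is available.
    run = ec2.run_instances(
        ImageId=AMI_ID,
        InstanceType=INSTANCE_TYPE,
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={"MarketType": "spot"},
    )
    instance_id = run["Instances"][0]["InstanceId"]

    try:
        # Wait until the instance passes status checks -- this alone can take minutes,
        # and the whole round trip has to fit inside Lambda's 15-minute limit.
        ec2.get_waiter("instance_status_ok").wait(InstanceIds=[instance_id])

        # Look up the public IP and call the FastAPI inference endpoint.
        desc = ec2.describe_instances(InstanceIds=[instance_id])
        ip = desc["Reservations"][0]["Instances"][0]["PublicIpAddress"]
        req = urllib.request.Request(
            f"http://{ip}:8000/predict",  # hypothetical route served by FastAPI
            data=json.dumps(event.get("payload", {})).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=300) as resp:
            result = json.loads(resp.read())
    finally:
        # Always tear the instance down, even if the inference call failed.
        ec2.terminate_instances(InstanceIds=[instance_id])

    return {"statusCode": 200, "body": json.dumps(result)}
```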

4 Answers

Answered By DataNinja2023 On

I've had customers ask for a RunPod-like experience on AWS, but without always-on servers it's tough. GPU availability is the big concern: you can't guarantee on-demand capacity, and Spot makes it even trickier. To keep the front end responsive, consider a pub-sub architecture where the front end posts a message to a queue, and a worker picks it up, runs inference, and publishes the result back. I've been using EKS with HPA and Karpenter for similar workloads: the HPA scales the worker pods based on queue metrics, and Karpenter provisions GPU nodes from whatever capacity is actually available. That combination can help you ride out capacity issues instead of failing a request outright.
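A rough sketch of the worker side of that pattern, assuming SQS for the queues (the queue URLs and run_inference() are placeholders; in the EKS setup this loop runs inside each worker pod that HPA scales on queue depth):

```python
# Minimal pub-sub worker: long-poll a request queue, run inference, publish the result.
import json

import boto3

sqs = boto3.client("sqs")
REQUEST_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/inference-requests"  # placeholder
RESULT_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/inference-results"    # placeholder


def run_inference(payload: dict) -> dict:
    """Placeholder for the actual model call."""
    return {"echo": payload}


def main() -> None:
    while True:
        # Long-poll so idle workers don't hammer the SQS API.
        resp = sqs.receive_message(
            QueueUrl=REQUEST_QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            payload = json.loads(msg["Body"])
            result = run_inference(payload)
            # Tag the result with the original request id so the front end can correlate it.
            sqs.send_message(
                QueueUrl=RESULT_QUEUE_URL,
                MessageBody=json.dumps(
                    {"request_id": payload.get("request_id"), "result": result}
                ),
            )
            # Delete only after successful processing so failed messages get retried.
            sqs.delete_message(
                QueueUrl=REQUEST_QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
            )


if __name__ == "__main__":
    main()
```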

Answered By CloudGuru92 On

Starting an EC2 instance can easily take a minute or more before you even begin loading the model, so users waiting synchronously for a response will feel that cold start on every request. If they're expecting quick answers, you'll want to rethink the latency of this approach.

Answered By ServerlessFanatic On

If you’re using Spot instances, are you really going serverless? Serverless typically means not managing virtual machines. Even though you might only be handling the EC2 instances temporarily, it's still more management than a true serverless setup. But hey, it’s just semantics, right?

Techie234 -

Starting an EC2 instance, executing one call, and shutting it down might not count as full management, haha.

Answered By DevSavant On

You could have your API server send a message to SQS, then use EventBridge to trigger ECS tasks. ECS tasks can use GPUs too (on EC2-backed capacity), so you only spin up infrastructure when it's actually needed. Good luck with it!
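For the "run an ECS task on demand" piece, something along these lines could work; the cluster name, task definition, and container name are placeholders, and the task definition would declare the GPU resource requirement:

```python
# Hypothetical: launch a GPU inference task on an EC2-backed ECS cluster for one request.
import json

import boto3

ecs = boto3.client("ecs")


def launch_inference_task(payload: dict) -> str:
    resp = ecs.run_task(
        cluster="gpu-inference-cluster",        # placeholder cluster name
        taskDefinition="fastapi-inference:1",   # placeholder task definition with a GPU requirement
        launchType="EC2",                       # Fargate has no GPUs, so the cluster needs GPU EC2 capacity
        overrides={
            "containerOverrides": [
                {
                    "name": "inference",        # placeholder container name
                    "environment": [
                        {"name": "REQUEST_PAYLOAD", "value": json.dumps(payload)},
                    ],
                }
            ]
        },
    )
    # Return the task ARN so the caller can track completion (e.g., via EventBridge task-state events).
    return resp["tasks"][0]["taskArn"]
```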
