How to Deploy a Service on Multi-GPU Instances with Docker Swarm?

Asked By TechieGuru42 On

Hey everyone! I'm currently running a service on a single-GPU instance with Docker Swarm, and now I've been asked to test deployment on multi-GPU instances. I thought I had everything set up correctly, but I'm running into issues: the service either starts only one container, leaving the other GPUs idle, or it doesn't recognize the other GPUs and schedules every container onto the same GPU. My Docker daemon is configured with the NVIDIA runtime, but `nvidia-smi` still shows both containers using the same GPU. In my stack configuration I've defined a service with a specific GPU reservation, but it doesn't seem to work as expected. Any suggestions on what I might be missing? Thanks!

2 Answers

Answered By DevDabbler22 On

I haven't tried it personally, but you could look into the `deploy -> resources -> reservations -> devices` section of your configuration, setting `device_ids` (alongside `capabilities: [gpu]`) to pin each container to a specific GPU. Creating separate services, one per GPU, instead of just scaling replicas might also be the way to go here.
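To make the separate-services idea concrete, here's a minimal sketch of a Compose-style file with one service per GPU. The image name `your-image-name` and the service names are placeholders, and GPU `devices` reservations behave differently across Docker and Compose versions (notably between `docker compose` and Swarm's `docker stack deploy`), so treat this as a starting point rather than a verified Swarm config:

```yaml
# Sketch: one service per GPU, each pinned to its own device.
# "your-image-name" is a placeholder image.
services:
  worker-gpu0:
    image: your-image-name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]     # pin this service to GPU 0
              capabilities: [gpu]
  worker-gpu1:
    image: your-image-name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["1"]     # pin this service to GPU 1
              capabilities: [gpu]
```

The trade-off is duplication in the stack file, but each container gets an explicit, non-overlapping GPU instead of relying on the scheduler to spread replicas.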

Answered By QuantumCoder87 On

If you're comfortable with command-line flags instead of Docker Compose, you can specify which GPU to use directly, e.g. `docker run --gpus "device=0" your-image-name` for each instance. Just adjust the device number for each container you spin up!
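Building on that, here's a small sketch that launches one container per GPU in a loop. The image name `your-image-name`, the `gpu-worker-*` container names, and the `DRY_RUN` switch are all assumptions for illustration; `DRY_RUN=1` just prints the commands so you can inspect them before touching Docker:

```shell
#!/bin/sh
# Launch one container per GPU, pinning each to its own device index.
# "your-image-name" is a placeholder; set DRY_RUN=1 to preview the commands.
launch_per_gpu() {
  num_gpus=$1
  image=$2
  i=0
  while [ "$i" -lt "$num_gpus" ]; do
    # Each container gets exactly one GPU via --gpus device=<index>.
    cmd="docker run -d --name gpu-worker-$i --gpus device=$i $image"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$cmd"
    else
      $cmd
    fi
    i=$((i + 1))
  done
}

# Preview the commands for a 2-GPU box without starting anything:
DRY_RUN=1 launch_per_gpu 2 your-image-name
```

Once the containers are up, `nvidia-smi` on the host should show one process per GPU instead of everything stacked on device 0.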
