Hey everyone! I'm diving into programming and development, and I've got a project idea: I want to combine various spare devices into a cluster for running a small AI model. I'm particularly interested in using technologies like Docker, Kubernetes, and cloud computing so I can learn and tinker. My plan is to split the model's inference and RAM requirements across these devices, likely using a runtime like llama.cpp or LM Studio (those are inference tools, not models themselves, as I understand it). But I'm a bit lost on Kubernetes and its role in this process.
I've been searching for information, but most resources are either too confusing or just don't explain things clearly. Can Kubernetes help with this setup? Do I need an external hosting service like AWS, or can this all be done on a private network? And if you could explain how Kubernetes fits into running this kind of project, I'd really appreciate it. Thanks!
1 Answer
It sounds like you're really eager to learn, which is awesome! To clarify: Kubernetes is an orchestrator. It schedules containers onto the nodes in a cluster, restarts them when they fail, and wires up networking between them. It does not, by itself, split a single model's inference or RAM across machines. So yes, you can run containers on your spare devices and manage them with Kubernetes, and it works fine on a private network with no cloud provider involved (lightweight distributions like k3s are popular for exactly this). But the actual work of distributing one model across nodes has to come from the inference software itself; Kubernetes would just be deploying and supervising those processes. It might help to rethink the approach and pick a specific problem to solve first, e.g. "serve one model from one node, managed by Kubernetes," then grow from there.
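To make "Kubernetes just deploys and supervises the process" concrete, here's a minimal sketch of what serving a model on a single node might look like. This is a hypothetical manifest, not a tested config: the container image, model path, and memory figure are all placeholder assumptions you'd replace with your own.

```yaml
# Hypothetical sketch: one replica of llama.cpp's HTTP server, plus a Service
# to reach it from inside the cluster. Image tag, model path, and resource
# sizes are assumptions -- adjust them for your own setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llama-server
  template:
    metadata:
      labels:
        app: llama-server
    spec:
      containers:
        - name: llama
          image: my-registry/llama-cpp-server:latest   # assumption: your own image
          args: ["-m", "/models/model.gguf", "--port", "8080"]
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "4Gi"   # assumption: size this to your model
          volumeMounts:
            - name: models
              mountPath: /models
      volumes:
        - name: models
          hostPath:
            path: /srv/models   # assumption: model file pre-staged on the node
---
apiVersion: v1
kind: Service
metadata:
  name: llama-server
spec:
  selector:
    app: llama-server
  ports:
    - port: 8080
      targetPort: 8080
```

Notice that nothing here splits the model: Kubernetes places one container on one node and keeps it running. Spanning a model across several nodes would require the inference software to support that, with Kubernetes only managing the resulting processes.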

I appreciate your feedback! It's tough trying to pinpoint what problem I'm solving with Kubernetes; I got into this idea because I've seen others build Pi clusters for similar tasks. It's just frustrating with all the misleading info out there. I want to understand the 'why', not just the 'what', and finding the right resources feels like a challenge in itself.