I'm facing a challenge in our Go services: we run multiple worker pods that perform distributed tasks such as consuming from Kafka topics and processing batch jobs. Here's the situation:
- We have a variable number of workers (usually fewer than 50 Kubernetes pods).
- There are several work units, which correspond to topic partitions.
- I need to ensure that each worker has ownership of a subset of these work units, distributed as evenly as possible.
- Workers may come and go due to deployments, crashes, or autoscaling.
- I require some control over throttling.
So far I've considered solutions such as Redis locks, central schedulers, or queues where workers compete for tasks, but in my experience these often lead to unpredictable behavior and eventual inconsistencies. I'm curious what real-world patterns others are using for this kind of workload, especially in Kubernetes environments.
5 Answers
Consider sharding it. Each pod can watch the set of live replicas and recompute which shards it owns whenever pods come or go. This is similar to how kube-state-metrics shards itself horizontally to handle higher loads.
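As a rough illustration of the resharding idea, here is a minimal sketch using rendezvous (highest-random-weight) hashing. It assumes every pod already has the same list of live worker names from somewhere (how you discover them is up to you), and the names `Owner`, `OwnedUnits`, and the pod and unit names are made up for the example. Each pod computes ownership locally, so no lock or central scheduler is involved.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score hashes a (worker, unit) pair into a weight.
// The zero byte keeps "ab"+"c" and "a"+"bc" from colliding.
func score(worker, unit string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(worker))
	h.Write([]byte{0})
	h.Write([]byte(unit))
	return h.Sum64()
}

// Owner returns the worker with the highest score for the unit,
// or "" when there are no workers. Ties are broken by name so that
// every pod reaches the same answer. Worker names are assumed non-empty.
func Owner(workers []string, unit string) string {
	best, bestScore := "", uint64(0)
	for _, w := range workers {
		s := score(w, unit)
		if best == "" || s > bestScore || (s == bestScore && w < best) {
			best, bestScore = w, s
		}
	}
	return best
}

// OwnedUnits returns the sorted units that self owns, given the
// current view of live workers.
func OwnedUnits(self string, workers, units []string) []string {
	var owned []string
	for _, u := range units {
		if Owner(workers, u) == self {
			owned = append(owned, u)
		}
	}
	sort.Strings(owned)
	return owned
}

func main() {
	// Hypothetical pod and work-unit names.
	workers := []string{"pod-a", "pod-b", "pod-c"}
	units := []string{"orders-0", "orders-1", "orders-2", "orders-3", "orders-4", "orders-5"}
	for _, w := range workers {
		fmt.Println(w, OwnedUnits(w, workers, units))
	}
}
```

When a pod disappears, only the units it owned move to other pods; everything else stays put. The trade-off is that the split is only statistically even, not exactly even, and pods can briefly disagree while their views of the membership list converge, so work units should tolerate a short overlap or be fenced some other way.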
With Kafka, consumer groups work well here. Each consumer in a group takes ownership of a disjoint set of partitions, and if one crashes, Kafka automatically redistributes its partitions among the remaining consumers. Are you looking for something beyond that? If workers depend on each other or talk to external systems, locks might still be necessary.
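To make the effect of a rebalance concrete, here is a simplified sketch of a round-robin style assignment. It is not Kafka's group protocol or any client library's assignor, just a stdlib-only illustration; the function name `Assign` and the pod names are invented for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Assign spreads partitions 0..numPartitions-1 over the members
// round-robin. Members are sorted first so the result does not
// depend on the order in which they were listed.
func Assign(members []string, numPartitions int) map[string][]int {
	out := make(map[string][]int, len(members))
	if len(members) == 0 {
		return out
	}
	sorted := append([]string(nil), members...)
	sort.Strings(sorted)
	for _, m := range sorted {
		out[m] = nil // members with no partitions still appear in the result
	}
	for p := 0; p < numPartitions; p++ {
		m := sorted[p%len(sorted)]
		out[m] = append(out[m], p)
	}
	return out
}

func main() {
	// Three consumers share eight partitions, then one of them crashes.
	before := Assign([]string{"pod-a", "pod-b", "pod-c"}, 8)
	after := Assign([]string{"pod-a", "pod-c"}, 8)
	fmt.Println("before:", before)
	fmt.Println("after: ", after)
}
```

With this scheme no member ever holds more than one partition above any other, which is the "as evenly as possible" property from the question. The cost is that a membership change can reshuffle many partitions, which is why real consumer-group clients also offer assignment strategies that try to keep existing ownership stable.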
You're overcomplicating things! Just check out Kafka's documentation on consumer groups and partitions. You can achieve task distribution natively with Kafka without needing to resort to locks.
Have you thought about implementing a simple queuing system? You can use RabbitMQ, Kafka, or even a straightforward endpoint to distribute jobs. A tool like KEDA can then scale your pods based on message load.
Switching to RabbitMQ could be a better fit for your needs. A message delivered to one pod is not delivered to the others while it remains unacknowledged, which makes autoscaling smoother. Plus, KEDA can scale the pods based on message load.
