I've noticed that many task queuing libraries like Celery, Huey, Dramatiq, and RQ revolve around a pretty heavy framework. You decorate a callable, and workers then pick up that callable and execute it. The issue is that this setup requires the workers and the controller to share the same source code. That creates unnecessary dependencies: the controller carries code that only the workers need, and vice versa. Is there really nothing in between this heavy RPC setup and building a task tracker from scratch? I want robust features like retries and auditing, but I'd prefer to avoid the tight coupling. Am I missing something about how to break it?
1 Answer
One problem with these coupled systems is that they hide serialization and encoding from developers, even though the wire format is a key place to optimize and to understand how tasks interoperate. Some libraries do allow decoupling, but those features aren't commonly used. A good approach might be to start with a coupled system and evolve it toward a more decoupled architecture as you scale up, moving to standardized interfaces or protocols. Expect that teams might resist adopting them.
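To make the decoupling idea concrete, here is a minimal sketch, not a production design. It assumes the controller and workers agree only on a task name and a JSON payload, so neither side imports the other's code. Everything in it is hypothetical and for illustration only: the in-process `queue.Queue` stands in for a real broker, and `enqueue`, `work_once`, `HANDLERS`, `AUDIT_LOG`, and the `thumbnail.resize` task are made-up names.

```python
import json
import queue
import uuid

# Stand-in for a real broker (Redis, SQS, RabbitMQ, ...); assumption for this sketch.
broker = queue.Queue()


# --- Controller side: knows only the task name and the payload schema ---
def enqueue(task_name, payload, max_retries=3):
    """Publish a plain JSON envelope. No worker code is imported here."""
    message = {
        "id": str(uuid.uuid4()),
        "task": task_name,
        "payload": payload,
        "attempt": 0,
        "max_retries": max_retries,
    }
    broker.put(json.dumps(message))
    return message["id"]


# --- Worker side: maps task names to local handlers ---
HANDLERS = {}
AUDIT_LOG = []  # (message id, attempt number, outcome)
_seen = set()


def handler(name):
    """Register a worker-local function under a wire-level task name."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@handler("thumbnail.resize")
def resize(payload):
    # Fails the first time it sees a given width, to exercise the retry path.
    width = payload["width"]
    if width not in _seen:
        _seen.add(width)
        raise RuntimeError("transient failure")
    return {"width": width, "status": "resized"}


def work_once():
    """Process one message: record the outcome, re-enqueue on failure."""
    message = json.loads(broker.get_nowait())
    try:
        result = HANDLERS[message["task"]](message["payload"])
    except Exception as exc:
        AUDIT_LOG.append((message["id"], message["attempt"], f"error: {exc}"))
        if message["attempt"] < message["max_retries"]:
            message["attempt"] += 1
            broker.put(json.dumps(message))
        return None
    AUDIT_LOG.append((message["id"], message["attempt"], "ok"))
    return result
```

Because the envelope is explicit JSON, the serialization is visible and can be inspected or versioned, and the worker could even be written in another language. If you stay on Celery, `app.send_task("some.task.name", args=[...])` similarly lets a producer call a task by name without importing it.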
Totally get what you mean. With something like Celery on SQS, I remember instances where messages got base64-encoded multiple times, which felt unnecessary.
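For a rough sense of what repeated encoding costs, here is a small sketch using only the standard library. It does not reproduce what Celery or SQS actually do; `b64_layers` is a hypothetical helper that just stacks base64 layers on an arbitrary payload so the size growth (about 4/3 per layer) can be measured.

```python
import base64


def b64_layers(data: bytes, layers: int) -> bytes:
    """Apply base64 encoding `layers` times in a row."""
    for _ in range(layers):
        data = base64.b64encode(data)
    return data


def overhead(data: bytes, layers: int) -> float:
    """Ratio of encoded size to original size after `layers` encodings."""
    return len(b64_layers(data, layers)) / len(data)
```

Each layer emits 4 output bytes for every 3 input bytes (rounded up to a multiple of 4), so two layers cost roughly 78% extra on the wire where one layer would cost about 33%.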