Hey everyone, I need some guidance for a project I'm working on. I'm processing batches of images and want each machine in my cluster to serve as both a source and a destination for these images. Here's the workflow: I receive a hard drive of images, copy them to a machine in the cluster, and then an API processes those images. The paths of the original images are stored in PostgreSQL, and I use Celery/Redis for task management, with KEDA for scaling.
3 Answers
I think I need a bit more detail about your requirements. How many images are you processing at once? What sizes are we talking about? If you're looking for high efficiency, you might consider using something like Argo Workflows. But I'd really want to know more about your current workflow before making any specific suggestions.
It sounds like you're overthinking this a bit! Instead of worrying about which machine is the source or destination, why not just mount a shared storage solution? Your containers can all access the same storage, with no need for complicated routing between machines. Could you share more about your specific setup?
If your goal is for each node to act as both a source and a destination, there might be a simpler approach. Why not schedule the Celery tasks to run on the same node as the images? That minimizes network traffic and complexity. You could give each node's worker its own queue to streamline task distribution without passing IPs around.
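A sketch of the routing side, assuming each node runs a worker consuming a queue named after its hostname (started with something like `celery -A app worker -Q images.$(hostname)`). The queue-naming convention and the `node_for_image` lookup are assumptions for illustration, not part of your existing setup:

```python
import socket

QUEUE_PREFIX = "images"

def queue_for_node(hostname=None) -> str:
    """Queue a node's worker should consume: 'images.<hostname>'.
    Defaults to this machine's own hostname."""
    return f"{QUEUE_PREFIX}.{hostname or socket.gethostname()}"

def route_task(image_path: str, node_for_image) -> dict:
    """Build Celery routing options so the task lands on the node that
    already holds the image. `node_for_image` is a hypothetical lookup
    (e.g. a hostname column stored next to the path in PostgreSQL).
    Pass the returned dict as options to task.apply_async(...)."""
    return {"queue": queue_for_node(node_for_image(image_path))}
```

With KEDA, one ScaledObject per queue would let each node's worker pool scale on the length of its own queue rather than the global backlog.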
