Best Async-Native Alternative to Celery for I/O-Heavy Workloads?

Asked By DevWizard42 On

I'm working on Rhesis.ai, a tool for testing LLM applications. It uses a FastAPI backend with Celery for task processing. Our workload is I/O-heavy: we make numerous external API calls and then query LLM APIs, such as OpenAI's, to assess the outputs.

Currently, we parallelize within a Celery task using asyncio. For example, for a test set of 50 tests, we send all requests at once and also evaluate them concurrently. However, the number of tasks that can run at the same time is capped by the number of Celery worker processes, and adding more processes increases RAM usage.
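For readers trying to picture the setup, here is a minimal sketch of that pattern using only the standard library. The function names (`run_single_test`, `run_test_set`, `execute_test_set`) are hypothetical, the network calls are simulated with `asyncio.sleep`, and the Celery decorator is left out, so treat it as an illustration rather than the actual Rhesis.ai code.

```python
import asyncio


async def run_single_test(test_id: int) -> dict:
    # Stand-in for "call the external API, then ask an LLM to grade the output".
    await asyncio.sleep(0.05)  # simulated network latency
    return {"test_id": test_id, "passed": True}


async def run_test_set(test_ids: list[int]) -> list[dict]:
    # Start every test at once; gather returns results in input order.
    return await asyncio.gather(*(run_single_test(t) for t in test_ids))


def execute_test_set(test_ids: list[int]) -> list[dict]:
    # This would be the body of a synchronous Celery task (decorator omitted).
    # The worker process stays occupied until the whole batch has finished,
    # so it cannot pick up another task in the meantime.
    return asyncio.run(run_test_set(test_ids))


if __name__ == "__main__":
    results = execute_test_set(list(range(50)))
    print(f"finished {len(results)} tests")
```

The tests inside one task overlap nicely, but each worker process still runs only one such task at a time, which is the limitation described above.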

We're looking for a solution that fully utilizes async capabilities—namely, a setup where tasks can be continuously scheduled onto an event loop without being restricted by a set number of worker processes or threads. FastAPI handles many concurrent requests effectively, but it lacks task queuing.
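To make the target behaviour concrete, here is a rough sketch of a single-process worker that keeps pulling jobs off a queue and schedules each one on the event loop without waiting for the previous one to finish. It uses an in-memory `asyncio.Queue` as a stand-in for a real broker, and `handle_job`, `worker`, `demo`, and `max_in_flight` are made-up names for illustration. It shows the scheduling model only; it has no persistence, retries, or acknowledgements.

```python
import asyncio


async def handle_job(job_id: int) -> int:
    await asyncio.sleep(0.01)  # simulated I/O wait (API call, LLM request)
    return job_id * 2


async def worker(queue: asyncio.Queue, results: dict, max_in_flight: int = 100) -> None:
    limit = asyncio.Semaphore(max_in_flight)  # caps concurrent jobs
    running: set[asyncio.Task] = set()

    async def run(job_id: int) -> None:
        try:
            results[job_id] = await handle_job(job_id)
        finally:
            limit.release()
            queue.task_done()

    while True:
        job_id = await queue.get()
        if job_id is None:  # sentinel: stop accepting new work
            queue.task_done()
            break
        await limit.acquire()  # back-pressure once max_in_flight is reached
        task = asyncio.create_task(run(job_id))  # schedule and move on
        running.add(task)  # keep a reference so the task is not garbage-collected
        task.add_done_callback(running.discard)

    await asyncio.gather(*running)  # drain jobs that are still in flight


async def demo(n_jobs: int = 200) -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    for job_id in range(n_jobs):
        queue.put_nowait(job_id)
    queue.put_nowait(None)
    await worker(queue, results)
    return results


if __name__ == "__main__":
    print(len(asyncio.run(demo())), "jobs completed")
```

The semaphore is a design choice, not a requirement: without some bound, a burst of jobs could open thousands of connections at once and run into rate limits on the upstream APIs.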

We considered Dramatiq, but it seems to have constraints similar to Celery's: each worker executes tasks sequentially, even when a task is async internally. Ideally, we want a stable, mature library or architecture that eagerly schedules new tasks onto an event loop without waiting for previous tasks to complete. We're wary of using experimental solutions for a core part of our infrastructure.

6 Answers

Answered By GigaByteGuru On

You might not even need to switch from Celery. Running the worker with the `gevent` pool, which monkey-patches blocking I/O for you, could give you the concurrency you're after. Just start your worker like this, where `your_app` is a placeholder for your Celery app module: `celery -A your_app worker -P gevent --concurrency=100`.

Answered By CraftyCoder99 On

Have you thought about using ZeroMQ to build your own solution? It’s not overly complex; you could do it in about 50 lines of code. Just make sure to study up on task routing though, depending on how your tasks behave.

Answered By TechNinja88 On

Check out Oban. I’ve used the Elixir version and really liked it. It’s backed by PostgreSQL/SQLite for infrastructure, which is solid.

Answered By QuestForKnowledge67 On

Have you checked out TaskIQ? It’s basically async Celery, and it might fit your needs perfectly.

Answered By Pythonista123 On

There’s also the ARQ library, created by Samuel Colvin, the author of Pydantic, but just a heads-up: it's still in beta and may not be the most reliable option right now.

Answered By AsyncChampion45 On

You might want to look into Temporal. It's designed for workflows and allows a single worker to pick up new tasks while earlier ones are still waiting on I/O. Just keep in mind it might be overkill for your use case, since each request becomes a workflow.
