I'm exploring the idea of implementing a scale-to-zero feature similar to what Cloud Run offers. The scenario: a client initiates a request over HTTP, raw TCP, or UDP. When the request hits the node, it should be forwarded if a container is already available; if no container is ready, one would need to be started before the connection is forwarded. I'm considering a load balancer (perhaps built on eBPF, or a custom application) to handle connection termination. My concern is connection timeouts while the container initializes; I'm thinking CRIU checkpoint/restore could help with faster boot times. Are there existing projects that already tackle this?
5 Answers
There are a ton of ways to implement this! Essentially, you need a system that receives requests and manages a pool of running containers; when none are ready, it buffers incoming connections while spinning up new ones. If you're aiming for efficiency and want to write custom code, projects like Pingora could be a great base, but if you want something simple, systemd can suffice. And yes, watch out for cold starts; they can definitely lead to timeouts while your containers boot. Here's a list of open source systems you might find useful!
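The "stop it when idle" half of scale-to-zero is easy to sketch. Here's a minimal idle reaper in Python; the container name `myapp` and the `docker stop` command are placeholder assumptions, swap in whatever your runtime uses:

```python
import subprocess
import threading
import time

class IdleReaper:
    """Stop the container once no request has been seen for idle_timeout seconds.

    ("docker stop myapp" is a placeholder; substitute your runtime's stop command.)
    """

    def __init__(self, idle_timeout=300.0, stop_cmd=("docker", "stop", "myapp")):
        self.idle_timeout = idle_timeout
        self.stop_cmd = stop_cmd
        self._last_seen = time.monotonic()
        self._lock = threading.Lock()

    def touch(self):
        """Call this from the proxy on every request to reset the idle clock."""
        with self._lock:
            self._last_seen = time.monotonic()

    def idle_for(self):
        """Seconds since the last request."""
        with self._lock:
            return time.monotonic() - self._last_seen

    def run(self):
        """Background loop: check once a second, stop the container when idle."""
        while True:
            time.sleep(1.0)
            if self.idle_for() >= self.idle_timeout:
                subprocess.run(self.stop_cmd, check=False)
                self.touch()  # don't re-run the stop command every second after
```

You'd run `IdleReaper(...).run()` in a daemon thread next to whatever accepts connections, and call `touch()` on each proxied request.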
One simple way to get this is systemd socket activation: systemd holds the listening socket, starts your service on the first incoming connection, and the daemon can exit when idle without the port ever being dropped. Just make sure the restart policy in the unit fits your needs!
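A minimal sketch of that setup, assuming a hypothetical `myapp` binary that supports socket activation (i.e. it accepts the already-open listening fd via `LISTEN_FDS`):

```ini
# /etc/systemd/system/myapp.socket
# systemd owns the port, so the service can stay stopped while idle.
[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target

# /etc/systemd/system/myapp.service
# Started by the .socket unit on the first incoming connection.
[Service]
ExecStart=/usr/local/bin/myapp
# Let the daemon exit when idle; systemd keeps holding the socket.
Restart=no
```

Enable it with `systemctl enable --now myapp.socket`. If your program can't take a pre-opened fd, the inetd-style `Accept=yes` mode (with a templated `myapp@.service`) spawns one instance per connection instead, at the cost of a process per client.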
Nginx with Lua (e.g. OpenResty) could also do the trick, but keep in mind the first request will stall while the Docker container starts up. Just something you'll need to plan for!
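A rough OpenResty sketch of that idea; the container name `myapp`, backend port 8080, and the 2-second wait are all placeholder assumptions. Note that `os.execute` blocks the entire nginx worker while Docker starts, so this only suits low-traffic setups:

```nginx
server {
    listen 80;

    location / {
        access_by_lua_block {
            -- probe the backend; if it's down, cold-start the container
            local sock = ngx.socket.tcp()
            sock:settimeout(200)  -- ms
            local ok = sock:connect("127.0.0.1", 8080)
            if ok then
                sock:close()
            else
                os.execute("docker start myapp")
                ngx.sleep(2)  -- crude wait; polling until connect succeeds is better
            end
        }
        proxy_pass http://127.0.0.1:8080;
    }
}
```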
You're essentially recreating the concept behind serverless functions. There are loads of implementations that do this already, so you won't be alone in your journey!
Can you share some examples? I'd love to check them out!
Looks like we're revisiting some old tech ideas here!
If you're going DIY, you'll need two main processes. One acts as a watcher: always listening for connections while staying lightweight, it checks whether a container is ready to service the request and then forwards the connection. That said, it gets complex quickly depending on which layer of the stack you work at; HTTP and raw TCP need different handling, especially around TLS termination and health checks after startup. This is why many people suggest going with a serverless platform instead. Here are some resources:
1. AWS Lambda's anniversary overview.
2. Knative documentation.
3. A guide on building your own serverless system.
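The watcher process described above can be sketched at the TCP layer in Python. This is a sketch under assumptions: `docker start myapp` is a placeholder cold-start command, the backend address is hypothetical, and "ready" is approximated by a successful TCP connect (a real health check would verify the application, not just the port):

```python
import socket
import subprocess
import threading
import time

def wait_until_ready(host, port, timeout=30.0, interval=0.1):
    """Poll until a TCP connect succeeds -- a crude post-startup health check."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(interval)
    return False

def pump(src, dst):
    """Copy bytes one way until the peer closes, then shut down the other side."""
    try:
        while True:
            chunk = src.recv(65536)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass

def watcher(listen_port, backend=("127.0.0.1", 8080),
            start_cmd=("docker", "start", "myapp")):
    """Accept connections; cold-start the backend if needed, then splice bytes."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", listen_port))
    srv.listen(128)  # the kernel backlog buffers clients during a cold start
    while True:
        client, _ = srv.accept()
        if not wait_until_ready(*backend, timeout=0.5):
            subprocess.run(start_cmd, check=True)   # cold start
            if not wait_until_ready(*backend):
                client.close()                      # gave up; client sees a reset
                continue
        upstream = socket.create_connection(backend)
        threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pump, args=(upstream, client), daemon=True).start()
```

This only covers plain TCP; for HTTP you'd want to parse requests so you can queue them and return a proper 503 on failure, and TLS would have to be terminated either here or in the container.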
Thanks for those links! They’re super helpful!

Awesome! This seems to be a solid solution, thank you!