I'm trying to wrap my head around how Railway.com manages to do usage-based pricing while still allocating resources efficiently enough to remain profitable. I'm thinking about how I could replicate their approach using a Kubernetes cluster that automatically scales vertically based on resource utilization. However, I read that for an autoscaler to function properly, it needs to know the minimum resource requirements through the requests field. That seems to imply a fixed minimum allocation for each container, which is contrary to what I've seen with Railway's usage-based charges. How do they achieve this model while still being flexible in their resource consumption?
My question is whether there's a technique or operator that enables a Kubernetes cluster to scale dynamically according to actual resource usage without needing to set requests, but perhaps only limits. I'm concerned that relying solely on limits could cause problems: without requests, the scheduler has nothing to bin-pack on, so pods could be packed too tightly and create node pressure. If I have misunderstood anything about how Railway operates, please let me know. I'd also appreciate links to any open-source solutions that could help replicate this kind of scaling capability. Thanks!
3 Answers
From what I know, Railway doesn't necessarily profit the way you might think! But they typically let users choose how much RAM to allocate, and when you adjust that, a new pod is spun up with the new limits. For any serverless-style offering, they might use a sort of proxy that manages downstream pods, possibly with something like KEDA for HTTP-based scaling.
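To make the KEDA idea concrete, here's a rough sketch of what request-driven scaling could look like. This is not Railway's setup, just an illustration: the workload name, Prometheus address, metric name, and threshold are all hypothetical, and KEDA also ships a dedicated HTTP add-on if you want to scale directly on in-flight requests rather than a Prometheus query.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: my-app               # hypothetical Deployment to scale
  minReplicaCount: 0           # scale to zero when idle (the "serverless" part)
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring:9090   # assumed endpoint
        query: sum(rate(http_requests_total{app="my-app"}[2m]))
        threshold: "50"        # add a replica per ~50 req/s
```

With `minReplicaCount: 0`, KEDA deactivates the Deployment entirely when the query stays at zero, which is roughly the proxy-manages-downstream-pods behavior described above.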
They actually don’t operate on Kubernetes. I remember seeing that somewhere in their blog; just don't have the link handy right now.
Yeah, a quick check confirms they often mention opting out of K8s for their infrastructure!
Platforms like Railway rely on aggressive overcommitment, detailed cgroup metrics, and efficient node autoscaling. Instead of billing users for what they request, they charge for the CPU and memory actually consumed. To approximate that in Kubernetes, combine the Vertical Pod Autoscaler (VPA) in recommendation-only (`updateMode: Off`) or `Auto` mode with Karpenter or the Cluster Autoscaler, and don't forget a custom metering system that tracks real usage rather than what's reserved.
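A minimal VPA manifest for the approach above might look like this. The workload name and bounds are hypothetical placeholders; in `Auto` mode the VPA updater evicts pods to apply new requests, while `Off` only publishes recommendations you can act on yourself.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa             # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app               # hypothetical workload
  updatePolicy:
    updateMode: "Auto"         # or "Off" for recommendation-only
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:            # floor so recommendations never starve the pod
          cpu: 50m
          memory: 64Mi
        maxAllowed:            # ceiling to cap overcommitment risk
          cpu: "2"
          memory: 4Gi
```

Pairing this with Karpenter means nodes get provisioned or consolidated as the VPA-adjusted requests shift, which is what keeps the overcommitment economical.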

Good point! It looks like they charge for reserved resources rather than just usage. For example, they bill about $10 per GB of RAM and $20 per vCPU per month, which might give them better profit margins compared to other platforms.
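To show how metering on cgroup data could feed those rates, here's a small sketch. The cgroup v2 file formats (`cpu.stat` in microseconds, `memory.current` in bytes) are real; the sampling window, proration, and per-unit prices are illustrative assumptions, not Railway's actual billing logic.

```python
# Hypothetical usage-metering sketch: parse cgroup v2 stats, then price the
# average usage over a sample window at assumed monthly rates.

def parse_cpu_stat(text: str) -> dict:
    """Parse cgroup v2 cpu.stat ('key value' per line) into an int dict."""
    return {k: int(v) for k, v in
            (line.split() for line in text.strip().splitlines())}

def metered_cost(cpu_usec_delta: int, mem_bytes: float, hours: float,
                 usd_per_vcpu_month: float = 20.0,
                 usd_per_gb_month: float = 10.0) -> float:
    """Price one sample window: average vCPUs and GB held, prorated monthly."""
    HOURS_PER_MONTH = 730  # common billing convention
    # usage_usec accumulated over the window -> average vCPUs in that window
    avg_vcpus = cpu_usec_delta / (hours * 3600 * 1_000_000)
    gb = mem_bytes / 2**30
    monthly = avg_vcpus * usd_per_vcpu_month + gb * usd_per_gb_month
    return monthly * hours / HOURS_PER_MONTH

# Example: 7200 vCPU-seconds consumed over a 2-hour window = 1 vCPU average,
# while holding 2 GiB of memory.
sample = "usage_usec 7200000000\nuser_usec 6000000000\nsystem_usec 1200000000"
stats = parse_cpu_stat(sample)
cost = metered_cost(stats["usage_usec"], 2 * 2**30, hours=2.0)
```

The point is that billing keys off measured consumption per window, so idle reservations cost the platform (not the user), which is exactly why aggressive overcommitment has to back it.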