Why won’t my Kubernetes pod use more than 50% of the node’s CPU?

Asked By TechyBunny92 On

I'm running an application that deploys a pod on an m5.large instance using a BentoML image for a text classification model. The image is configured with 2 workers, and the pod's memory usage sits around 2.7Gi. Although I've configured resource requests and limits for Guaranteed QoS, the pod is capped at roughly 50% CPU usage. I even tested a larger instance type, which increased CPU usage slightly, but it still wouldn't exceed 50%. Notably, adding a second pod to the same node lets that pod use the remaining CPU. Why is a single pod capped like this? I'm still new to Kubernetes, so any insights would be appreciated!
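For reference, here's a simplified sketch of what I mean by Guaranteed QoS (the pod name, image, and exact values are illustrative, not my actual manifest):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bento-classifier             # illustrative name
spec:
  containers:
    - name: classifier
      image: my-bento-image:latest   # placeholder for my actual image
      resources:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "2"                   # requests == limits => Guaranteed QoS
          memory: 4Gi
```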

3 Answers

Answered By OpsExpert21 On

This scenario is a good candidate for adding metrics and observability. This guide to Prometheus queries can help you confirm whether your pod is actually being throttled or hitting its resource limits: [Prometheus Queries for Kubernetes](https://signoz.io/guides/prometheus-queries-to-get-cpu-and-memory-usage-in-kubernetes-pods/#how-to-query-cpu-usage-in-kubernetes-pods-with-prometheus).
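For instance, these cAdvisor-based queries (a sketch; the `pod=~"bento.*"` selector is a placeholder for your actual pod name) show how often the container is throttled and how much CPU it really consumes:

```promql
# Fraction of CFS scheduling periods in which the container was throttled
sum by (pod) (rate(container_cpu_cfs_throttled_periods_total{pod=~"bento.*"}[5m]))
  /
sum by (pod) (rate(container_cpu_cfs_periods_total{pod=~"bento.*"}[5m]))

# CPU cores actually consumed by the pod
sum by (pod) (rate(container_cpu_usage_seconds_total{pod=~"bento.*"}[5m]))
```

If the throttling ratio stays near zero while usage plateaus, the cap is inside the application rather than in Kubernetes.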

Answered By CuriousCoder77 On

It sounds like your BentoML setup may be limited in how it handles threading. CPU-bound Python code is constrained by the GIL, so a single worker process rarely uses much more than one core; if your serving path is effectively single-threaded, that would explain why you can't utilize more of the node. Try increasing the number of workers and see whether total CPU usage scales with it.
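If you're on BentoML 1.x, I believe the worker count can be raised through the configuration file, along the lines of the sketch below; the exact schema has changed between releases, so double-check the docs for your version:

```yaml
# bentoml_configuration.yaml, passed to the container via the
# BENTOML_CONFIG environment variable (BentoML 1.x; verify the
# schema for your release)
api_server:
  workers: 4   # roughly one worker process per core you want to drive
```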

DataDude42 -

It looks like BentoML has specific settings for the number of workers; check their docs on handling parallel requests. How many workers did you set up?

CloudNinja88 -

I've noticed something similar. An m5.large has 2 vCPUs (2000m), so a ceiling around 1100m means the pod is only using a bit over one core. That points to the application not scaling across cores rather than Kubernetes capping it. Maybe try a different setup.

Answered By KubernetesWizard53 On

Just a heads up: CPU limits cap how much CPU your pod can use, even when the node has spare capacity. Have you tried removing the limits altogether and keeping only the requests? That should eliminate any CFS throttling you're running into.
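A minimal sketch of a requests-only resources block (values illustrative; note that dropping the CPU limit moves the pod from Guaranteed to Burstable QoS):

```yaml
resources:
  requests:
    cpu: "1"        # what the scheduler reserves for the pod; illustrative
    memory: 4Gi
  limits:
    memory: 4Gi     # keep a memory limit to avoid OOM surprises
  # no cpu limit: the container may burst into any idle CPU on the node
```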

ResourceGuru19 -

Right! Requests determine how CPU time is shared among pods on the same node (and what the scheduler reserves); limits are a hard cap. Dropping the limits lets your pod soak up whatever CPU is otherwise idle.
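To confirm what you ended up with, you can check the pod's QoS class and its live usage with standard kubectl (`<pod-name>` is a placeholder; `kubectl top` requires metrics-server to be installed):

```bash
# QoS class Kubernetes assigned: Guaranteed, Burstable, or BestEffort
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'

# Current CPU and memory consumption (requires metrics-server)
kubectl top pod <pod-name>
```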
