Hey everyone! My team manages several Kubernetes clusters that different internal teams use. We've set up strict policies to control permissions, such as blocking the creation of Custom Resource Definitions (CRDs), disallowing root containers, and preventing DaemonSet installation. However, some teams need elevated permissions to deploy their applications, particularly for creating ClusterRoles and ClusterRoleBindings and installing their own CRDs. I'm looking for advice on how to strike a balance between maintaining security in our clusters and meeting these permissions requests. Are there any best practices or insights from those who have navigated this issue?
4 Answers
You might want to look into how you're managing isolation and permissions. Here are a few ideas:
1. Consider creating virtual clusters using vCluster. This way, each team has its own API server and control plane, allowing them to manage CRDs without conflicts.
2. A graduated permissions model might help. For example, you can implement a request system with admission webhooks using OPA Gatekeeper or Kyverno, making permissions time-bound and managed through an approval workflow.
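To make point 2 concrete, here's a rough sketch of a Kyverno ClusterPolicy that blocks CRD creation for everyone except an approved group (the policy name and the `platform-admins` group are placeholders; the approval workflow that puts people into that group is up to you):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-crd-creation   # placeholder name
spec:
  validationFailureAction: Enforce
  background: false             # subject-based exclusions need admission-time user info
  rules:
    - name: block-crds
      match:
        any:
          - resources:
              kinds:
                - CustomResourceDefinition
      exclude:
        any:
          - subjects:
              - kind: Group
                name: platform-admins   # hypothetical approver group
      validate:
        message: "CRD creation is restricted; open a request with the platform team."
        deny: {}                # deny unconditionally for everyone not excluded
```

You'd pair this with whatever ticketing/approval process you already use; the "time-bound" part can be as simple as removing the user from the group again after the change lands.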
3. Another approach could be distilling what CRDs and roles each team really needs and managing those centrally with GitOps, where teams can propose changes. Just remember to enforce resource quotas and network policies to prevent unwanted cross-tenant communication.
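For the quota and network-policy side of point 3, a minimal per-tenant baseline might look like this (namespace name and limits are just examples; the NetworkPolicy allows ingress only from pods in the same namespace):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a        # example tenant namespace
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    pods: "50"
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: team-a
spec:
  podSelector: {}          # applies to all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # same-namespace pods only; cross-tenant traffic is dropped
```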
As for root containers, I recommend not allowing them unless absolutely necessary, and if they are, consider isolating that risk in a separate node pool.
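If you do carve out a separate node pool for root workloads, taints plus tolerations keep everything else off those nodes. A sketch, assuming the pool's nodes are labeled `pool=privileged` and tainted with `privileged=true:NoSchedule` (the label, taint key, and image are all placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: needs-root
  namespace: team-a
spec:
  nodeSelector:
    pool: privileged           # only land on the isolated pool
  tolerations:
    - key: privileged
      operator: Equal
      value: "true"
      effect: NoSchedule       # tolerate the pool's taint
  containers:
    - name: app
      image: registry.example.com/needs-root:latest   # hypothetical image
      securityContext:
        runAsUser: 0           # the one place root is allowed
```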
Honestly, the need to run as root often comes from overlooking other options. You might find that most workloads don't actually require root permissions, except for specific cases like the CNI.
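Worth pushing teams toward a non-root baseline and seeing what actually breaks. A typical restrictive securityContext looks like this (UID and image are arbitrary examples):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  securityContext:
    runAsNonRoot: true         # kubelet rejects images that resolve to UID 0
    runAsUser: 10001           # arbitrary non-root UID
  containers:
    - name: app
      image: registry.example.com/app:latest   # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

In practice most application workloads run fine under this; the exceptions tend to be infrastructure components like the CNI.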
For CRDs and cluster-wide objects, consider that if a cluster is shared among teams, allowing such access might not be feasible. It's about weighing the security implications against the team's actual needs.
It’s best if the cluster administrators handle CRDs using GitOps practices. This way, tenants can make pull requests for changes, which helps manage potential conflicts.
For roles and bindings, if you don't have a self-service layer in place yet, you can manage those centrally too. Regarding root containers, I'd usually avoid allowing them altogether, but if you have to, limit the risk by isolating those workloads to specific nodes or pools. You can also disallow running as root while still granting the specific kernel capabilities a workload actually needs.
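That last point, non-root but with specific capabilities, is just a container-level securityContext fragment; for instance, granting only the ability to bind low ports:

```yaml
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
    add: ["NET_BIND_SERVICE"]   # bind ports below 1024 without being root
```

This covers a surprising number of "we need root" requests, since binding port 80/443 is often the real requirement.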
There's no one-size-fits-all answer here. I'd advise the platform team to sit down with the requesting team and understand their needs better. Often the request is really about something the platform doesn't allow yet, and it can be satisfied with better permissions management rather than blanket cluster-wide access.
Also, look into whether these installations truly need cluster-wide permissions or whether they can work within a namespaced approach instead. In some cases you can meet their needs with temporary clusters that they fully control, letting them develop solutions without compromising security elsewhere.
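For the namespaced approach, standard RBAC already covers it: a Role plus RoleBinding scoped to the team's namespace instead of a ClusterRole. A sketch (names, namespace, and the `team-a` group are placeholders):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-deployer
  namespace: team-a
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-deployer
  namespace: team-a
subjects:
  - kind: Group
    name: team-a                          # hypothetical team group
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-deployer
  apiGroup: rbac.authorization.k8s.io
```

Teams get full control over their own workloads while cluster-scoped objects stay with the platform team.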
Capsule could be a great alternative to explore instead of waiting for upstream multi-tenancy features to mature.
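With Capsule, tenancy is declared through a Tenant custom resource; a minimal example might look like this (tenant and owner names are placeholders, and the quota field simply caps how many namespaces the tenant can create):

```yaml
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: team-a
spec:
  owners:
    - name: team-a            # hypothetical group from your identity provider
      kind: Group
  namespaceOptions:
    quota: 5                  # at most 5 namespaces for this tenant
```

The owners then self-service namespaces within the tenant boundary while the platform team keeps control of cluster-scoped objects.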