I'm trying to figure out the best way to handle Persistent Volumes (PVs) in Kubernetes with a setup that frequently redeploys infrastructure. My goal is a Container Storage Interface (CSI) driver that provisions storage if it doesn't exist and reuses it if it already does. I'm deploying several cloudnativepg `Cluster` instances, each needing multiple images (for data and WAL) created across three nodes. Managing those images by hand is impractical, and I've only been working with Kubernetes for about eight months and haven't dealt with production workloads yet.
My current setup includes:
- Control planes and worker nodes via Talos and Terraform, operating on Proxmox.
- A Ceph cluster.
- Rook managing the external Ceph cluster.
- GitOps with ArgoCD.
I want the flexibility to destroy and redeploy the cluster without relying on backups, since the persistent data resides on Ceph, not in the infrastructure. I initially thought this wouldn't fall under backup or disaster recovery, but I've realized it does come down to restoring manifests in the right order (PVs before PVCs).
I initially considered solutions like proxmox-csi or local-path-provisioner that could potentially meet my needs of creating or using pre-provisioned storage. However, I also pondered developing my own operator for better management since I've been learning Go and Kubernetes operators.
Update: I've since realized I was approaching the issue incorrectly. If I restore the PV before the PVC, I don't actually need to set it as a `staticVolume`, which gives me a working solution with far less manual intervention than I expected. I'd still appreciate insights or workflows, particularly if there's a more streamlined way to think about PV management. Thanks for any advice!
3 Answers
I totally get your frustration. Automating PV management in a continuous deployment scenario can be complex. Have you looked at the order in which your manifests are restored? The sequence matters: as you've noticed, the PV has to be restored before the PVC. That's the key to getting your storage to bind properly without the manual overhead.
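To make the ordering point concrete, here is a small Go sketch that sorts a set of manifests so PVs are applied before PVCs, and both before the workloads that use them. The `Manifest` type, the priority table, and `OrderForRestore` are all hypothetical names for illustration, not part of any Kubernetes library.

```go
package main

import (
	"fmt"
	"sort"
)

// Manifest is a hypothetical, minimal stand-in for a parsed Kubernetes object.
type Manifest struct {
	Kind string
	Name string
}

// kindPriority lists kinds that must exist before everything else.
// Lower numbers are applied first. The exact table is an assumption.
var kindPriority = map[string]int{
	"Namespace":             0,
	"StorageClass":          1,
	"PersistentVolume":      2,
	"PersistentVolumeClaim": 3,
}

// defaultPriority is used for every kind not listed above (e.g. Cluster).
const defaultPriority = 100

func priority(kind string) int {
	if p, ok := kindPriority[kind]; ok {
		return p
	}
	return defaultPriority
}

// OrderForRestore returns a sorted copy of the input. The sort is stable,
// so manifests of the same kind keep their original relative order.
func OrderForRestore(in []Manifest) []Manifest {
	out := make([]Manifest, len(in))
	copy(out, in)
	sort.SliceStable(out, func(i, j int) bool {
		return priority(out[i].Kind) < priority(out[j].Kind)
	})
	return out
}

func main() {
	manifests := []Manifest{
		{Kind: "Cluster", Name: "pg-main"},
		{Kind: "PersistentVolumeClaim", Name: "pg-main-1"},
		{Kind: "PersistentVolume", Name: "pv-pg-main-1"},
	}
	for _, m := range OrderForRestore(manifests) {
		fmt.Printf("%s/%s\n", m.Kind, m.Name)
	}
}
```

Since you're on ArgoCD, the same ordering is normally expressed declaratively with the `argocd.argoproj.io/sync-wave` annotation (a lower wave on the PVs than on the PVCs) rather than with a script.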
You might have some luck with a custom operator as you mentioned, especially if you want to extend logic or functionality that existing tools don't provide. Learning Go will definitely pay off in the long run for building out those custom needs.
It sounds like you've made some progress, but you may want to check out the external-reprovisioner tool. It offers predictable PV name handling and helps in scenarios like yours. If you tune the provisioner's settings right, you could automate some of this process without needing to create images manually every time.
Yeah, I've had success with that for OpenEBS ZFS too. It's pretty handy for streamlining your storage provisioning.
After some testing, I figured out it works with ceph-csi/rook-ceph as well, which simplified what I was doing. Worth a shot!
Agreed! I had to learn that the hard way too. Scripting your final deployment steps can make a huge difference.