How to Safely Migrate a K8s Cluster and Improve Data Redundancy?

May 23, 2025

Asked By CuriousCoder42 On May 23, 2025

We're dealing with a pretty old K8s cluster made up of five physical servers (one master and four worker nodes). This cluster, while labeled as a development setup, actually runs crucial applications like a password manager, Nextcloud, and helpdesk tools without any backup solutions in place. The persistent volumes (PVs) for these apps were configured using OpenEBS Hostpath, which means they are bound to the nodes where they were created.

Looking to improve our situation, we're thinking about migrating these volumes to an NFS setup to prevent data loss if a node goes down. We also need to implement proper RAID (at least RAID-1) on these servers. However, we're constrained by resources—we can't afford any spare servers at the moment.

Our main goals are to:
- Migrate PVs to NFS
- Back up critical data using a tool like Velero
- Reinstall the servers one at a time with a proper RAID configuration, starting with the master node

How should we kick things off with a system that currently doesn't have RAID-1? We're hoping to transition everything gradually while minimizing downtime for users of these internal applications. Any insights would be greatly appreciated!

3 Answers

Answered By BackupBuddy99 On May 26, 2025

The main risk you face is losing data, because nothing is redundant or backed up. Your pods are tied to their PVs and those PVs are tied to specific nodes, so you can't drain a node without dealing with its PVs first. For now, focus on moving the PVs to NFS so that no data is lost during the migration. Your immediate priority should be keeping that data safe.
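To see which PVs are pinned to which node before touching anything, one option is to look at the `nodeAffinity` on each PV. This is a minimal sketch, assuming the PV list has been exported as JSON (for example with `kubectl get pv -o json`) and that the hostpath PVs carry a `kubernetes.io/hostname` node affinity, which is how node-local PVs are normally pinned. The PV names, node names, and NFS server in the sample are made up for illustration.

```python
from collections import defaultdict

HOSTNAME_KEY = "kubernetes.io/hostname"


def pvs_by_node(pv_list):
    """Group PV names by the node(s) named in their required nodeAffinity."""
    pinned = defaultdict(list)
    for pv in pv_list.get("items", []):
        name = pv["metadata"]["name"]
        affinity = (pv.get("spec") or {}).get("nodeAffinity") or {}
        terms = (affinity.get("required") or {}).get("nodeSelectorTerms") or []
        for term in terms:
            for expr in term.get("matchExpressions") or []:
                # Only hostname-based "In" rules pin a PV to named nodes.
                if expr.get("key") == HOSTNAME_KEY and expr.get("operator") == "In":
                    for node in expr.get("values") or []:
                        pinned[node].append(name)
    return dict(pinned)


def hostname_affinity(node):
    """Build the nodeAffinity stanza a node-local PV would carry (sample data helper)."""
    return {"required": {"nodeSelectorTerms": [{"matchExpressions": [
        {"key": HOSTNAME_KEY, "operator": "In", "values": [node]}]}]}}


# Hypothetical sample in the shape of `kubectl get pv -o json` output.
SAMPLE_PVS = {
    "items": [
        {"metadata": {"name": "pv-nextcloud"},
         "spec": {"nodeAffinity": hostname_affinity("worker-1")}},
        {"metadata": {"name": "pv-helpdesk"},
         "spec": {"nodeAffinity": hostname_affinity("worker-2")}},
        # An NFS-backed PV has no node affinity, so it is not reported.
        {"metadata": {"name": "pv-shared"},
         "spec": {"nfs": {"server": "nfs.example.internal", "path": "/export"}}},
    ]
}

if __name__ == "__main__":
    for node, pvs in sorted(pvs_by_node(SAMPLE_PVS).items()):
        print(node, pvs)
```

With real data you would load the exported file with `json.load` and pass the result to `pvs_by_node`. Any node that shows up in the result still holds data that has to be moved before that node can be reinstalled.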

Answered By TechSavvyDude On May 25, 2025

Start by decommissioning one worker node and reinstalling it with RAID set up properly. Once that’s done, you can promote it to master, remove the old master, and bring that machine back as a worker. After that, work through the other nodes in the same way.

Ideally, a fresh cluster using a solution like Talos could solve a lot of issues in the long run, but you first need to move the PVs to avoid any pod disruptions before you can drain nodes.
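Before draining a given node, it can help to list the pods on it that mount a PersistentVolumeClaim, since those are the ones whose data may be stuck there. This is a rough sketch, assuming the pod list has been exported as JSON (for example with `kubectl get pods --all-namespaces -o json`). It only reports candidates: a claim that already lives on NFS is not a blocker, so each result still needs a manual check. All names in the sample are hypothetical.

```python
def pods_with_claims_on_node(pod_list, node_name):
    """Return (namespace, pod, claim) for every PVC mounted by a pod on node_name."""
    found = []
    for pod in pod_list.get("items", []):
        spec = pod.get("spec") or {}
        if spec.get("nodeName") != node_name:
            continue
        for volume in spec.get("volumes") or []:
            claim = volume.get("persistentVolumeClaim")
            if claim:
                meta = pod["metadata"]
                found.append((meta["namespace"], meta["name"], claim["claimName"]))
    return found


# Hypothetical sample in the shape of `kubectl get pods -A -o json` output.
SAMPLE_PODS = {
    "items": [
        {"metadata": {"namespace": "nextcloud", "name": "nextcloud-0"},
         "spec": {"nodeName": "worker-1",
                  "volumes": [{"name": "data",
                               "persistentVolumeClaim": {"claimName": "nextcloud-data"}}]}},
        {"metadata": {"namespace": "helpdesk", "name": "helpdesk-0"},
         "spec": {"nodeName": "worker-2",
                  "volumes": [{"name": "data",
                               "persistentVolumeClaim": {"claimName": "helpdesk-data"}}]}},
        # A pod with only a ConfigMap volume holds no persistent data.
        {"metadata": {"namespace": "default", "name": "web-0"},
         "spec": {"nodeName": "worker-1",
                  "volumes": [{"name": "cfg", "configMap": {"name": "web-cfg"}}]}},
    ]
}

if __name__ == "__main__":
    for namespace, pod, claim in pods_with_claims_on_node(SAMPLE_PODS, "worker-1"):
        print(f"{namespace}/{pod} mounts claim {claim}")
```

An empty result for a node suggests no pod there depends on a claim, which is the state you want before draining it.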

Answered By ClusterMaven On May 23, 2025

Your cluster situation looks tough! Just remember that with K8s, storage has to be set up in a way that allows for high availability (HA). You might not need RAID-1 on all drives, but it helps for boot drives. I recommend looking into solutions like Rook Ceph or Longhorn, which distribute data across multiple nodes, rather than sticking to just NFS. They can offer better resilience and performance.
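Replicated storage costs capacity, which matters on a four-worker cluster with no spare hardware. As a back-of-the-envelope sketch, assuming every volume is stored as a fixed number of full copies, each on a different node, and ignoring filesystem and metadata overhead, usable space is roughly the raw total divided by the replica count. The disk sizes below are invented for illustration.

```python
def usable_capacity_gib(raw_per_node_gib, replicas):
    """Estimate usable capacity when each volume is kept as `replicas` full copies.

    Assumes one copy per node at most and ignores all storage overhead.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if replicas > len(raw_per_node_gib):
        raise ValueError("cannot place more replicas than there are nodes")
    return sum(raw_per_node_gib) / replicas


if __name__ == "__main__":
    # Hypothetical cluster: four workers with 500 GiB of data disk each.
    workers = [500, 500, 500, 500]
    for replicas in (1, 2, 3):
        estimate = usable_capacity_gib(workers, replicas)
        print(f"{replicas} replica(s): about {estimate:.0f} GiB usable")
```

A higher replica count survives more node failures but leaves less room for data, so it is worth estimating this before settling on a replica count.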
