How to Safely Migrate a K8s Cluster and Improve Data Redundancy?

Asked By CuriousCoder42 On

We're dealing with a pretty old K8s cluster made up of five physical servers (one master and four worker nodes). This cluster, while labeled as a development setup, actually runs crucial applications like a password manager, Nextcloud, and helpdesk tools without any backup solutions in place. The persistent volumes (PVs) for these apps were configured using OpenEBS Hostpath, which means they are bound to the nodes where they were created.

Looking to improve our situation, we're thinking about migrating these volumes to an NFS setup to prevent data loss if a node goes down. We also need to implement proper RAID (at least RAID-1) on these servers. However, we're constrained by resources—we can't afford any spare servers at the moment.

Our main goals are to:
- Migrate PVs to NFS
- Back up critical data using a tool like Velero
- Reinstall the servers one at a time with a proper RAID configuration, starting from the master node

How should we kick things off with a system that currently doesn't have RAID-1? We're hoping to transition everything gradually while minimizing downtime for users of these internal applications. Any insights would be greatly appreciated!

3 Answers

Answered By BackupBuddy99 On

The main risk you face is losing data because nothing is redundant. Each Hostpath PV lives on the disk of a single node, so the pods that use it can only run on that node: if you drain it, those pods stay Pending, and if you reinstall it, the data goes with it. That means you can't drain a node without handling its PVs first. For now, focus on moving the PVs to NFS before you touch any node. Your immediate priority should be keeping that data safe.
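To see which claims are pinned to which node before draining anything, a small script can group PVs by their node affinity. This is a minimal sketch, assuming the Hostpath PVs carry a required node affinity on the `kubernetes.io/hostname` label and that `kubectl` is configured for the cluster; the function names are made up for illustration.

```python
import json
import subprocess
from collections import defaultdict

# Assumption: node-bound PVs are pinned via this well-known node label.
HOSTNAME_KEY = "kubernetes.io/hostname"


def pinned_nodes(pv):
    """Return the hostnames a PV is restricted to by required node affinity."""
    spec = pv.get("spec") or {}
    required = (spec.get("nodeAffinity") or {}).get("required") or {}
    nodes = []
    for term in required.get("nodeSelectorTerms", []):
        for expr in term.get("matchExpressions", []):
            if expr.get("key") == HOSTNAME_KEY and expr.get("operator") == "In":
                nodes.extend(expr.get("values", []))
    return nodes


def claims_by_node(pv_list):
    """Map each node name to the 'namespace/claim' names bound to PVs on it."""
    result = defaultdict(list)
    for pv in pv_list.get("items", []):
        claim = (pv.get("spec") or {}).get("claimRef") or {}
        label = f"{claim.get('namespace', '?')}/{claim.get('name', '?')}"
        for node in pinned_nodes(pv):
            result[node].append(label)
    return dict(result)


def main():
    """Query the live cluster and print the node-to-claims mapping."""
    out = subprocess.run(
        ["kubectl", "get", "pv", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for node, claims in sorted(claims_by_node(json.loads(out)).items()):
        print(f"{node}: {', '.join(sorted(claims))}")
```

Calling `main()` from a machine with cluster access prints one line per node. Any node that shows up in that list still holds data that has to be copied off before it is drained or reinstalled.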

Answered By TechSavvyDude On

Start by taking one worker node out of the cluster and reinstalling it with RAID set up properly. Once that’s done, make it the new master, then remove the old master and rebuild it as a worker. After that, work through the remaining nodes the same way.

Ideally, a fresh cluster using a solution like Talos could solve a lot of issues in the long run, but you first need to move the PVs to avoid any pod disruptions before you can drain nodes.
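One common way to move a PV is a one-off Job that mounts the old claim and a new NFS-backed claim side by side and copies the files across. The sketch below only builds the manifest; it assumes the target PVC already exists on an NFS storage class and that the application has been scaled to zero first. The function name, the claim names in the usage note, and the `busybox:1.36` image are illustrative assumptions, not anything taken from your cluster.

```python
import json


def copy_job_manifest(namespace, source_pvc, target_pvc, image="busybox:1.36"):
    """Build a batch/v1 Job that copies everything from source_pvc to target_pvc."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"copy-{source_pvc}", "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry automatically; inspect failures by hand
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "copy",
                            "image": image,
                            # "/src/." copies the directory contents, dotfiles included
                            "command": ["sh", "-c", "cp -a /src/. /dst/"],
                            "volumeMounts": [
                                {"name": "src", "mountPath": "/src", "readOnly": True},
                                {"name": "dst", "mountPath": "/dst"},
                            ],
                        }
                    ],
                    "volumes": [
                        {
                            "name": "src",
                            "persistentVolumeClaim": {
                                "claimName": source_pvc,
                                "readOnly": True,
                            },
                        },
                        {
                            "name": "dst",
                            "persistentVolumeClaim": {"claimName": target_pvc},
                        },
                    ],
                }
            },
        },
    }
```

Printing `json.dumps(copy_job_manifest("apps", "nextcloud-data", "nextcloud-data-nfs"), indent=2)` gives a manifest that can be piped to `kubectl apply -f -`. The scheduler should place the Job's pod on the node that holds the old volume, since the source PV is bound there. If the NFS export squashes root, `cp -a` may not be able to preserve file ownership, so check permissions on the new volume before pointing the application at it.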

Answered By ClusterMaven On

Your cluster situation looks tough! Just remember that with K8s, storage has to be set up for high availability (HA). You might not need RAID-1 on all drives, but it helps for boot drives. I recommend looking into Rook Ceph or Longhorn, which replicate data across multiple nodes, rather than sticking to just NFS. A single NFS server is still a single point of failure, so replicated storage offers better resilience and, depending on the workload, better performance.
