Troubleshooting My k3s Cluster Setup: Volume Issues and SQLite Corruption

Asked By CuriousCoder92 On

I'm new to this, so please bear with me! I've set up a small k3s cluster on 3 identical Minisforum MS-A2 machines. Each has 2 Samsung 990 Pro drives (a 1TB drive for Proxmox and a 2TB drive as ZFS storage). All nodes are connected via a single 2.5GbE network switch. My configuration includes three control plane nodes (etcd), three worker nodes, and three Longhorn nodes that back up to a NAS.

However, I've run into some problems. During Longhorn backups, some volumes go into a degraded state and then recover. This degradation also occurs at other times, but it seems worse during backups. My SQLite databases are also often found corrupted at the start of the day. Additionally, I frequently see pod restarts due to API timeouts.

I suspect there is some fundamental issue at play here. Could it be that my 2.5GbE network is being saturated? Any advice would be appreciated!
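As a rough sanity check on the saturation question, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: the function names, the 0.94 efficiency factor, and the assumption that each write is copied to two replicas on other nodes over the same link are all illustrative, not measured values from this cluster.

```python
# Back-of-the-envelope throughput for a shared network link.
# All names and the efficiency factor are illustrative assumptions.

def link_capacity_mb_s(gbit_per_s: float, efficiency: float = 0.94) -> float:
    """Approximate payload throughput in MB/s (decimal megabytes)."""
    return gbit_per_s * 1000 / 8 * efficiency


def write_ceiling_mb_s(capacity_mb_s: float, remote_replicas: int) -> float:
    """Upper bound on write throughput when each write is copied to
    `remote_replicas` other nodes through the same outbound link."""
    if remote_replicas < 1:
        raise ValueError("remote_replicas must be at least 1")
    return capacity_mb_s / remote_replicas


raw = link_capacity_mb_s(2.5, efficiency=1.0)  # line rate, no protocol overhead
usable = link_capacity_mb_s(2.5)               # with the assumed overhead
ceiling = write_ceiling_mb_s(usable, remote_replicas=2)

print(f"raw: {raw:.1f} MB/s, usable: {usable:.1f} MB/s, "
      f"write ceiling with 2 remote replicas: {ceiling:.1f} MB/s")
```

A 2.5 Gb/s link tops out at roughly 312 MB/s before overhead, which is well below what the NVMe drives can deliver locally. With backups, replica traffic, and etcd all sharing that one link, saturation is plausible, though only a measurement between nodes during a backup window (for example with iperf3) would confirm it.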

3 Answers

Answered By TechExplorer47 On

It sounds like you're running 3 k3s VMs on each host, which totals 9 VMs, right? You might want to reconsider that setup. Running fewer VMs could reduce overhead without sacrificing host-level availability.

Regarding the SQLite issues, if you're using NFS for your RWX volumes, that could be the culprit. SQLite relies on file locking, which is often unreliable over NFS, so that may be what's causing your corruption.
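To find out which database files are actually damaged, and whether any of them use WAL mode (which the SQLite documentation says does not work over a network filesystem), a small check along these lines may help. It is a minimal sketch using Python's standard `sqlite3` module; the function name `inspect_sqlite` and the example path are hypothetical, and it should ideally be run against a copy of the file while the owning pod is stopped.

```python
import sqlite3


def inspect_sqlite(path: str) -> dict:
    """Return the journal mode and integrity_check results for a database.

    The file is opened read-only so the check itself does not modify it.
    """
    result = {"journal_mode": None, "integrity": []}
    try:
        conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    except sqlite3.Error as exc:
        result["integrity"] = [f"cannot open: {exc}"]
        return result
    try:
        result["journal_mode"] = conn.execute("PRAGMA journal_mode;").fetchone()[0]
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
        result["integrity"] = [row[0] for row in rows]
    except sqlite3.DatabaseError as exc:
        # Badly damaged files can fail before integrity_check reports anything.
        result["integrity"] = [f"error: {exc}"]
    finally:
        conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical path; replace with a copy of the real database file.
    print(inspect_sqlite("/tmp/app-copy.db"))
```

A healthy database reports a single `ok` row from `integrity_check`; anything else lists the problems found. If the journal mode comes back as `wal` on a volume served over NFS, that alone is a likely source of corruption.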

Answered By VirtualLabTech On

Is this a cluster for workloads that need to be reliable, or just a hobby project? If it’s the former, consider removing the virtualization layer, since every extra layer adds overhead and strains your resources. If it's just for fun, I see the appeal of using virtual machines for flexibility.

Answered By BareMetalFan On

I suggest you start by removing the virtualization layer, which makes it easy to overcommit resources. Running on bare metal often yields better performance.

By the way, are the SQLite databases on Longhorn RWX volumes? That’s worth checking, but the root problem seems to be deeper and to affect volume stability overall.
