Understanding NFS Behavior Under Heavy Load: How to Manage Write Queue Growth?

Asked By TechNinja42 On

I'm trying to understand some behavior I'm seeing with NFS under heavy write load. My setup: a Linux client with NVMe storage connected to a Synology 1221+ NAS over a 1 Gbps link. I've tested both NFSv3 and NFSv4.1 with rsize/wsize set to 1M, hard mounts, and noatime, and I've also tried `nconnect=4` to spread the load across multiple TCP connections.

During heavy writes (for example with rsync), throughput reaches around 110-115 MB/s, which is expected for a 1 Gbps link. The issue: as I push the load, `nfsiostat` shows the average queue time on the client growing significantly, reaching 30-50 seconds, even though the TCP metrics look fine: the RTT is low and the NAS is mostly idle.

I've tried limiting the transfer speed with rsync's `--bwlimit`, which stabilizes the queue, but what I really need is for the NFS mount to behave more like a slow disk, without application-specific limits. Is this behavior expected under these conditions, and are there better ways to apply backpressure on the NFS mount without changing global settings? I want to keep my other workloads untouched. Any insights?
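To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the queue wait is roughly the backlog divided by the drain rate, which is a simplification and not how `nfsiostat` computes its figures; the function names are made up for illustration.

```python
# Rough estimate only: assumes a request waits about as long as it takes
# to drain everything queued ahead of it at the observed throughput.

# Raw capacity of a 1 Gbps link in MB/s (decimal megabytes), before overhead.
LINK_MB_S = 1_000_000_000 / 8 / 1_000_000


def backlog_mb(throughput_mb_s: float, queue_wait_s: float) -> float:
    """Approximate amount of queued data (MB) implied by a given queue wait."""
    return throughput_mb_s * queue_wait_s


def drain_time_s(dirty_mb: float, throughput_mb_s: float) -> float:
    """Approximate time (s) to flush a backlog at a given throughput."""
    return dirty_mb / throughput_mb_s


if __name__ == "__main__":
    low = backlog_mb(110, 30)    # lower end reported in the question
    high = backlog_mb(115, 50)   # upper end reported in the question
    print(f"link capacity: {LINK_MB_S:.0f} MB/s")
    print(f"implied backlog: {low:.0f} to {high:.0f} MB")
```

Under that assumption, a 30-50 second queue at 110-115 MB/s corresponds to a few gigabytes of data waiting on the client, which an idle NAS and a low RTT would not reveal.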

3 Answers

Answered By NetMasterX On

Have you compared the `sync` and `async` mount options? With `sync` you lose some of the speed that async buffering gives you, but writes are flushed to the server before the call returns, which can help with stability. Also, just to throw it out there: if there are any switches in your network path, it'd be worth looking into the flow control settings on those interfaces.

Answered By ServerGuru81 On

It sounds like you've got a classic case of the Linux page cache just doing its job, but not in a way that suits your needs. The page cache is happily buffering dirty data far faster than the network can drain it, which is why you're seeing that queue growth. The tuning you applied with BDI (the kernel's per-device backing device info settings) is definitely a step in the right direction, as it tells the kernel to take the NFS mount's speed into account. If you need more control without touching global settings, consider looking at cgroups v2 to enforce I/O limits per application.
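For anyone wondering where the BDI settings live for an NFS mount, here is a minimal Python sketch. It assumes the mount's BDI directory is named after the `major:minor` device number shown in `/proc/self/mountinfo`, and that the `max_ratio` knob (a percentage cap on that device's share of the writeback cache) is available on your kernel. The mount point `/mnt/nas`, the sample mountinfo line, and the helper names are all hypothetical.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mountinfo line for illustration; real data comes from
# /proc/self/mountinfo on the client.
SAMPLE_MOUNTINFO = (
    "29 1 259:2 / / rw,relatime shared:1 - ext4 /dev/nvme0n1p2 rw\n"
    "87 29 0:53 / /mnt/nas rw,noatime shared:45 - nfs4 nas:/volume1/share rw,vers=4.1\n"
)


def bdi_dir_for_mount(mountinfo_text: str, mount_point: str) -> Optional[str]:
    """Return the assumed /sys/class/bdi/<major:minor> path for a mount point.

    In mountinfo, field 3 is major:minor and field 5 is the mount point.
    Mount points containing spaces are escaped there (e.g. \\040) and are
    not handled by this sketch.
    """
    found = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[4] == mount_point:
            found = fields[2]  # keep the last match (the topmost mount)
    return f"/sys/class/bdi/{found}" if found else None


def set_max_ratio(bdi_dir: str, percent: int) -> None:
    """Write a max_ratio percentage for one BDI (needs root on a real system)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    Path(bdi_dir, "max_ratio").write_text(f"{percent}\n")


if __name__ == "__main__":
    print(bdi_dir_for_mount(SAMPLE_MOUNTINFO, "/mnt/nas"))
```

On a real client you would pass the contents of `/proc/self/mountinfo` instead of the sample text and then call `set_max_ratio` with a small value for that one mount. Because the setting is per device, other mounts and workloads keep their normal dirty-page limits. The right value depends on RAM size and link speed, so treat any number as a starting point to measure against.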

Answered By DataWhiz53 On

Thanks for sharing your solution! It's great to hear that adjusting the BDI settings worked for you. If you're open to upgrading your hardware, consider moving to a faster link such as 10 GbE or even 25 GbE. That could alleviate a lot of the issues you're facing without needing to dive deep into software tweaks.
