Hi everyone! Our research group recently got a NAS with 34 disks of 20TB each, totaling 600TB of storage. We're looking to consolidate all our research data, which is currently spread across smaller servers (around 2TB each), and we also want to store service data for Longhorn deployed via Kubernetes. This is my first time working with this much capacity, so I'm curious about the best file system for this setup. I've seen some discussions suggesting ext4 might not be the best choice anymore. We have a MegaRAID 9560-16i 8GB RAID card in place and have currently set up two RAID6 virtual drives of 272TB each, but I'm open to changing the RAID configuration if necessary. For reference, we're using an AMD EPYC 7662 64-core processor and have 512GB of DDR4 RAM. Any advice would be greatly appreciated!
4 Answers
At this scale, you might also want to look into a SAN setup, utilizing iSCSI or Fibre Channel for increased performance and management capabilities. It really depends on your use case.
That NAS is impressive! Just remember to prioritize backup solutions, especially with that much data on the line.
Given your setup, ZFS could be a strong choice. It's well-suited to large storage capacities, especially in a NAS environment. Keep in mind that ZFS performs best with plenty of RAM and, optionally, SSDs for caching (L2ARC for reads, an SLOG for synchronous writes); with 512GB of RAM you're in a good position.

Also note that ZFS wants direct access to the individual disks rather than a hardware RAID volume: layering ZFS on top of your RAID6 arrays defeats its self-healing and checksumming. You'd want to replace the MegaRAID with an HBA flashed to IT mode, or put the controller into JBOD/passthrough mode if it supports that.
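If you do go that route, a common layout for a 34-disk pool is several RAID-Z2 vdevs (ZFS's RAID6 analogue) plus hot spares. A minimal sketch, assuming placeholder device names and a placeholder pool name `tank` (use stable `/dev/disk/by-id/` paths in practice):

```shell
# Hypothetical layout: 32 data disks as four 8-disk RAID-Z2 vdevs,
# plus two hot spares. Device names da0..da33 are placeholders.
zpool create -o ashift=12 \
  -O compression=lz4 -O atime=off \
  tank \
  raidz2 da0  da1  da2  da3  da4  da5  da6  da7  \
  raidz2 da8  da9  da10 da11 da12 da13 da14 da15 \
  raidz2 da16 da17 da18 da19 da20 da21 da22 da23 \
  raidz2 da24 da25 da26 da27 da28 da29 da30 da31 \
  spare  da32 da33
```

Wider vdevs give more usable space, narrower ones rebuild faster; the 8-disk width above is just one common trade-off, not a recommendation for your exact workload.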
How are you planning on backing up all that data? With 600TB, a solid backup strategy is crucial to prevent data loss in case of failure or corruption.
Good point! I'm still figuring that out. I definitely want something reliable, though.
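One argument for ZFS here: snapshots plus `zfs send`/`zfs receive` give you efficient incremental replication to a second box. A sketch, where `tank/research`, `backuphost`, `backup/research`, and the snapshot names are all placeholders (in practice you'd drive this from cron or a tool like sanoid/syncoid):

```shell
# Take a point-in-time snapshot of the dataset.
zfs snapshot tank/research@snap1

# Initial full replication to another host's pool.
zfs send tank/research@snap1 | ssh backuphost zfs receive backup/research

# Later: send only the delta between two snapshots.
zfs snapshot tank/research@snap2
zfs send -i tank/research@snap1 tank/research@snap2 | \
  ssh backuphost zfs receive backup/research
```

Incrementals only transfer changed blocks, which matters a lot at the 600TB scale.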

Thanks for the insight! I appreciate the advice about the hardware requirements. Are there specific configuration tips you suggest for ZFS?
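A few commonly cited per-dataset starting points, sketched below; the dataset names are placeholders and these are general defaults to test against your workload, not thread-specific recommendations:

```shell
# Cheap compression is usually a net win on modern CPUs.
zfs set compression=lz4 tank/research
# Skip metadata writes on every read.
zfs set atime=off tank/research
# Larger records suit big sequential research files.
zfs set recordsize=1M tank/research
# Smaller records tend to suit block/VM-style workloads like Longhorn volumes.
zfs set recordsize=16K tank/longhorn
```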