Hey everyone, I'm new to my role as IT Director and inherited quite a hefty setup: a rack holding about a petabyte of Ceph storage. I have two partitions set up on it that are accessible via Samba/CIFS. Everything runs on open-source software, including Rook, Ceph, Talos Linux, and Kubernetes. I've been using talosctl and kubectl to manage things, but I've hit a couple of snags. First, one partition keeps throwing an 'out of space' error even after I increased its size by 100Ti using kubectl, and there aren't any useful clues in the SMB logs to help me troubleshoot. Second, performance: users are seeing slow read and write speeds, which I suspect might be due to networking problems between my rack and our campus network. If anyone here can help me untangle these issues or knows someone who can, I'd really appreciate it! Thanks!
3 Answers
It sounds like you've got a pretty complex setup there! When you increased the storage size, did you confirm that the provisioner actually picked up the new size? Sometimes, after expanding a partition, the configuration or the resources that use it need to be refreshed before the change takes effect, so requesting the larger size is not always the whole job. Also check whether any other settings or quota limits could be preventing more data from being written.
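One way to check whether the resize actually landed, sketched in Python. This assumes the share is backed by a Kubernetes PersistentVolumeClaim, which the original post does not confirm. It compares the size you requested (`spec.resources.requests.storage`) with the size the cluster reports (`status.capacity.storage`) and lists any status conditions such as `FileSystemResizePending`. The function names are made up for illustration, and it does not look at Ceph-level quotas at all.

```python
import json

# Suffixes used by Kubernetes resource quantities (binary and decimal).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '100Ti' to bytes; plain numbers pass through."""
    # Try two-letter suffixes first so 'Ti' is not mistaken for a bare number.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(float(quantity))


def resize_status(pvc: dict) -> dict:
    """Compare requested vs. reported size for one PVC object (parsed JSON)."""
    status = pvc.get("status", {})
    requested = parse_quantity(pvc["spec"]["resources"]["requests"]["storage"])
    actual = parse_quantity(status.get("capacity", {}).get("storage", "0"))
    return {
        "requested_bytes": requested,
        "actual_bytes": actual,
        # True means the cluster has not (yet) applied the size you asked for.
        "resize_pending": actual < requested,
        "conditions": [c["type"] for c in status.get("conditions", [])],
    }


def resize_status_from_json(text: str) -> dict:
    """Convenience wrapper for the raw JSON text of a single PVC."""
    return resize_status(json.loads(text))
```

To try it, save the output of `kubectl get pvc <name> -n <namespace> -o json` to a file and pass the contents to `resize_status_from_json`. If `resize_pending` comes back true, the expansion has not finished; if it comes back false and writes still fail, the limit is probably somewhere else, such as a quota.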
Have you checked out any professional services for assistance? Companies like Sidero offer support specifically for these kinds of open-source storage solutions. They might be able to provide the expertise you need to troubleshoot your issues effectively.
Regarding the performance issues, definitely check the network setup. If your rack is on a different subnet from the campus network, routing or bandwidth limits could be slowing data transfers. It's worth running a speed test and checking for packet loss between your rack and the campus network to pinpoint any networking problems.
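As a rough first diagnostic for the latency and loss check above, here is a small Python sketch that times repeated TCP connections to the SMB port (445) from a campus machine. It only measures connection setup time and failed attempts, not throughput, so treat it as a quick sanity check rather than a real speed test. The function names and the hostname in the usage note are placeholders.

```python
import socket
import time


def probe_tcp(host: str, port: int, attempts: int = 10, timeout: float = 2.0):
    """Time repeated TCP connects; return (latencies in ms, failed attempts)."""
    latencies = []
    failures = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            # The connection is closed again as soon as it has been established.
            with socket.create_connection((host, port), timeout=timeout):
                latencies.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Timeouts, refusals, and unreachable hosts all count as failures.
            failures += 1
    return latencies, failures


def summarize(latencies, failures):
    """Reduce raw probe results to a few headline numbers."""
    total = len(latencies) + failures
    return {
        "attempts": total,
        "failure_rate": failures / total if total else 0.0,
        "min_ms": min(latencies) if latencies else None,
        "avg_ms": sum(latencies) / len(latencies) if latencies else None,
        "max_ms": max(latencies) if latencies else None,
    }
```

For example, `summarize(*probe_tcp("smb.example.internal", 445))` run from a campus workstation, and again from a machine in the same rack, gives two sets of numbers to compare. A big gap in `avg_ms` or a non-zero `failure_rate` only on the campus side would point toward the network path rather than the storage itself.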
Great advice! I’ll run some diagnostics to see if there are any latency issues.

Thanks for the tip! I hadn't thought about provisioning needing to refresh. I'll look into that more closely.