I'm working in an enterprise setup that runs over 9000 backup jobs a day with NetBackup. My backup success rate is currently around 98%, but after some recent changes, I'm curious whether it's possible to reach 100% consistently or if that's just wishful thinking. Can anyone share their experiences or insights on this?
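For a sense of scale, here is a minimal sketch of what those percentages mean in absolute numbers, assuming a flat 9000 jobs per day as stated above. The function name `expected_failures` and the list of rates are made up for illustration; nothing here comes from NetBackup itself.

```python
def expected_failures(jobs_per_day: int, success_rate: float) -> float:
    """Average number of failed jobs per day at a given success rate."""
    return jobs_per_day * (1.0 - success_rate)


JOBS_PER_DAY = 9000  # figure taken from the question

# Rates mentioned in this thread, plus 99.9% for comparison (an assumption).
for rate in (0.98, 0.99, 0.999, 0.99999):
    failures = expected_failures(JOBS_PER_DAY, rate)
    print(f"{rate:.3%} success -> about {failures:.2f} failed jobs per day")
```

At 98% that works out to roughly 180 failed jobs every day, while 99.999% is about 0.09 per day, or around one failed job every 11 days. A literal 100% would mean zero failures, indefinitely.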
5 Answers
A true 100% is probably unrealistic, but with a stable environment and the right tools like Rubrik, you can get pretty close. I've seen up to 99.999% success, because occasional failures are quickly retried. The key is keeping everything running smoothly and addressing issues proactively before they become bigger problems.
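The effect of retries on the final number can be sketched roughly as follows. This assumes every attempt fails independently with the same probability, which is optimistic: a client that is offline or a storage target that is full will fail every retry too. The function name `success_after_retries` is hypothetical and only for illustration.

```python
def success_after_retries(first_try_success: float, retries: int) -> float:
    """Chance a job eventually succeeds, given independent attempts.

    Assumption: each attempt fails with the same probability and
    failures are uncorrelated. Real environments rarely behave this way.
    """
    failure = 1.0 - first_try_success
    # The job only fails overall if the first try and every retry fail.
    return 1.0 - failure ** (retries + 1)


for retries in range(3):
    rate = success_after_retries(0.98, retries)
    print(f"{retries} retries -> {rate:.4%} eventual success")
```

Under that assumption, a 98% first-attempt rate becomes about 99.96% with one retry and about 99.999% with two. The gap between that figure and what you actually see is a reasonable measure of how many failures are persistent problems rather than transient ones.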
Consistency in backups is crucial, and there are several factors at play. A solid backup solution like Veeam is essential, but your entire environment affects performance too. The network needs to be robust enough to handle incremental backups, and sometimes the underlying hardware just isn't up to par. Even a shaky OS can lead to numerous backup issues that might only surface when the system crashes.
Right? It's like we're exposed to all the hidden issues once something goes wrong!
For me, 98% is a good benchmark. As long as you're on top of any failures, that shows your backup system is working well. I've been hitting 99% in a smaller setup. The bigger your environment, the more complexity you face, but a high rate is still achievable with the right tools and management.
What’s your secret to maintaining such a high rate? Any specific software you prefer?
It sounds like hitting 100% consistently is tough, but if you're aiming for a high rate, you should shoot for 98% or more. I achieved almost perfect backups at a previous job, but that was in a controlled setup without too many variables. The more chaotic your environment, the more realistic it is to expect some failures here and there.
I’ve been using Commvault and regularly hit 99% or higher, but that depends heavily on the environment. Certain issues, like aging OS versions, can definitely drag those numbers down. Regular assessment of your infrastructure is key to staying on top of things.

How do you manage that level of reliability across all your systems?