Hey everyone! I haven't set up a monitoring environment in a while; the last one I worked on was Icinga, right after the split. Now we're looking at implementing a commercial off-the-shelf (COTS) system instead of our custom-built solution.
Our infrastructure is diverse, with various Linux distributions, Windows versions ranging from 11 back to XP (we're required to keep those), plus the usual hubs, switches, VMs, and physical servers. We want to monitor typical stuff like uptime, CPU and memory usage, and ensure our resources are used efficiently (like checking if those 8-core VMs are actually necessary).
I've started looking into options like Nagios, Nagios XI, Icinga 2, Zabbix, Prometheus, and Grafana, and I need to write a comparison paper. I'd like to narrow it down to the top 3 or 4 tools for my research. Licensing cost is a key factor, and whatever system we choose must be able to monitor Windows XP.
Additionally, while we have a knowledgeable team, the ease of installation and time to deployment are also important factors. Any recommendations?
2 Answers
Check out the resources I linked! Metrics-based monitoring is crucial these days. Traditional systems like Nagios are becoming outdated and don't give you a comprehensive view of user experience. Metrics tell you much more about what's going on in your infrastructure.
Have you looked into Checkmk? It’s fairly flexible, and even if it doesn’t directly support something, you can usually create a custom query to get the data you need. From what I know, monitoring Windows XP might be tricky, but it should be doable with a custom agent.
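To make the "custom query" idea a bit more concrete, here is a rough sketch of what a script-based check could look like. It assumes the one-line output convention used by Checkmk local checks (status code, quoted service name, metric with warn/crit thresholds, free-text detail), so verify that against the docs for your version. The function names, the `disk_pct` metric, and the 80/90 thresholds are made up for illustration, and on an XP box the script would have to be in whatever language that machine can actually run; Python here is only for readability.

```python
import shutil


def status_for(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to a Nagios-style status: 0 OK, 1 WARN, 2 CRIT."""
    if percent_used >= crit:
        return 2
    if percent_used >= warn:
        return 1
    return 0


def local_check_line(service, metric, percent_used, warn=80.0, crit=90.0):
    """Build one output line: status, quoted service name, metric, detail text.

    The layout is an assumption based on the local check convention;
    the names and thresholds are hypothetical.
    """
    status = status_for(percent_used, warn, crit)
    return (
        f'{status} "{service}" '
        f"{metric}={percent_used:.1f};{warn:.0f};{crit:.0f} "
        f"{percent_used:.1f}% used"
    )


if __name__ == "__main__":
    # Example data source: disk usage of the current directory's filesystem.
    usage = shutil.disk_usage(".")
    percent = 100.0 * usage.used / usage.total
    print(local_check_line("Disk usage cwd", "disk_pct", percent))
```

The point is just that anything a script can measure can be turned into a monitored service this way, which is what makes the XP case plausible even without official agent support.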
Is Checkmk like a fork of Icinga? I thought I’d seen it floating around before.
I get that, but aren't the user systems used more for build tasks than for everyday use? If that's the case, the monitoring should really focus on process statistics instead. Thanks for the links though!