I'm currently working in a large healthcare organization with around 9,000 employees, managing about 400 Windows servers (mainly hosted on VMware ESXi) along with a few Linux servers. We're transitioning from partial support by a managed service provider back to full in-house management in about 9 months. I'm looking for recommendations on reliable monitoring and alerting tools for tracking disk space, resource usage, service states, and ping responses. If you've had experience with particular tools, good or bad, I'd love to hear your thoughts. Thanks!
3 Answers
I’d definitely recommend Zabbix or Nagios if you want something more customizable. Zabbix has been my personal favorite, especially for larger networks. You can split the monitoring load across multiple servers for better performance, which helps with an environment of 400 servers like yours.
We’ve been using PRTG, and while it's not perfect, it has been pretty solid for our needs. Just keep in mind they recently changed their pricing, which has some users looking for alternatives. But overall, it works for monitoring a range of metrics.
Totally agree with you! We also use PRTG, but with the new pricing, we’re planning to find something else soon.
CheckMK is another solid option. We use it across different client setups, from small companies to large corporations. It's highly customizable and handles everything you need pretty well.
Can you do everything necessary with the Raw edition, or is it worth moving to a paid one?

Sounds like Zabbix could be a good fit! I'd love to hear more about how you've implemented it.