I'm looking for recommendations on agent-based monitoring solutions that can report if a remote endpoint goes down. We need something that supports 'live' checks since our customer is unhappy with the 1-minute polling frequency of CheckMK SaaS. Any suggestions?
1 Answer
Polling more frequently than every 60 seconds can add up in CPU cycles, log capacity, and network bandwidth, even though a single uptime check isn't very resource-intensive. But I totally get the frustration when you're missing downtime that's under a minute. It's a tough spot, and I feel your pain!
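To put a rough number on the trade-off, here is a minimal sketch of how likely a short outage is to fall between two polls. It assumes the checks are instantaneous, evenly spaced, and that the outage starts at a random point in the polling cycle. The function name `miss_probability` is made up for illustration and is not part of CheckMK.

```python
def miss_probability(outage_s: float, interval_s: float) -> float:
    """Chance that an outage of outage_s seconds falls entirely between
    two polls spaced interval_s seconds apart.

    Assumes instantaneous checks and an outage start time that is
    uniformly random within the polling cycle.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # An outage at least as long as the interval always overlaps a poll.
    return max(0.0, 1.0 - outage_s / interval_s)


def checks_per_day(interval_s: float) -> float:
    """Number of polls per endpoint per day at the given interval."""
    return 86400 / interval_s
```

Under those assumptions, `miss_probability(45, 60)` comes out to 0.25, while any interval of 45 seconds or less brings it to zero. The cost side scales linearly: `checks_per_day(60)` is 1440 polls per endpoint, and `checks_per_day(10)` is 8640.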
Exactly! We had a system go down for about 45 seconds, and with one-minute polling we completely missed it. I suggested using a VPN and just doing simple ICMP checks, but my hands are tied with the current setup.
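For anyone who does have the freedom to run their own probe, a simple sub-minute check could look something like the sketch below. Note that it swaps ICMP for a plain TCP connect, since raw ICMP sockets usually need elevated privileges. The names `is_reachable` and `watch`, the 5-second interval, and the host and port in the usage note are all assumptions for illustration, not anything from CheckMK.

```python
import socket
import time


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def watch(host, port, interval=5.0, checks=3, probe=is_reachable, sleep=time.sleep):
    """Probe the endpoint `checks` times, `interval` seconds apart.

    Returns a list of (timestamp, is_up) tuples, one per state change,
    starting with the first observed state.
    """
    changes = []
    last = None
    for i in range(checks):
        up = probe(host, port)
        if up != last:
            changes.append((time.time(), up))
            last = up
        if i < checks - 1:
            sleep(interval)
    return changes
```

Something like `watch("10.0.0.5", 22, interval=5, checks=12)` would cover one minute with twelve probes, so a 45-second outage would show up as a down/up pair in the returned list. A single failed connect can also be a transient network blip, so in practice you would probably want a few consecutive failures before alerting.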