Hey everyone! I'm new to system administration after 25 years in development, and I'm diving into managing cloud web applications. I've noticed that around 60% of my server traffic comes from bots and malicious crawlers, which is really taxing my resources. Currently I'm using the free tier of Cloudflare, but I'm not impressed with how well it reduces malicious bot hits. I've also tried BunkerWeb, but I didn't see much improvement over Cloudflare, and I ended up with quite a few false positives that my team has had to spend time resolving. My main concern right now isn't security per se, since I think I'm managing that well; it's that this relentless bot traffic is eating resources. Here's a log I collected over the past couple of days: https://imgur.com/a/3HHng6h. This is my first post here, so I apologize for any mistakes or if I'm in the wrong place.
1 Answer
I don't use a traditional WAF, but I find HAProxy super helpful. It lets me rate-limit requests and use stick tables to track per-client counters, especially 404 errors. If a client racks up more than 5 of those in a short period, I block it. This works great against bots, which tend to be rapid crawlers probing for paths that don't exist. Blocking certain URIs and known bad bots with existing lists can also make a difference.
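
To make the 404-counting idea concrete, here's a rough Python sketch of the logic, not an actual HAProxy config. The class name `NotFoundLimiter`, the 10-second window, and the example IPs are my own assumptions for illustration; only the "more than 5" threshold comes from the setup described above.

```python
import time
from collections import defaultdict, deque


class NotFoundLimiter:
    """Block a client once it exceeds max_404 404 responses within a sliding window.

    Hypothetical helper for illustration; the window length is an assumed value.
    """

    def __init__(self, max_404=5, window_seconds=10.0):
        self.max_404 = max_404
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent 404s
        self.blocked = set()

    def record(self, ip, status, now=None):
        """Record one response; return True if the client is (now) blocked."""
        now = time.monotonic() if now is None else now
        if ip in self.blocked:
            return True
        if status == 404:
            recent = self.hits[ip]
            recent.append(now)
            # Drop 404s that have fallen out of the sliding window.
            while recent and now - recent[0] > self.window:
                recent.popleft()
            if len(recent) > self.max_404:
                self.blocked.add(ip)
                return True
        return False


limiter = NotFoundLimiter(max_404=5, window_seconds=10.0)
# A fast crawler: six 404s within six seconds trips the block on the sixth.
fast = [limiter.record("203.0.113.7", 404, now=t) for t in range(6)]
# A slow client: ten 404s spaced 20 seconds apart never trips it.
slow = [limiter.record("198.51.100.9", 404, now=t * 20) for t in range(10)]
```

In this sketch a block is permanent for the life of the process; in practice you'd probably want entries to expire so a misbehaving but legitimate client isn't locked out forever.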

Thanks for the tips! I've got some of those blocks set up on the free Cloudflare (CF) tier, particularly for certain WordPress paths and URL extensions like .php, and they seem to be working well. I also use fail2ban to rate-limit hits on some phony pages; it integrates with CF through the API and works really well.
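
For anyone wondering what those path and extension blocks boil down to, here's a rough Python sketch of the matching logic. It's not the Cloudflare rule syntax, and the specific paths in `BLOCKED_PREFIXES` are just assumed examples; it also only makes sense if the site itself doesn't serve PHP.

```python
from urllib.parse import urlsplit

# Assumed example values: common WordPress probe paths and the .php extension.
BLOCKED_PREFIXES = ("/wp-admin", "/wp-login.php", "/xmlrpc.php")
BLOCKED_EXTENSIONS = (".php",)


def is_blocked(url):
    """Return True if the request path matches a blocked prefix or extension."""
    # Strip the query string and compare case-insensitively.
    path = urlsplit(url).path.lower()
    return path.startswith(BLOCKED_PREFIXES) or path.endswith(BLOCKED_EXTENSIONS)


checks = {
    "/wp-login.php": is_blocked("/wp-login.php"),
    "/index.php?x=1": is_blocked("/index.php?x=1"),
    "/WP-Admin/setup": is_blocked("/WP-Admin/setup"),
    "/about": is_blocked("/about"),
}
```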