Addressing Scraping Challenges from AI Bots

February 8, 2026

Asked By TechWizard42 On February 8, 2026

I've noticed that scrapers from companies like OpenAI and Anthropic seem to ignore the rules in robots.txt. They also hammer my event pages with requests for impossible dates like 2139-13-45, which puts needless load on my server. I'm looking for a straightforward way to deal with this. Heavier tools like mod_security feel dated and overly complicated for my needs, especially for smaller sites on shared hosting. For larger sites I've considered bunkerweb, but it's more involved than I'd hoped. Does anyone have lighter solutions or alternatives?

5 Answers

Answered By SecureSiteGuard On February 12, 2026

Rate limiting and honeypot pages that no human visitor would ever reach could also help. Any bot that hits one of those traps gets blacklisted immediately.
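To make the idea concrete, here is a minimal Python sketch of a honeypot plus a sliding-window rate limit. Everything named here is an assumption for illustration: the trap path, the limits, and the `BotGuard` class are hypothetical, and in practice this logic would sit in whatever middleware or reverse proxy your stack already has.

```python
import time
from collections import defaultdict, deque

# Hypothetical trap path: link it invisibly in your pages and disallow it
# in robots.txt, so only clients that ignore the rules ever request it.
HONEYPOT_PATHS = {"/trap/do-not-follow"}

# Hypothetical limits: at most 30 requests per IP in any 60-second window.
MAX_REQUESTS = 30
WINDOW_SECONDS = 60


class BotGuard:
    """Tracks per-IP request times and a set of blocked IPs (in memory only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()

    def allow(self, ip, path):
        """Return True if the request should be served, False otherwise."""
        if ip in self.blocked:
            return False
        if path in HONEYPOT_PATHS:
            # Touching a trap gets the client blocked immediately.
            self.blocked.add(ip)
            return False
        now = self._clock()
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= MAX_REQUESTS:
            return False  # over the limit; the caller could answer with HTTP 429
        hits.append(now)
        return True
```

A request handler would call `guard.allow(client_ip, request_path)` first and refuse the request when it returns False. The block list here lives in memory and is lost on restart, so a real setup would probably persist it or hand the IPs to the firewall.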

Answered By ScraperSmasher On February 11, 2026

I've taken a different approach: I take the IP ranges these crawlers publish and redirect any request from them to a page that humorously highlights how awesome I am. It's a fun way to deal with them! Setting this up with Traefik was really easy.
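For anyone not on Traefik, the matching step can be sketched in a few lines of Python with the standard `ipaddress` module. The ranges below are documentation placeholders, not real crawler ranges, and `REDIRECT_PATH` and both function names are hypothetical; you would load the actual lists from wherever the crawler operators publish them.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges), NOT real crawler
# ranges. Replace them with the lists the crawler operators publish.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]
NETWORKS = [ipaddress.ip_network(cidr) for cidr in PLACEHOLDER_RANGES]

# Hypothetical page that listed crawlers get sent to.
REDIRECT_PATH = "/hello-bots"


def is_listed_crawler(remote_addr):
    """Return True if remote_addr falls inside any of the listed ranges."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parseable IP address; treat as unlisted
    return any(addr in net for net in NETWORKS)


def target_for(remote_addr, requested_path):
    """Pick the path to serve: the redirect page for listed crawlers."""
    return REDIRECT_PATH if is_listed_crawler(remote_addr) else requested_path
```

This only catches crawlers that come from their announced ranges, so it works best alongside another measure for the ones that don't.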

Answered By ServerSleuth On February 11, 2026

While it's frustrating that scrapers ignore crawl rules, if their requests for invalid pages are stressing your server, that may point to underlying issues in your site's architecture. A request for a non-existent agenda page shouldn't cause significant strain; usually it's just one database query. But logging 21k lines just for scrapers isn't reasonable, and I get how annoying that can be.
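One cheap way to take even that single query out of the picture is to validate the date in the URL before touching the database. A rough Python sketch, assuming the event URLs carry a `YYYY-MM-DD` slug; the function name and the accepted date window are hypothetical and would need adjusting to the site:

```python
import re
from datetime import date

# Strict YYYY-MM-DD shape; anything else is rejected before any parsing.
_DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_event_date(slug, earliest=date(2000, 1, 1), latest=date(2035, 12, 31)):
    """Return a date for a valid, plausible slug, or None if it should 404.

    The earliest/latest bounds are illustrative defaults, not values from
    the original site.
    """
    match = _DATE_RE.fullmatch(slug)
    if match is None:
        return None
    year, month, day = map(int, match.groups())
    try:
        parsed = date(year, month, day)  # raises ValueError for month 13, day 45, ...
    except ValueError:
        return None
    if not (earliest <= parsed <= latest):
        return None  # real calendar date, but outside the range the site serves
    return parsed
```

With this in front of the handler, a request for 2139-13-45 gets a 404 without any database work, and the same goes for valid dates far outside the range the agenda actually covers.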

Answered By NinjaCoder88 On February 10, 2026

I've deployed Anubis and I've been really happy with it! It does a great job of managing pesky web scrapers.

TechWizard42 - February 12, 2026

Thanks! I'll definitely look into Anubis.

Answered By DevOpsDude99 On February 9, 2026

One solid option to consider is Fail2ban. It's popular among DevOps teams and works well for blocking these scrapers.
