I'm working with a CloudFront distribution in front of an internal Application Load Balancer (ALB) using the new VPC origins feature, and I have a WAFv2 connected to the ALB. I initially set up some rate limit rules using the `X-Forwarded-For` header, which effectively blocked most bots. However, I encountered a persistent bot that managed to spoof its `X-Forwarded-For` header and bypass my WAF rate limits.
I tried switching to the `CloudFront-Viewer-Address` header instead, but it didn't work as expected. The WAF logs showed that WAF couldn't parse the viewer's IP and marked it as INVALID, presumably because this header carries the source port along with the address (in `ip:port` form).
I'm wondering if there's a way to set rate limit rules for CloudFront that are robust against spoofing. I considered using a CloudFront function or Lambda@Edge to create a custom header with the real viewer's IP, but I'm concerned about the additional costs and latency that might introduce.
It's surprising to me that this isn't more straightforward to set up. Am I missing something? Also, I found out that if I connect the WAF directly to CloudFront rather than the ALB, I can create rules using the client IP without needing to rely on the XFF header. The downside is that I still have to have the WAF connected to my ALB for non-CloudFront traffic, which means I'm paying for two WAFs now!
2 Answers
You’re right that whatever sends an `X-Forwarded-For` value could be a legitimate proxy or a malicious client. CloudFront appends the viewer's IP to the end of the `X-Forwarded-For` header, so the last entry is the one CloudFront actually observed and anything before it is client-supplied. The catch is that WAF appears to take the first IP in that header for rate-based rules, which is exactly the part a bot controls. The AWS documentation does describe forwarded-IP options such as choosing a position in the header, but as far as I can tell those apply to IP set matching rather than to rate limiting. It might be worth checking for recent updates or contacting AWS Support to confirm this behavior.
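To make the first-versus-last point concrete, here is a minimal Python sketch of how the header looks once CloudFront has appended to it. The addresses and function names are made up for illustration, and it ignores any hops after CloudFront.

```python
from typing import List, Optional


def parse_xff(header_value: str) -> List[str]:
    """Split an X-Forwarded-For value into its comma-separated entries."""
    return [part.strip() for part in header_value.split(",") if part.strip()]


def first_entry(header_value: str) -> Optional[str]:
    """Leftmost entry: whatever the client (or an upstream proxy) claimed."""
    entries = parse_xff(header_value)
    return entries[0] if entries else None


def last_entry(header_value: str) -> Optional[str]:
    """Rightmost entry: the address appended by the last trusted hop."""
    entries = parse_xff(header_value)
    return entries[-1] if entries else None


# Hypothetical request: a bot connecting from 198.51.100.10 sends
# "X-Forwarded-For: 203.0.113.7". CloudFront appends the address it saw.
forwarded_by_cloudfront = "203.0.113.7, 198.51.100.10"

spoofable = first_entry(forwarded_by_cloudfront)   # value chosen by the bot
observed = last_entry(forwarded_by_cloudfront)     # value added by CloudFront
```

A rate limit keyed on `spoofable` can be dodged by rotating the fake value on every request, while `observed` stays tied to the connection CloudFront saw.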
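To tackle the spoofing issue, a Lambda@Edge function (or a CloudFront function) that writes the viewer's IP into a custom header is worth testing. Because the function overwrites that header on every request, the client can't inject its own value, and the WAF rate-based rule can aggregate on that header instead of `X-Forwarded-For`. I feel your pain about costs and complexity, though. The other option is to attach a web ACL directly to CloudFront, where rate-based rules can key on the source IP and no forwarded header is involved. That does mean keeping a second web ACL on the ALB for non-CloudFront traffic. It can feel redundant, but it’s sometimes necessary for coverage depending on your traffic.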
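A rough sketch of the custom-header approach, assuming a Python Lambda@Edge function on the origin-request trigger. The header name `X-Viewer-IP`, the rule name, and the limit are placeholders I picked, not anything AWS prescribes, so treat it as a starting point to test rather than a finished setup.

```python
from typing import Any, Dict

# Hypothetical header name; any name not otherwise in use should do.
TRUSTED_HEADER = "X-Viewer-IP"


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Overwrite the custom header with the viewer IP CloudFront recorded."""
    request = event["Records"][0]["cf"]["request"]
    # Lambda@Edge keys headers by lowercase name; assigning here replaces
    # any value the client sent under the same name.
    request["headers"][TRUSTED_HEADER.lower()] = [
        {"key": TRUSTED_HEADER, "value": request["clientIp"]}
    ]
    return request


# Assumed shape of a WAFv2 rate-based rule (boto3-style dict) that
# aggregates on the custom header instead of X-Forwarded-For.
RATE_RULE = {
    "Name": "rate-limit-by-viewer-ip",  # placeholder name
    "Priority": 1,
    "Statement": {
        "RateBasedStatement": {
            "Limit": 1000,  # placeholder threshold
            "AggregateKeyType": "FORWARDED_IP",
            "ForwardedIPConfig": {
                "HeaderName": TRUSTED_HEADER,
                "FallbackBehavior": "MATCH",
            },
        }
    },
    "Action": {"Block": {}},
    "VisibilityConfig": {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "rateLimitByViewerIp",
    },
}
```

Since the web ACL sits on the ALB, only requests that reach the origin matter, so running the function on origin requests rather than on every viewer request should keep the invocation count (and cost) down.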
