I've been seeing cost spikes of over 100% on one of my Azure Firewall instances deployed in a VWAN hub, and I'm trying to get to the bottom of it. The spikes have been ongoing for a few days, so I decided to analyze my AZFWFatFlow logs. I focused on _BilledSize to understand the record size for each flow between source and destination IPs, and looked at the 30 flows with the highest billed size.
After analyzing the flows, I tried to redistribute some workloads to reduce the traffic going through the firewall, but I didn't notice any significant drop in costs. I reached out to Microsoft Support for clarification, and their response was rather concerning: they mentioned that they don't have a way to directly analyze firewall costs and suggested that my analysis using the FatFlow logs isn't representative.
Given this situation, I'd like to ask for your insights. Am I missing something in my analysis? What recommendations do you have for effectively understanding and mitigating these Azure Firewall cost spikes?
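For reference, the analysis described in the first paragraph looks roughly like the sketch below. It assumes the resource-specific AZFWFatFlow table and that its address columns are named SourceIp and DestinationIp; the 7-day window is only an illustrative choice. Keep in mind that _BilledSize is the standard Log Analytics column for the ingested size of each log record, so this ranks flows by how much log data they produced, which is not necessarily the same as how much traffic the firewall processed for them.

```kql
// Sketch: rank source/destination pairs by the billed size of their fat-flow log records.
// Assumption: resource-specific AZFWFatFlow table with SourceIp and DestinationIp columns.
AZFWFatFlow
| where TimeGenerated > ago(7d)   // illustrative window covering the spike
| summarize BilledBytes = sum(_BilledSize), Records = count() by SourceIp, DestinationIp
| top 30 by BilledBytes desc
```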
2 Answers
You might not have enough info for a detailed analysis just yet. To understand where your costs come from, try this KQL query:
```kql
// Assumes the legacy AzureDiagnostics table and that the size columns used below exist in your workspace.
AzureDiagnostics
| where ResourceType == "AZUREFIREWALLS"
| where Category in ("AzureFirewallApplicationRule", "AzureFirewallNetworkRule", "AzureFirewallNatRule")
| extend SourceIP = coalesce(SourceIP_s, src_ip_s),
         DestinationIP = coalesce(DestinationIP_s, dest_ip_s),
         RuleName = coalesce(ruleName_s, RuleName_s),
         BytesSent = tolong(requestSize_s),       // tolong avoids 32-bit overflow on large flows
         BytesReceived = tolong(responseSize_s)
| extend TotalBytes = BytesSent + BytesReceived
// Divide by 1024.0 so small totals are not truncated to 0 by integer division
| summarize TotalDataProcessed_MB = sum(TotalBytes) / 1024.0 / 1024.0 by bin(TimeGenerated, 1h), SourceIP, DestinationIP, RuleName
| order by TotalDataProcessed_MB desc
```
This should give you insight into data processed over time. Also, check the AzureFirewallPerformanceLog for throughput and resource usage metrics; it’ll help you see if you're scaling unnecessarily.
It's possible that data is only available in the legacy logs. If you're seeing an increase in both the processed-data and throughput metrics, that should correlate with your cost spikes. Keep an eye on those two!
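If the firewall's platform metrics are exported to the same Log Analytics workspace, the processed-data and throughput trend can be charted with something like the sketch below. This assumes a diagnostic setting that sends metrics to the AzureMetrics table, and that the metric names are DataProcessed (bytes) and Throughput (bits per second); verify both against your own workspace before relying on the numbers.

```kql
// Sketch: hourly data processed and average throughput per firewall.
// Assumptions: metrics are exported to AzureMetrics; metric names are DataProcessed and Throughput.
AzureMetrics
| where ResourceProvider == "MICROSOFT.NETWORK"
| where ResourceId contains "/AZUREFIREWALLS/"   // contains is case-insensitive
| where MetricName in ("DataProcessed", "Throughput")
| summarize DataProcessed_GB = sumif(Total, MetricName == "DataProcessed") / 1024.0 / 1024.0 / 1024.0,
            AvgThroughput_bps = avgif(Average, MetricName == "Throughput")
    by bin(TimeGenerated, 1h), Resource
| order by TimeGenerated asc
```

Comparing the hours where DataProcessed_GB jumps against the dates of the cost spikes is a quick way to check whether the extra charge is driven by traffic volume rather than by something else, such as log ingestion.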
It's a bit perplexing that Microsoft doesn't have a customer-facing way to analyze firewall costs. It raises questions about how they calculate those prices. I'm not sure how you should proceed either, unfortunately, but it's worth following up with them for clearer guidance.
Totally agree! It feels a bit off when their support seems unclear about their own pricing structure. Definitely worth digging deeper.
That query looks great, but I ran into a snag. I don’t use the legacy Azure diagnostics logs for AZFW, and I can’t seem to find the request/response sizes in the current tables. Have those fields been removed in the new logging model?