Azure Firewall monitoring data reference

This article contains all the monitoring reference information for this service.

See Monitor Azure Firewall for details on the data you can collect for Azure Firewall and how to use it.

Metrics

This section lists all the automatically collected platform metrics for this service. These metrics are also part of the global list of all platform metrics supported in Azure Monitor.

For information on metric retention, see Azure Monitor Metrics overview.

Supported metrics for Microsoft.Network/azureFirewalls

The following table lists the metrics available for the Microsoft.Network/azureFirewalls resource type.

Table headings

  • Metric - The metric display name as it appears in the Azure portal.
  • Name in Rest API - Metric name as referred to in the REST API.
  • Unit - Unit of measure.
  • Aggregation - The default aggregation type. Valid values: Average, Minimum, Maximum, Total, Count.
  • Dimensions - Dimensions available for the metric.
  • Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
  • DS Export- Whether the metric is exportable to Azure Monitor Logs via Diagnostic Settings. For information on exporting metrics, see Create diagnostic settings in Azure Monitor.

Firewall health state

In the preceding table, the Firewall health state metric has two dimensions:

  • Status: Possible values are Healthy, Degraded, Unhealthy.
  • Reason: Indicates the reason for the corresponding status of the firewall.

If SNAT ports are used more than 95%, they're considered exhausted and the health is 50% with status=Degraded and reason=SNAT port. The firewall keeps processing traffic and existing connections aren't affected. However, new connections might not be established intermittently.

If SNAT ports are used less than 95%, then firewall is considered healthy and health is shown as 100%.

If no SNAT ports usage is reported, health is shown as 0%.

SNAT port utilization

For the SNAT port utilization metric, when you add more public IP addresses to your firewall, more SNAT ports are available, reducing the SNAT ports utilization. Additionally, when the firewall scales out for different reasons (for example, CPU or throughput) more SNAT ports also become available.

Effectively, a given percentage of SNAT ports utilization might go down without you adding any public IP addresses, just because the service scaled out. You can directly control the number of public IP addresses available to increase the ports available on your firewall. But, you can't directly control firewall scaling.

If your firewall is running into SNAT port exhaustion, you should add at least five public IP address. This increases the number of SNAT ports available. For more information, see Azure Firewall features.

AZFW Latency Probe

The AZFW Latency Probe metric measures the overall or average latency of Azure Firewall in milliseconds. Administrators can use this metric for the following purposes:

  • Diagnose if Azure Firewall is the cause of latency in the network
  • Monitor and alert if there are any latency or performance issues, so IT teams can proactively engage.
  • There might be various reasons that can cause high latency in Azure Firewall. For example, high CPU utilization, high throughput, or a possible networking issue.

What the AZFW Latency Probe Metric Measures (and Doesn't):

  • What it measures: The latency of the Azure Firewall within the Azure platform
  • What it doesn't measure: The metric does not capture end-to-end latency for the entire network path. Instead, it reflects the performance within the firewall, rather than how much latency Azure Firewall introduces into the network.
  • Error reporting: If the latency metric isn't functioning correct, it reports a value of 0 in the metrics dashboard, indicating a probe failure or interruption.

Factors that impact latency:

  • High CPU utilization
  • High throughput or traffic load
  • Networking issues within the Azure platform

Latency Probes: From ICMP to TCP The latency probe currently uses Azure's Ping Mesh technology, which is based on ICMP (Internet Control Message Protocol). ICMP is suitable for quick health checks, like ping requests, but it may not accurately represent real-world application traffic, which typically relis on TCP.However, ICMP probes prioritize differently across the Azure platform, which can result in variation across SKUs. To reduce these discrepancies, Azure Firewall plans to transition to TCP-based probes.

  • Latency spikes: With ICMP probes, intermittent spikes are normal and are part of the host network's standard behavior. These should not be misinterpreted as firewall issues unless they are persistent.
  • Average latency: On average, the latency of Azure Firewall is expected to range from 1ms to 10 ms, depending on the Firewall SKU and deployment size.

Best Practices for Monitoring Latency

  • Set a baseline: Establish a latency baseline under light traffic conditions for accurate comparisons during normal or peak usage.

  • Monitor for patterns: Expect occasional latency spikes as part of normal operations. If high latency persists beyond these normal variations, it may indicate a deeper issue requiring investigation.

  • Recommended latency threshold: A recommended guideline is that latency should not exceed 3x the baseline. If this threshold is crossed, further investigation is recommended.

  • Check the rule limit: Ensure that the network rules are within the 20K rule limit. Exceeding this limit can affect performance.

  • New application onboarding: Check for any newly onboarded applications that could be adding significant load or causing latency issues.

  • Support request: If you observe continuous latency degredation that does not align with expected behavior, consider filing a support ticket for further assistance.

    Screenshot showing the Azure Firewall Latency Probe metric.

Metric dimensions

For information about what metric dimensions are, see Multi-dimensional metrics.

This service has the following dimensions associated with its metrics.

  • Protocol
  • Reason
  • Status

Resource logs

This section lists the types of resource logs you can collect for this service. The section pulls from the list of all resource logs category types supported in Azure Monitor.

Supported resource logs for Microsoft.Network/azureFirewalls

Azure Firewall has two new diagnostic logs that can help monitor your firewall, but these logs currently do not show application rule details.

  • Top flows
  • Flow trace

Top flows

The top flows log is known in the industry as fat flow log and in the preceding table as Azure Firewall Fat Flow Log. The top flows log shows the top connections that are contributing to the highest throughput through the firewall.

Tip

Activate Top flows logs only when troubleshooting a specific issue to avoid excessive CPU usage of Azure Firewall.

The flow rate is defined as the data transmission rate in megabits per second units. It's a measure of the amount of digital data that can be transmitted over a network in a period of time through the firewall. The Top Flows protocol runs periodically every three minutes. The minimum threshold to be considered a Top Flow is 1 Mbps.

Enable the Top flows log using the following Azure PowerShell commands:

Set-AzContext -SubscriptionName <SubscriptionName>
$firewall = Get-AzFirewall -ResourceGroupName <ResourceGroupName> -Name <FirewallName>
$firewall.EnableFatFlowLogging = $true
Set-AzFirewall -AzureFirewall $firewall

To disable the logs, use the same previous Azure PowerShell command and set the value to False.

For example:

Set-AzContext -SubscriptionName <SubscriptionName>
$firewall = Get-AzFirewall -ResourceGroupName <ResourceGroupName> -Name <FirewallName>
$firewall.EnableFatFlowLogging = $false
Set-AzFirewall -AzureFirewall $firewall

There are a few ways to verify the update was successful, but you can navigate to firewall Overview and select JSON view on the top right corner. Here's an example:

Screenshot of JSON showing additional log verification.

To create a diagnostic setting and enable Resource Specific Table, see Create diagnostic settings in Azure Monitor.

Flow trace

The firewall logs show traffic through the firewall in the first attempt of a TCP connection, known as the SYN packet. However, such an entry doesn't show the full journey of the packet in the TCP handshake. As a result, it's difficult to troubleshoot if a packet is dropped, or asymmetric routing occurred. The Azure Firewall Flow Trace Log addresses this concern.

Tip

To avoid excessive disk usage caused by Flow trace logs in Azure Firewall with many short-lived connections, activate the logs only when troubleshooting a specific issue for diagnostic purposes.

The following properties can be added:

  • SYN-ACK: ACK flag that indicates acknowledgment of SYN packet.

  • FIN: Finished flag of the original packet flow. No more data is transmitted in the TCP flow.

  • FIN-ACK: ACK flag that indicates acknowledgment of FIN packet.

  • RST: The Reset the flag indicates the original sender doesn't receive more data.

  • INVALID (flows): Indicates packet can't be identified or don't have any state.

    For example:

    • A TCP packet lands on a Virtual Machine Scale Sets instance, which doesn't have any prior history for this packet
    • Bad CheckSum packets
    • Connection Tracking table entry is full and new connections can't be accepted
    • Overly delayed ACK packets

Enable the Flow trace log using the following Azure PowerShell commands or navigate in the portal and search for Enable TCP Connection Logging:

Connect-AzAccount -Environment AzureChinaCloud 
Select-AzSubscription -Subscription <subscription_id> or <subscription_name>
Register-AzProviderFeature -FeatureName AFWEnableTcpConnectionLogging -ProviderNamespace Microsoft.Network
Register-AzResourceProvider -ProviderNamespace Microsoft.Network

It can take several minutes for this change to take effect. Once the feature is registered, consider performing an update on Azure Firewall for the change to take effect immediately.

To check the status of the AzResourceProvider registration, you can run the Azure PowerShell command:

Get-AzProviderFeature -FeatureName "AFWEnableTcpConnectionLogging" -ProviderNamespace "Microsoft.Network"

To disable the log, you can unregister it using the following command or select unregister in the previous portal example.

Unregister-AzProviderFeature -FeatureName AFWEnableTcpConnectionLogging -ProviderNamespace Microsoft.Network

To create a diagnostic setting and enable Resource Specific Table, see Create diagnostic settings in Azure Monitor.

Azure Monitor Logs tables

This section lists the Azure Monitor Logs tables relevant to this service, which are available for query by Log Analytics using Kusto queries. The tables contain resource log data and possibly more depending on what is collected and routed to them.

Azure Firewall Microsoft.Network/azureFirewalls

Activity log

The linked table lists the operations that may be recorded in the activity log for this service. This is a subset of all the possible resource provider operations in the activity log.

For more information on the schema of activity log entries, see Activity Log schema.