Keycloak Behind Azure WAF: Health Probe Failure Report
A step-by-step diagnostic guide for publishing Keycloak behind Azure Application Gateway WAF, with special focus on why health probes fail even when WAF successfully routes traffic to another web app.
Executive Summary
If the same Azure WAF/Application Gateway works with another web app, the strongest starting assumption is that the WAF listener and public frontend are broadly functional. The Keycloak-specific failure is more likely in the Application Gateway backend setting, custom probe, host header/SNI, certificate trust chain, Keycloak proxy/hostname configuration, or the choice of Keycloak health endpoint. Azure Application Gateway health probes are backend reachability checks; they are not proof that WAF policy inspection is working or failing.
Visual Analytics
Likelihood ranking for a Keycloak-specific probe failure when Azure WAF already works against a different backend.
92% likelihood. Keycloak health defaults to management port 9000, while user traffic normally targets 8080 or 8443.
88% likelihood with HTTPS backends. Application Gateway v2 custom probe host is used as host header and SNI.
82% likelihood. Keycloak requires correct hostname and proxy header configuration behind reverse proxies.
76% likelihood. Keycloak health endpoints require health-enabled=true and may live on the management or main interface depending on configuration.
70% likelihood of a network path block (NSG, UDR, or firewall) if probing 9000 or a private backend subnet. The working web app lowers but does not eliminate this risk.
58% likelihood that WAF policy is involved in the health probe failure itself. WAF policy can affect client requests, but backend health probes are primarily Application Gateway backend checks.
Root Cause Matrix
Use this table to narrow the fault domain before changing WAF rules.
| Cause | Why it breaks Keycloak probes | Validation | Fix | Confidence |
|---|---|---|---|---|
| Default probe hits root path | Application Gateway default probes call a root path on the backend setting protocol and port. Keycloak readiness is not guaranteed at /, and health endpoints may be on port 9000. | Check Backend Health details for status mismatch, 404, 403, 503, or timeout. Curl the exact probe URL from a VM in the VNet. | Create a custom probe to /health/ready on the correct port/protocol. | 95% Direct Azure and Keycloak docs. |
| Health endpoint not enabled | Keycloak health endpoints are disabled unless built with --health-enabled=true. | Run curl --head -fsS http://localhost:9000/health/ready from the Keycloak host/network. | Build/start Keycloak with health enabled; enable metrics if checks require it. | 93% Keycloak documents default as false. |
| Management port unreachable | Keycloak exposes health on management port 9000 by default; Application Gateway may target 8080/8443 only, or NSG may block 9000. | Test connectivity from the Application Gateway subnet or a peered diagnostic VM. | Use custom probe port 9000 only where private network access permits it. Do not expose 9000 publicly. | 91% Validated by Keycloak docs. |
| Host/SNI mismatch | For HTTPS probes, Application Gateway v2 uses the probe host as both Host header and SNI. If it does not match the backend certificate, health fails. | Backend Health often reports CN/SAN mismatch or certificate verification errors. Verify with OpenSSL and SNI. | Set backend host override or custom probe host to a cert-covered name; upload trusted root for private CA. | 90% Direct Azure documentation. |
| Proxy headers missing | Keycloak behind an edge or re-encrypting proxy should use --proxy-headers forwarded or xforwarded. Without proxy settings, requests arriving through the proxy can fail Keycloak's origin checks with 403. | Compare a direct backend curl with the proxied request. Use hostname debug temporarily if safe. | Configure --hostname, --proxy-headers, and trusted proxy addresses. | 87% Direct Keycloak reverse proxy warning. |
| Wrong context path | If Keycloak is served at /auth or another path, a probe to /health/ready or / may miss the actual endpoint path. | Check http-relative-path and http-management-relative-path. | Align Application Gateway path rewrite, Keycloak hostname URL, forwarded prefix, or relative path settings. | 84% Depends on deployment flags. |
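
The Host/SNI mismatch row can be checked before touching the gateway. The sketch below is a hypothetical helper (`san_covers` is not part of any tool) that tests whether a probe host name is covered by the DNS entries of a certificate's subject alternative name list. It assumes the list was produced by `openssl x509 -noout -ext subjectAltName` (OpenSSL 1.1.1 or later), handles only DNS entries, and treats a wildcard as matching exactly one leftmost label.

```shell
#!/bin/sh
# Hypothetical helper: succeed when host ($1) is covered by a DNS entry in
# the SAN text ($2), e.g. "DNS:login.example.com, DNS:*.example.com".
# IP address entries and the "X509v3 ..." header line are ignored.
san_covers() {
  _host=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  # One SAN entry per line, lowercased, with spaces and tabs removed.
  _entries=$(printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]' | tr ',' '\n' | tr -d ' \t')
  while IFS= read -r _entry; do
    case "$_entry" in
      dns:*) _name="${_entry#dns:}" ;;
      *) continue ;;
    esac
    # Exact name match.
    [ "$_name" = "$_host" ] && return 0
    # Wildcard entry: "*." plus parent domain, matching one leftmost label.
    case "$_name" in
      '*.'*)
        case "$_host" in
          *.*) [ "${_host#*.}" = "${_name#??}" ] && return 0 ;;
        esac
        ;;
    esac
  done <<EOF
$_entries
EOF
  return 1
}

# Usage sketch (placeholders as in the Command Checklist):
#   san=$(openssl s_client -connect <keycloak-host>:8443 \
#           -servername <expected-sni-host> </dev/null 2>/dev/null \
#         | openssl x509 -noout -ext subjectAltName)
#   san_covers <expected-sni-host> "$san" && echo covered || echo MISMATCH
```

If the probe host is an IP address or an internal name that the certificate does not list, this check fails, which is consistent with the CN/SAN mismatch messages described in the table.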
Step-by-Step Troubleshooting Workflow
- Confirm the failure is backend health, not WAF frontend routing. In Application Gateway, open Backend Health for the Keycloak pool and capture the exact health message.
- Compare working web app settings to Keycloak settings. Compare listener, rule, backend pool, backend setting, probe, protocol, port, host override, trusted root certificate, and timeout.
- Stop using the default probe for Keycloak. The default probe is usually too generic for Keycloak. Use an explicit custom readiness probe.
- Validate Keycloak health locally. From the Keycloak host, pod, or same-subnet diagnostic VM, test http://<keycloak>:9000/health/ready. Healthy should return HTTP 200; unhealthy returns 503.
- Validate Application Gateway can reach the same endpoint. Confirm NSG, UDR, firewall, and ingress allow traffic from the Application Gateway subnet to the selected Keycloak probe port.
- Fix HTTPS/SNI before changing WAF rules. If using HTTPS backend settings, make the probe host and backend host match the certificate SAN. Upload private root CA certificates if needed.
- Configure Keycloak for reverse proxy reality. For edge TLS termination, Keycloak should generally run with HTTP enabled internally, fixed external hostname, and proxy headers enabled.
- Keep health private. Keycloak recommends not exposing /health or port 9000 publicly. Use private probe access only.
- Re-test full identity flow after backend health turns green. Validate OIDC discovery, login redirect, token endpoint, static resources, and admin exposure separately.
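The local validation steps in this workflow can be sketched as two small shell helpers. The function names `status_in_range` and `probe_url` are hypothetical, and the 200-399 default is the status range quoted in this report. The sketch assumes curl is available on the diagnostic host and only reports what each candidate URL returns; it does not change any configuration.

```shell
#!/bin/sh
# Succeed when an HTTP status code ($1) falls inside the match range
# ($2-$3, defaulting to 200-399).
status_in_range() {
  _code="$1"; _low="${2:-200}"; _high="${3:-399}"
  case "$_code" in
    [0-9][0-9][0-9]) ;;
    *) return 1 ;;
  esac
  [ "$_code" -ge "$_low" ] && [ "$_code" -le "$_high" ]
}

# Print "<status> <url>" for one candidate probe URL.
# An optional second argument is sent as the Host header, to mimic the
# probe host. curl reports 000 when it cannot connect at all.
probe_url() {
  _url="$1"; _host_header="${2:-}"
  if [ -n "$_host_header" ]; then
    _status=$(curl --silent --output /dev/null --max-time 10 \
      --write-out '%{http_code}' --header "Host: $_host_header" "$_url") || true
  else
    _status=$(curl --silent --output /dev/null --max-time 10 \
      --write-out '%{http_code}' "$_url") || true
  fi
  printf '%s %s\n' "${_status:-000}" "$_url"
}

# Usage sketch: compare the management port with the main port.
#   probe_url http://<keycloak-private-host>:9000/health/ready
#   probe_url http://<keycloak-private-host>:8080/
#   status_in_range 503 || echo "503 is outside 200-399, probe would fail"
```

Running both URLs side by side shows quickly whether the endpoint the gateway is probing is the one Keycloak actually serves.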
Command Checklist
Replace placeholders before running. These are diagnostic examples, not deployment instructions.
az network application-gateway show-backend-health --resource-group <rg> --name <app-gateway-name>
az network application-gateway show --resource-group <rg> --name <app-gateway-name> --query "{probes:probes, backendHttpSettings:backendHttpSettingsCollection}"
Test-NetConnection -ComputerName <keycloak-private-fqdn-or-ip> -Port 9000
curl --head -fsS http://<keycloak-private-host>:9000/health/ready
openssl s_client -connect <keycloak-host>:8443 -servername <expected-sni-host> -showcerts
Recommended Target Configuration
Application Gateway Probe
Custom probe, path /health/ready, expected status 200-399, explicit host/SNI when HTTPS, and custom port 9000 only if using v2 and the network allows it.
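A probe like the one described above could be created with the Azure CLI roughly as follows. This is a sketch, not a deployment script: the function name and its five arguments are hypothetical placeholders, the interval, timeout, and threshold values are illustrative, and `--protocol Http` assumes the management interface is served over plain HTTP. If Keycloak has TLS configured, the management port may serve HTTPS instead, in which case the protocol and host would need to match the certificate.

```shell
#!/bin/sh
# Sketch only. Arguments (all placeholders):
#   $1 resource group      $2 Application Gateway name
#   $3 new probe name      $4 existing backend HTTP settings name
#   $5 probe host (Host header, and SNI for HTTPS probes)
create_keycloak_probe() {
  # Custom readiness probe on the management port (v2 SKU only).
  az network application-gateway probe create \
    --resource-group "$1" \
    --gateway-name "$2" \
    --name "$3" \
    --protocol Http \
    --host "$5" \
    --port 9000 \
    --path /health/ready \
    --match-status-codes 200-399 \
    --interval 30 --timeout 30 --threshold 3

  # Attach the probe to the backend setting used by the Keycloak rule.
  az network application-gateway http-settings update \
    --resource-group "$1" \
    --gateway-name "$2" \
    --name "$4" \
    --probe "$3"
}

# Usage sketch:
#   create_keycloak_probe <rg> <app-gateway-name> keycloak-ready <settings-name> <probe-host>
```

After the update, Backend Health should be re-checked, since a wrong port or protocol here shows up as a timeout or status mismatch rather than as a WAF block.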
Keycloak Runtime
Build with --health-enabled=true. Use --hostname https://<public-host>. For edge termination, enable internal HTTP and configure --proxy-headers xforwarded or forwarded.
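As a minimal sketch of these runtime settings for an edge-terminated deployment, the commands could look like the following. `KC` and `PUBLIC_HOST` are hypothetical variables, `login.example.com` is a placeholder, and database, cache, and other production options are deliberately omitted. The choice of `xforwarded` over `forwarded` is an assumption that depends on which headers the proxy in front of Keycloak actually sets.

```shell
#!/bin/sh
# Placeholders: path to the Keycloak launcher and the public host name.
KC="${KC:-bin/kc.sh}"
PUBLIC_HOST="${PUBLIC_HOST:-login.example.com}"

start_keycloak_edge() {
  # health-enabled is a build option, so it is set at build time.
  "$KC" build --health-enabled=true

  # Edge TLS termination: HTTP internally, fixed external hostname,
  # and proxy headers trusted from the reverse proxy.
  "$KC" start --optimized \
    --hostname "https://${PUBLIC_HOST}" \
    --http-enabled true \
    --proxy-headers xforwarded
}

# Usage sketch:
#   PUBLIC_HOST=<public-host> start_keycloak_edge
```

Setting `KC=echo` before calling the function prints the two command lines instead of running them, which is a cheap way to review the flags first.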
Security Boundary
Do not publicly expose Keycloak management port 9000, health, metrics, or admin paths unless explicitly required and protected. Allow probe traffic only from trusted internal paths.
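One way to keep probe access private is an NSG rule that admits port 9000 only from the Application Gateway subnet. The sketch below assumes an NSG is attached to the Keycloak subnet or NIC. The function name, rule name, and default priority are hypothetical, and the rule only helps if no broader allow rule already exposes the port.

```shell
#!/bin/sh
# Sketch only. Arguments (all placeholders):
#   $1 resource group   $2 NSG protecting the Keycloak backend
#   $3 Application Gateway subnet CIDR   $4 rule priority (default 200)
allow_probe_from_gateway() {
  az network nsg rule create \
    --resource-group "$1" \
    --nsg-name "$2" \
    --name allow-appgw-keycloak-health \
    --priority "${4:-200}" \
    --direction Inbound \
    --access Allow \
    --protocol Tcp \
    --source-address-prefixes "$3" \
    --destination-port-ranges 9000
}

# Usage sketch:
#   allow_probe_from_gateway <rg> <keycloak-nsg> <appgw-subnet-cidr>
```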
Verdict
The most likely explanation is not that Azure WAF is broken. The fact that WAF works against another web app strongly suggests a Keycloak-specific backend health configuration issue: wrong probe path/port, missing health enablement, host/SNI mismatch, certificate trust issue, proxy header configuration, or relative path mismatch.
Evidence, Interpretation, and Limitations
| Item | Evidence | Interpretation | Limitation | Confidence |
|---|---|---|---|---|
| Application Gateway probes | Microsoft documents default/custom probe behavior, status-code matching, path, host, port, timeout, and unhealthy threshold. | Probe configuration can fail independently even when the frontend listener and WAF route another app correctly. | Actual gateway settings were not provided. | 96% Primary vendor documentation. |
| Keycloak health | Keycloak documents health endpoints, HTTP 200/503 behavior, health-enabled, and default management port 9000. | Application Gateway must probe the endpoint Keycloak actually exposes, not a generic web root. | Keycloak version and startup flags were not provided. | 94% Primary product documentation. |
| WAF comparison | Operational observation: WAF appears to work with another web app. | This narrows, but does not fully prove, the issue is Keycloak backend-specific rather than global WAF/frontend failure. | A shared listener, rule, or policy difference could still exist. | 78% User-provided environment signal. |
Sources
References used to validate the report. Vendor documentation receives the highest evidentiary weight.
- Microsoft Learn: Application Gateway health probes overview - default probe URL, status-code range, custom probe settings, host/SNI behavior, NSG considerations.
- Microsoft Learn: Troubleshoot backend health issues in Application Gateway - unhealthy/unknown states, status mismatch, DNS, TCP, TLS, certificate, and root certificate troubleshooting.
- Microsoft Learn: Troubleshoot 502 errors in Application Gateway - 502 causes including NSG/UDR/DNS, default probe failures, unhealthy backend pools, and upstream SSL mismatch.
- Keycloak Docs: Tracking instance status with health checks - health endpoints, health-enabled, 200/503 responses, default management port usage.
- Keycloak Docs: Configuring a reverse proxy - proxy ports, management port 9000, proxy headers, context paths, exposed path recommendations, sticky sessions.
- Keycloak Docs: Configuring the hostname v2 - hostname requirement, reverse proxy options, edge TLS termination, proxy headers, hostname debug.
- Keycloak Docs: Configuring the Management Interface - management port, relative path, TLS behavior, and management health settings.