502 - Bad Gateway
HTTP 502 Bad Gateway means a gateway or proxy received an invalid response from an upstream server.
Last reviewed: April 15, 2026 | Source-backed guidance under our editorial policy
What Does Bad Gateway Mean?
The gateway reached an upstream target, but the upstream response could not be parsed, trusted, or completed cleanly. That makes 502 primarily an edge-to-origin integrity problem, not a generic application crash.
Common Causes
- Reverse proxy points to an outdated upstream port or protocol after rollout, yielding malformed or empty responses from the target.
- Gateway TLS handshake with the target fails because SNI, cert chain, or trust policy does not match backend configuration.
- Upstream service emits invalid status or header formatting and the gateway rejects the response as malformed.
- One shard resets or closes the upstream connection mid-response, so the proxy returns 502 even though routing succeeded.
- Service discovery or DNS drift sends traffic to the wrong upstream that speaks a different protocol.
How to Fix Bad Gateway
1. Identify which upstream pool member is emitting invalid responses and drain unhealthy targets immediately.
2. Validate upstream protocol correctness, header integrity, and TLS chain compatibility with gateway expectations.
3. Retest through the same edge path after fixing upstream response integrity and connection settings.
4. Roll back the upstream or proxy config that introduced protocol drift if failures began right after a deploy.
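For step 1, one quick way to find the offending pool member is to tally gateway error entries by upstream target. The sketch below assumes nginx-style error log text in which each entry carries an `upstream: "scheme://host:port/path"` field, as in the sample log on this page; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the upstream field of an nginx-style error log entry, for example:
#   upstream: "http://10.0.3.24:8080/report"
# The field layout is an assumption based on the sample log on this page.
UPSTREAM_RE = re.compile(r'upstream: "(?P<scheme>[a-z0-9+.-]+)://(?P<target>[^/"]+)')


def count_errors_by_upstream(log_text: str) -> Counter:
    """Count error entries per upstream host:port found in the log text."""
    return Counter(m.group("target") for m in UPSTREAM_RE.finditer(log_text))


if __name__ == "__main__":
    sample = (
        'upstream sent invalid header ... upstream: "http://10.0.3.24:8080/report"\n'
        'upstream sent invalid header ... upstream: "http://10.0.3.24:8080/export"\n'
        'SSL_do_handshake() failed ... upstream: "https://10.0.3.25:8443/report"\n'
    )
    # The target with the highest count is the first candidate to drain.
    for target, count in count_errors_by_upstream(sample).most_common():
        print(target, count)
```

A single target dominating the tally points at one bad pool member; an even spread suggests a gateway-side or pool-wide config problem instead.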
Step-by-Step Diagnosis for Bad Gateway
1. Capture gateway error logs with upstream endpoint, connection ID, and parse or TLS failure reason for 502 events.
2. Compare successful versus failing upstream responses at raw status line, header, and framing level.
3. Inspect TLS handshake, SNI, and certificate trust between gateway and upstream services.
4. Check whether routing, service discovery, or protocol upgrade settings changed near the first 502 spike.
5. Retest with canary upstream routing to confirm failure is isolated to specific upstream instances or config paths.
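Step 2 can be partly automated once raw response heads have been captured from a healthy and a failing target. The following is a minimal sketch, not a full HTTP parser: it checks the status line shape, header field-name characters, control bytes in values, and a numeric Content-Length. The function name is hypothetical, and real gateways apply stricter or looser rules than this.

```python
import re

# Characters allowed in a header field name (the HTTP "token" grammar).
TOKEN_RE = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
# Lenient status-line shape: HTTP-version, 3-digit code, optional reason phrase.
STATUS_LINE_RE = re.compile(rb"HTTP/\d\.\d \d{3}( .*)?")


def find_header_problems(raw_head: bytes) -> list:
    """Return human-readable problems found in a raw HTTP/1.x response head."""
    problems = []
    lines = raw_head.split(b"\r\n")
    if not STATUS_LINE_RE.fullmatch(lines[0]):
        problems.append(f"bad status line: {lines[0]!r}")
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header block
        name, sep, value = line.partition(b":")
        if not sep or not TOKEN_RE.fullmatch(name):
            problems.append(f"bad header line: {line!r}")
            continue
        if name.lower() == b"content-length" and not value.strip().isdigit():
            problems.append(f"non-numeric Content-Length: {value.strip()!r}")
        # Control bytes other than horizontal tab are not valid in field values.
        if any((b < 0x20 and b != 0x09) or b == 0x7F for b in value):
            problems.append(f"control byte in value of {name!r}")
    return problems


if __name__ == "__main__":
    failing = b'HTTP/1.1 200 OK\r\ncontent-length: NaN\r\n\r\n'
    print(find_header_problems(failing))
```

Running the same check over a head captured from a healthy target gives the baseline; any difference between the two result lists is a candidate for the parse failure the gateway reported.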
Seen in Production
- Gateway starts returning 502 right after a backend release because one shard emits an invalid header value and the edge parser rejects every response from that shard.
- After a config change, the ingress speaks plain HTTP to a backend that expects HTTPS, so the gateway receives bytes it cannot parse as an HTTP response and surfaces 502.
- Certificate chain drift on one regional upstream pool makes TLS validation fail only for a subset of requests, creating intermittent 502s.
Protocol and Header Integrity Inspection
- Inspect malformed upstream payload framing (example: invalid chunk termination or truncated header block causes proxy parse failure).
- Verify upstream status and header semantics are RFC-compliant (example: illegal header bytes returned after a reverse-proxy upgrade).
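To make the framing example concrete, here is a small sketch that walks a captured chunked-transfer body and reports the first framing defect it finds (bad chunk size, truncated data, missing CRLF after a chunk, or a missing final blank line). The function name is hypothetical and the messages are illustrative; a gateway's own parser will word its errors differently.

```python
from typing import Optional

HEX_DIGITS = b"0123456789abcdefABCDEF"


def chunked_body_problem(body: bytes) -> Optional[str]:
    """Return the first framing problem in a chunked body, or None if well-formed."""
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol == -1:
            return "truncated: chunk-size line has no CRLF"
        # Drop any chunk extension after ';' and surrounding whitespace.
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        if not size_field or any(c not in HEX_DIGITS for c in size_field):
            return f"invalid chunk size: {size_field!r}"
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            # Last chunk: either an immediate blank line, or trailer fields
            # followed by a blank line.
            if body[pos:pos + 2] == b"\r\n" or body.find(b"\r\n\r\n", pos) != -1:
                return None
            return "truncated: missing final CRLF after last chunk"
        end = pos + size
        if end + 2 > len(body):
            return "truncated: chunk data shorter than declared size"
        if body[end:end + 2] != b"\r\n":
            return "invalid chunk termination: data not followed by CRLF"
        pos = end + 2


if __name__ == "__main__":
    print(chunked_body_problem(b"5\r\nhello\r\n0\r\n\r\n"))  # well-formed
    print(chunked_body_problem(b"5\r\nhel"))                  # cut off mid-chunk
```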
Gateway TLS and Routing Audit
- Trace TLS negotiation details and trust-store alignment (example: rotated intermediate cert is not trusted by edge gateway nodes).
- Validate routing and service discovery correctness (example: gateway resolves to a deprecated upstream port speaking the wrong protocol).
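The TLS side of this audit can be reproduced from a gateway node with a short probe. The sketch below uses Python's standard `ssl` module to attempt a verified handshake with an explicit SNI name and reports which stage failed. The function name, the result keys, and the `cafile` parameter (a path to the trust bundle the gateway is assumed to use) are illustrative choices, not part of any gateway's tooling.

```python
import socket
import ssl
from typing import Optional


def probe_upstream_tls(host: str, port: int, sni: str,
                       cafile: Optional[str] = None,
                       timeout: float = 5.0) -> dict:
    """Attempt a verified TLS handshake to host:port using the given SNI name."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=sni) as tls:
                return {"ok": True, "stage": None,
                        "protocol": tls.version(), "error": None}
    except ssl.SSLCertVerificationError as exc:
        # Chain, trust-store, or hostname mismatch.
        return {"ok": False, "stage": "trust", "error": exc.verify_message}
    except ssl.SSLError as exc:
        # Handshake failed for another reason, e.g. the port is not speaking TLS.
        return {"ok": False, "stage": "handshake", "error": str(exc)}
    except OSError as exc:
        # Refused, reset, unreachable, or timed out.
        return {"ok": False, "stage": "network", "error": str(exc)}


if __name__ == "__main__":
    # Host and port taken from the openssl example on this page.
    print(probe_upstream_tls("upstream.internal", 8443, sni="upstream.internal"))
```

A `trust` failure points at cert chain or trust-store drift, while a `handshake` failure against a port that should speak TLS often means the target is serving plain HTTP or another protocol.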
Decision Shortcut: Invalid Response vs Unavailable Upstream
- If the gateway connects and immediately reports parse, protocol, or TLS validation errors, stay in the 502 branch before investigating capacity or timeout tuning.
- If the upstream is healthy in direct checks but only fails through one edge tier, prioritize proxy config, trust store, and protocol translation over application code.
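The first shortcut can be encoded as a rough triage helper over gateway log reasons. The marker strings below are modeled on common nginx error-log wording and are an assumption; other gateways phrase these conditions differently, so the tuples would need adapting. The branch names describe where to investigate first, not which status code the client saw.

```python
# Phrases modeled on common nginx error-log wording (an assumption; adapt
# these tuples to the wording your own gateway actually emits).
INVALID_RESPONSE_MARKERS = (
    "upstream sent invalid header",
    "upstream prematurely closed connection",
    "SSL_do_handshake() failed",
)
UNAVAILABLE_MARKERS = (
    "upstream timed out",
    "connect() failed",
    "no live upstreams",
)


def triage_branch(log_reason: str) -> str:
    """Map a gateway log reason to the investigation branch to start in."""
    if any(marker in log_reason for marker in INVALID_RESPONSE_MARKERS):
        return "invalid-response"      # parse, protocol, or TLS integrity
    if any(marker in log_reason for marker in UNAVAILABLE_MARKERS):
        return "unavailable-or-slow"   # capacity, reachability, or timeouts
    return "unknown"


if __name__ == "__main__":
    reason = ('upstream sent invalid header: "content-length: NaN" '
              "while reading response header from upstream")
    print(triage_branch(reason))
```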
Wrong Fix to Avoid
- Do not only scale the gateway if the upstream response is malformed or the TLS handshake is failing; more edge capacity will not fix protocol drift.
- Do not bucket every 502 under generic server errors when the real issue lives on the gateway-to-upstream boundary.
Implementation Examples
2026/04/13 10:17:42 [error] 31#31: *884 upstream sent invalid header:
"content-length: NaN" while reading response header from upstream,
client: 203.0.113.10, request: "GET /v1/report HTTP/1.1",
upstream: "http://10.0.3.24:8080/report", host: "api.example.com"

curl -vk https://api.example.com/v1/report
openssl s_client -connect upstream.internal:8443 -servername upstream.internal </dev/null

location /api/ {
    proxy_pass https://upstream_api_pool;
    proxy_http_version 1.1;
    proxy_set_header Host $host;
    proxy_ssl_server_name on;
}

Incident Timeline
10:16 UTC
A release or cert change introduces upstream incompatibility
Signal: An upstream shard, cert chain, or route mapping changes shortly before the first 502 spike appears at the gateway.
Why it matters: That timing strongly favors edge-to-origin integrity drift over a generic application crash hypothesis.
10:17 UTC
The gateway connects but rejects what comes back
Signal: Error logs show invalid header bytes, malformed framing, TLS trust failure, or protocol gibberish from one upstream target.
Why it matters: A successful TCP connection plus a parser or trust failure is classic 502 territory. This is not yet a timeout or overload problem.
10:20 UTC
Scaling the edge does not reduce the error rate
Signal: More proxy instances are added, but every request routed to the bad upstream still fails with the same parser or handshake reason.
Why it matters: That confirms the defect lives at the boundary with one origin path, not in edge capacity.
10:27 UTC
Draining the bad upstream and restoring protocol parity clears the incident
Signal: Traffic is shifted away from the misconfigured target or the TLS/protocol mismatch is fixed, and 502 drops without changing application handlers.
Why it matters: The win condition is a clean gateway-to-origin contract, not just fewer visible errors at the edge.
Seen in Production
Upstream release returns malformed headers behind gateway
Frequency: common
Example: A new backend build emits an invalid header value, and the edge proxy rejects the response with 502 for all requests to that shard.
Fix: Roll back the malformed release, add header-schema checks in CI, and canary-test through the real gateway parser before global rollout.
Protocol mismatch after proxy or service discovery change
Frequency: common
Example: Gateway starts speaking HTTP to a backend that now expects HTTPS after an upstream pool update.
Fix: Align upstream protocol configuration and add smoke tests that verify the exact gateway-to-origin scheme and port.
Certificate chain drift on one upstream cluster
Frequency: rare
Example: After a cert rotation, the gateway fails TLS validation for only one regional upstream group and returns intermittent 502s.
Fix: Repair cert chain/trust configuration and enforce pre-rotation TLS validation tests from edge nodes.
Service mesh or ingress injects the wrong upstream protocol
Frequency: medium
Example: A sidecar upgrade changes the protocol one route expects from HTTP to h2c or HTTPS, and the proxy now speaks the wrong protocol to the origin.
Fix: Pin the intended upstream protocol per route and validate edge-to-origin handshakes in canary before full rollout.
Wrong Fix vs Better Fix
Scale the gateway vs isolate the bad upstream
Wrong fix: Add more edge capacity because users are seeing a lot of 502s.
Better fix: Identify the upstream member or route emitting invalid responses and drain or repair that exact target first.
Why this is better: More gateway replicas do not fix malformed headers, broken TLS trust, or protocol mismatch from the same origin.
Raise timeouts vs inspect response integrity
Wrong fix: Treat the incident like a timeout and increase gateway read timeouts immediately.
Better fix: Inspect raw header, framing, and TLS negotiation evidence first to prove whether the upstream response is valid at all.
Why this is better: 502 means the response came back in a form the gateway could not trust or parse. Longer waits do not help if the bytes are still wrong.
Debug app handlers first vs debug the edge-origin contract
Wrong fix: Start from application business logic before checking what the gateway actually received from upstream.
Better fix: Use gateway logs, TLS diagnostics, and direct origin probes to verify protocol, trust, and framing on the failing path first.
Why this is better: The fastest 502 fixes usually come from the gateway boundary, not from rewriting handler code that never had a chance to run cleanly through the edge.
Debugging Tools
- Gateway upstream error logs (parse/TLS/connect failures)
- TLS handshake diagnostics (openssl s_client, gateway TLS logs)
- Packet capture on gateway-upstream hop
- Canary probes per upstream pool member
How to Verify the Fix
- Re-run failing requests and confirm the gateway receives parseable, valid upstream responses.
- Validate that the error rate per upstream target remains stable during normal and peak traffic.
- Confirm synthetic gateway probes to each upstream node stay green over sustained intervals.
- Check that raw header captures and TLS handshakes stay clean after the rollback or config fix.
How to Prevent Recurrence
- Gate upstream deployments with protocol-conformance tests and edge compatibility checks.
- Continuously probe DNS, TLS, and upstream health paths with synthetic checks.
- Use outlier detection and automatic ejection for upstream nodes returning invalid responses.
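The outlier-detection idea can be illustrated with a deliberately simple model: eject an upstream once it returns a set number of consecutive invalid responses. Production proxies ship their own outlier detection with re-admission and ejection limits; the class below is a hypothetical sketch of the core rule only, and the default threshold is an arbitrary assumption.

```python
from collections import defaultdict


class ConsecutiveFailureEjector:
    """Eject an upstream after N consecutive invalid responses (illustrative only)."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self._streak = defaultdict(int)  # upstream -> current failure streak
        self.ejected = set()

    def record(self, upstream: str, valid: bool) -> bool:
        """Record one response outcome; return True if this call ejected the upstream."""
        if valid:
            self._streak[upstream] = 0  # any valid response resets the streak
            return False
        self._streak[upstream] += 1
        if self._streak[upstream] >= self.threshold and upstream not in self.ejected:
            self.ejected.add(upstream)
            return True
        return False

    def healthy(self, pool):
        """Return the pool members that have not been ejected, in order."""
        return [u for u in pool if u not in self.ejected]


if __name__ == "__main__":
    ejector = ConsecutiveFailureEjector(threshold=3)
    pool = ["10.0.3.24:8080", "10.0.3.25:8080"]
    for _ in range(3):
        ejector.record("10.0.3.24:8080", valid=False)
    print(ejector.healthy(pool))
```

This sketch never re-admits an ejected node; a real deployment would pair ejection with a timed health probe before returning the target to rotation.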
Pro Tip
- Capture sampled raw upstream response headers at the gateway for 502s so parser failures can be diagnosed without reproducing live incidents.
Decision Support
Compare Guide
500 Internal Server Error vs 502 Bad Gateway: Root Cause
Debug 500 vs 502 faster: use 500 for origin failures and 502 for invalid upstream responses at gateways, then route incidents to the right team.
Compare Guide
502 Bad Gateway vs 504 Gateway Timeout: Key Differences
Fix upstream errors faster: use 502 when a gateway gets an invalid upstream response, and 504 when the upstream service exceeds your timeout budget.
Playbook
API Timeout Playbook (502 / 504 / DEADLINE_EXCEEDED)
Use this playbook to separate invalid upstream responses from upstream wait expiration and deadline exhaustion, and apply timeout budgets, safe retries, and circuit-breaker controls safely.
Provider Context
This guidance is specific to HTTP services. Always validate implementation details against official provider documentation before deploying to production.