GCP

DEADLINE_EXCEEDED

GCP DEADLINE_EXCEEDED means the client deadline expired before a response arrived, even though the server-side work might have completed.

Last reviewed: April 15, 2026 | Source-backed guidance under our editorial policy

Start Here

Use the closest compare guide, playbook, or adjacent error page to narrow the decision faster before you start changing production systems.

Playbook: API Timeout Playbook (502 / 504 / DEADLINE_EXCEEDED)
Related: UNAVAILABLE, CANCELLED, INTERNAL

This page is part of the Error Reference library. Learn more about the project or report a correction.

What Does Deadline Exceeded Mean?

The caller or an intermediary gave up waiting before the response arrived, but server-side work might still have completed or may still be running. That makes DEADLINE_EXCEEDED both a latency-budget problem and a duplicate-work risk when clients retry timed-out mutations carelessly.

Common Causes

- Client deadline is too short for real backend processing time.
- Queueing, cold starts, or dependency latency exceed the timeout budget.
- Deadline is not propagated consistently, so inner hops keep working after outer callers have already given up.
- Network path variability delays responses beyond the deadline.
- Retries stack onto already slow operations and exhaust budgets.
- Connection warm-up, DNS, or TLS setup consumes a large share of the deadline before useful work begins.

How to Fix Deadline Exceeded

1. Capture end-to-end traces and identify which hop or queue phase burned most of the deadline.
2. Align deadlines across caller, gateway, and backend so budgets are monotonic and intentional.
3. Reduce tail latency in the slow hop before simply widening global deadlines.
4. Do not assume DEADLINE_EXCEEDED is generically retryable; use idempotency or reconciliation controls before enabling bounded retries for explicitly safe paths.

Step-by-Step Diagnosis for Deadline Exceeded

1. Capture end-to-end latency distribution, per-hop timing, configured deadline, and remaining deadline budget for failed calls.
2. Compare configured deadlines with observed p95/p99 latency under representative load and during cold starts or dependency brownouts.
3. Check whether timed-out mutating calls may have completed server-side after deadline expiry.
4. Inspect retry fan-out, queueing, and connection warm-up effects that can consume most of the deadline before useful work starts.
5. Adjust timeout budgets and retry policy, then retest with controlled load and trace sampling.
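Step 2 can be sketched as a small headroom check. This is a minimal illustration, not a monitoring integration: the function names, the nearest-rank percentile method, and the sample latencies are all assumptions chosen to mirror the 1-second deadline and 1.8-second p99 scenario described on this page.

```python
def percentile(samples_ms, pct):
    """Nearest-rank percentile of a non-empty list of integer latencies."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of pct * n / 100 gives the 1-based nearest rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def deadline_headroom(samples_ms, deadline_ms):
    """Report p95/p99 and how much of the configured deadline each leaves."""
    report = {}
    for pct in (95, 99):
        value = percentile(samples_ms, pct)
        report[f"p{pct}_ms"] = value
        report[f"p{pct}_headroom_ms"] = deadline_ms - value
    return report


# Hypothetical samples: mostly fast calls with a slow tail after a rollout.
samples = [400] * 95 + [900, 1200, 1500, 1800, 1800]
report = deadline_headroom(samples, deadline_ms=1000)
print(report)
```

A negative headroom value at p99 suggests that the deadline will expire on tail requests even when the median looks healthy, which is the case worth investigating before changing any budget.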

Seen in Production

- A service keeps a 1-second client deadline after a rollout, but a dependency path now takes 1.8 seconds at p99, so every burst starts returning DEADLINE_EXCEEDED.
- Cloud Run cold starts plus DNS and TLS warm-up consume most of the request budget before the application reaches the slow dependency.
- A mutating RPC times out client-side, then completes on the server moments later, and a blind retry creates duplicate effects.
- Internal retries inside one service consume the full budget, so the outer caller sees DEADLINE_EXCEEDED even though the final downstream hop barely ran.

Deadline Budget and Propagation Audit

- Align deadlines across client, gateway, and backend to prevent premature caller timeout (example of a mismatch: client deadline shorter than gateway upstream timeout).
- Trace remaining deadline budget at each hop to prove where most of the time is burned (example: connect, TLS, and queue wait leave only 200 ms for actual work).
- Separate connect, read, and total-request budgets where supported (example: short connect timeout with longer processing timeout for heavy methods).
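One way to make the remaining budget visible at each hop is to carry a single absolute deadline and derive every downstream timeout from it. The sketch below is an assumption-laden illustration: `Deadline`, `FakeClock`, and the `reserve_ms` parameter are hypothetical names, and a real service would read the deadline from its RPC framework rather than construct it by hand.

```python
import time


def monotonic_ms():
    """Current monotonic time in whole milliseconds."""
    return time.monotonic_ns() // 1_000_000


class Deadline:
    """An absolute deadline that every hop on the call path shares."""

    def __init__(self, budget_ms, clock=monotonic_ms):
        self._clock = clock
        self._expires_at = clock() + budget_ms

    def remaining_ms(self):
        """Budget left right now, never negative."""
        return max(0, self._expires_at - self._clock())

    def child_timeout_ms(self, reserve_ms=0):
        """Timeout for the next hop, keeping a reserve for our own response."""
        return max(0, self.remaining_ms() - reserve_ms)


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        return self.now


clock = FakeClock()
deadline = Deadline(budget_ms=1000, clock=clock)

clock.now += 800  # connect, TLS, and queue wait
remaining_after_setup = deadline.remaining_ms()
downstream_timeout = deadline.child_timeout_ms(reserve_ms=50)
print(remaining_after_setup, downstream_timeout)
```

Logging `remaining_ms()` at each service boundary gives the per-hop evidence the audit asks for, and deriving child timeouts from it keeps inner hops from outliving their caller.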

Tail Latency and Queue Burn

- Inspect which span or queue phase consumes most of the deadline (example: pool starvation or cold-start work burns 80% of the budget before the DB call even begins).
- Correlate latency spikes with retry storms, dependency saturation, or autoscaling lag (example: internal retries create enough extra load to guarantee timeouts).
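Ranking spans by their share of the deadline can be sketched as follows. The span dictionary shape (`name`, `start_ms`, `end_ms`) and the span names are hypothetical; exported trace data from a real tracing backend will have a different schema and would need mapping first.

```python
def budget_burn(spans, deadline_ms):
    """Return (name, duration_ms, share_of_deadline) rows, largest first."""
    rows = []
    for span in spans:
        duration = span["end_ms"] - span["start_ms"]
        rows.append((span["name"], duration, duration / deadline_ms))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical trace: pool starvation burns most of a 1000 ms deadline
# before the database call even begins.
spans = [
    {"name": "pool.wait", "start_ms": 0, "end_ms": 800},
    {"name": "db.query", "start_ms": 800, "end_ms": 1000},
]
rows = budget_burn(spans, deadline_ms=1000)
for name, duration, share in rows:
    print(f"{name}: {duration} ms ({share:.0%} of deadline)")
```

The top row is the place to look first; widening the deadline would not help while one span keeps taking that share.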

Post-Timeout Side-Effect Safety

- Treat timed-out mutations as potentially committed (example: retrying a create without an idempotency key may duplicate the resource).
- Use request IDs or idempotency tokens to reconcile uncertain completion after timeout events.
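The idea can be illustrated with an in-memory stand-in for a server that deduplicates by idempotency key. `IdempotentStore` and its methods are hypothetical, and whether a particular GCP API accepts a client-supplied request ID must be checked in that API's documentation.

```python
import uuid


class IdempotentStore:
    """In-memory stand-in for a backend that deduplicates creates by key."""

    def __init__(self):
        self._results = {}
        self.items = []

    def create(self, key, payload):
        """Create once per key; a repeated key returns the first result."""
        if key in self._results:
            return self._results[key]
        self.items.append(payload)
        result = {"id": len(self.items), "payload": payload}
        self._results[key] = result
        return result

    def lookup(self, key):
        """Reconciliation check: did a request with this key commit?"""
        return self._results.get(key)


store = IdempotentStore()
key = str(uuid.uuid4())  # generated once by the client, reused on retry

# First attempt commits server-side; imagine the client timed out and
# never saw this response.
first = store.create(key, {"sku": "A-102"})

# After the timeout the client reconciles, then retries with the same key.
known = store.lookup(key)
retry = store.create(key, {"sku": "A-102"})
print(len(store.items), retry == first)
```

Because the retry reuses the original key, the uncertain first attempt and the retry collapse into one resource instead of two.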

Wrong Fix to Avoid

- Do not just raise one caller deadline globally if one downstream hop is still burning most of the budget.
- Do not treat DEADLINE_EXCEEDED as generically retryable; only enable bounded retries for explicitly safe paths, and never blindly retry timed-out writes without idempotency or reconciliation evidence.
- Do not treat DEADLINE_EXCEEDED as generic network noise when traces already show one hot span or queue path consuming the entire budget.

Implementation Examples

Seen in production: caller deadline expires while work is still in flight (JSON)

{
  "requestId": "req_7cb203",
  "deadlineMs": 1000,
  "remainingDeadlineMs": 0,
  "slowSpan": "inventory.Lookup",
  "latencyMs": 1187,
  "status": "DEADLINE_EXCEEDED"
}

Reproduce one deadline boundary with grpcurl (bash)

grpcurl -max-time 1 \
  -d '{"sku":"A-102"}' \
  api.example.internal:443 inventory.InventoryService/GetItem

Trace evidence: timeout fired before server completion was reconciled (JSON)

{
  "requestId": "req_write_19a",
  "clientStatus": "DEADLINE_EXCEEDED",
  "serverCommitted": true,
  "commitTs": "2026-04-15T14:13:07Z",
  "retryAttempt": 1
}

Incident Timeline

14:09 UTC

Latency begins to rise before callers actually time out

Signal: Cold starts, queue wait, internal retries, or one dependency span begins burning a larger share of each request budget.

Why it matters: The earliest useful signal is deadline burn, not the final error code.

14:12 UTC

The caller gives up before the response can return

Signal: Client or intermediary budget reaches zero while work is still in flight, and DEADLINE_EXCEEDED starts appearing in logs and traces.

Why it matters: At this point the system has both a latency problem and a completion-ambiguity problem for timed-out mutations.

14:13 UTC

Blind retries can stack onto still-running work

Signal: Clients immediately retry timed-out operations while original attempts may still complete server-side moments later.

Why it matters: Without idempotency or reconciliation, the timeout can turn into duplicate work rather than clean recovery.

14:21 UTC

Budget alignment plus tail-latency reduction clears the incident

Signal: The slow span is reduced, internal retries are bounded, and the same path completes comfortably inside the revised deadline ladder.

Why it matters: That confirms the real fix lived in budget design and tail latency, not in broad timeout inflation.

Seen in Production

Latency spike after rollout pushes p99 beyond static client deadline

Frequency: common

Example: Read calls that usually finish in 400 ms now take 2 s and exceed the 1 s deadline.

Fix: Tune the deadline to a realistic latency percentile and optimize the slow downstream dependency path.

Cold starts and connection warm-up consume most of the budget

Frequency: common

Example: The service spends much of a 2-second deadline on startup, DNS, and TLS before it reaches the dependency that is already marginally slow.

Fix: Reduce warm-up overhead, pre-establish connections where safe, and give realistic deadlines to cold paths.

Timed-out mutation completes server-side and retry creates duplicate effects

Frequency: medium

Example: The client retries a write after a timeout, but the original request committed moments later.

Fix: Add an idempotency token and a reconciliation check before retrying timed-out mutations.

Internal retries consume the full outer deadline

Frequency: medium

Example: One service retries a slow dependency three times, leaving no remaining budget for the caller-facing response.

Fix: Bound internal retries and propagate one shared deadline budget across the call chain.
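A bounded retry loop that draws every attempt from one shared budget might look like the sketch below. It assumes the wrapped operation is idempotent and that it raises `TimeoutError` when its slice runs out; `call_with_budget`, `FakeClock`, and `flaky_dependency` are hypothetical names used only for illustration.

```python
import time


def monotonic_ms():
    """Current monotonic time in whole milliseconds."""
    return time.monotonic_ns() // 1_000_000


def call_with_budget(attempt, budget_ms, per_try_ms, max_attempts,
                     reserve_ms=0, clock=monotonic_ms):
    """Retry attempt(timeout_ms) inside one shared budget.

    Intended only for operations that are safe to repeat.
    """
    start = clock()
    tries = 0
    while tries < max_attempts:
        remaining = budget_ms - (clock() - start) - reserve_ms
        if remaining <= 0:
            break  # no budget left for another attempt
        tries += 1
        try:
            return attempt(min(per_try_ms, remaining)), tries
        except TimeoutError:
            continue
    raise TimeoutError(f"budget exhausted after {tries} attempt(s)")


class FakeClock:
    """Deterministic clock so the example does not depend on wall time."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        return self.now


clock = FakeClock()
timeouts_seen = []


def flaky_dependency(timeout_ms):
    """Times out twice, using its whole slice, then answers in 50 ms."""
    timeouts_seen.append(timeout_ms)
    if len(timeouts_seen) < 3:
        clock.now += timeout_ms
        raise TimeoutError
    clock.now += 50
    return "ok"


result, tries = call_with_budget(flaky_dependency, budget_ms=1000,
                                 per_try_ms=400, max_attempts=3,
                                 reserve_ms=100, clock=clock)
print(result, tries, timeouts_seen)
```

The third attempt receives a shorter timeout than the first two because the loop hands out only what is left after the reserve, so the retries cannot consume the time needed for the caller-facing response.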

Wrong Fix vs Better Fix

Raise deadlines blindly vs isolate the slow hop

Wrong fix: Increase caller deadlines everywhere because some requests almost finish.

Better fix: Identify which hop, queue, or cold-start phase is consuming most of the budget, then reduce that latency or redesign the path.

Why this is better: Longer deadlines can hide the symptom briefly, but they do not fix a hot span or queue path that is inherently too slow.

Retry timed-out writes vs reconcile first

Wrong fix: Automatically retry timed-out mutations with no idempotency key or completion check.

Better fix: Treat timed-out writes as potentially committed, add idempotency or request IDs, and reconcile before retrying.

Why this is better: DEADLINE_EXCEEDED does not prove the server did nothing. Safe recovery requires completion awareness.

Tune one boundary only vs propagate one budget end to end

Wrong fix: Change one client timeout and assume the rest of the call chain will line up automatically.

Better fix: Make deadlines monotonic across callers, gateways, and backends, and prove remaining budget at each hop.

Why this is better: Mismatched budgets simply move the timeout boundary around the stack instead of resolving it.
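A configuration check for the deadline ladder can be sketched in a few lines. It assumes "monotonic" means each inner hop has a strictly shorter timeout than its caller; the hop names and values are hypothetical and would come from real client, gateway, and backend settings.

```python
def ladder_violations(hops):
    """Find adjacent hops where the inner timeout is not shorter.

    hops is a list of (name, timeout_ms) ordered from the outermost
    caller to the innermost backend.
    """
    problems = []
    for (outer, outer_ms), (inner, inner_ms) in zip(hops, hops[1:]):
        if inner_ms >= outer_ms:
            problems.append((outer, inner))
    return problems


# Hypothetical ladder: the client gives up long before the gateway does.
hops = [("client", 1000), ("gateway", 5000), ("backend", 3000)]
violations = ladder_violations(hops)
print(violations)

# A ladder that shrinks at every hop reports no violations.
aligned = ladder_violations([("client", 3000), ("gateway", 2500),
                             ("backend", 2000)])
print(aligned)
```

Running a check like this against deployed configuration is one way to show that a change to a single timeout did not leave an inner hop able to outlive its caller.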

Debugging Tools

- Distributed tracing with hop-level latency
- Cloud Monitoring percentile latency charts
- Client timeout and retry telemetry
- Remaining-deadline annotations in logs or traces
- Idempotency/reconciliation audit logs

How to Verify the Fix

- Re-run critical request paths and confirm deadline-expired rates stay below SLO thresholds.
- Validate operations complete within updated budgets at normal, peak, and cold-start traffic windows.
- Confirm no duplicate side effects occur when retries follow timeout conditions.
- Verify traces now show healthy remaining deadline budget at the previously slow hop.

How to Prevent Recurrence

- Set method-specific deadlines from observed latency percentiles plus safety margins.
- Continuously monitor queueing, cold-start, and warm-up contributions to timeout risk.
- Enforce idempotency keys or reconciliation IDs on mutating operations that might be retried after timeouts.
- Expose remaining-deadline metrics or trace annotations at major service boundaries.

Pro Tip

- Emit a dedicated metric for timed_out_but_committed reconciliations to detect hidden duplicate-work risk.
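As a rough sketch, the metric can be derived from reconciliation records shaped like the trace evidence example on this page. The `timed_out_not_committed` counter name and the second record are assumptions added for illustration, and a real system would publish the counts to its metrics backend instead of printing them.

```python
from collections import Counter


def count_reconciliations(records):
    """Count timed-out calls by whether the server committed anyway."""
    counts = Counter()
    for record in records:
        if record.get("clientStatus") != "DEADLINE_EXCEEDED":
            continue  # only timed-out calls are ambiguous
        if record.get("serverCommitted"):
            counts["timed_out_but_committed"] += 1
        else:
            counts["timed_out_not_committed"] += 1
    return counts


records = [
    {"requestId": "req_write_19a", "clientStatus": "DEADLINE_EXCEEDED",
     "serverCommitted": True},
    # Hypothetical second record: timed out and never committed.
    {"requestId": "req_write_19b", "clientStatus": "DEADLINE_EXCEEDED",
     "serverCommitted": False},
    {"requestId": "req_read_01", "clientStatus": "OK"},
]
counts = count_reconciliations(records)
print(dict(counts))
```

A rising `timed_out_but_committed` count indicates that blind retries on that path would have produced duplicates.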

Decision Support

Playbook

API Timeout Playbook (502 / 504 / DEADLINE_EXCEEDED)

Use this playbook to separate invalid upstream responses from upstream wait expiration and deadline exhaustion, and apply timeout budgets, safe retries, and circuit-breaker controls safely.

Provider Context

This guidance is specific to GCP services. Always validate implementation details against official provider documentation before deploying to production.
