GCP

UNAVAILABLE

GCP UNAVAILABLE means the service is temporarily unable to handle the request; safe clients should retry with backoff.

Last reviewed: February 12, 2026 | Editorial standard: source-backed technical guidance

What Does Unavailable Mean?

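UNAVAILABLE (gRPC status code 14, which Google APIs map to HTTP 503) signals that service availability is temporarily degraded. Calls fail until backend capacity, dependency health, or network path stability recovers, so the condition is usually transient and worth retrying.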

Common Causes

- A transient outage or overload affects the target service path.
- An upstream dependency is unreachable from the serving backend.
- Network path instability causes temporary call failures.
- Client retries without backoff amplify load during degradation.

How to Fix Unavailable

1. Retry idempotent operations with exponential backoff and full jitter.
2. Honor server retry hints and cap total retry duration.
3. Align timeout and retry policies across client, gateway, and backend.
4. Monitor health signals and fail over where multi-region options exist.
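The first two steps above can be sketched as a small retry wrapper. This is a minimal illustration, not a definitive implementation: `UnavailableError` and its `retry_after_s` field are hypothetical stand-ins for whatever exception and retry hint your client library exposes, and the default limits are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class UnavailableError(Exception):
    """Hypothetical stand-in for a client library's UNAVAILABLE error."""

    def __init__(self, retry_after_s: Optional[float] = None):
        super().__init__("UNAVAILABLE")
        self.retry_after_s = retry_after_s  # server retry hint, if any


def full_jitter_delay(attempt: int, base_s: float, cap_s: float,
                      rng: random.Random) -> float:
    """Full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))


def call_with_retry(
    op: Callable[[], T],
    *,
    max_attempts: int = 5,          # assumed default
    base_s: float = 0.1,            # assumed default
    cap_s: float = 5.0,             # assumed default
    total_budget_s: float = 30.0,   # cap on total retry duration
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
    rng: Optional[random.Random] = None,
) -> T:
    """Call an idempotent operation, retrying only on UnavailableError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    deadline = clock() + total_budget_s
    for attempt in range(max_attempts):
        try:
            return op()
        except UnavailableError as err:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted
            delay = full_jitter_delay(attempt, base_s, cap_s, rng)
            if err.retry_after_s is not None:
                # Never retry sooner than the server asked.
                delay = max(delay, err.retry_after_s)
            if clock() + delay > deadline:
                raise  # the next wait would exceed the total budget
            sleep(delay)
    raise AssertionError("unreachable")
```

Only pass operations that are safe to repeat. Injecting `sleep`, `clock`, and `rng` keeps the policy testable without real waiting.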

Step-by-Step Diagnosis for Unavailable

1. Capture failure samples with endpoint, region, retry count, and latency per hop.
2. Inspect DNS resolution, TLS handshakes, proxy behavior, and connection pool saturation.
3. Differentiate transient service outage from client-caused retry amplification.
4. Re-test with controlled concurrency and bounded retry policy to validate stabilization.

Transient Failure Surface Mapping

- Correlate UNAVAILABLE errors with region and backend endpoint health (example: one zone shows elevated UNAVAILABLE while others remain healthy).
- Check for dependency chain failures and upstream saturation signals (example: a datastore outage causes cascading unavailability in the services that depend on it).
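The zone correlation described above can be approximated by grouping captured failure samples. The sketch below assumes each sample has already been reduced to a hypothetical `(zone, status)` pair. The 5% threshold is an arbitrary assumption, not a recommended value.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def unavailable_rate_by_zone(
    samples: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the fraction of calls per zone that ended in UNAVAILABLE."""
    totals: Counter = Counter()
    failures: Counter = Counter()
    for zone, status in samples:
        totals[zone] += 1
        if status == "UNAVAILABLE":
            failures[zone] += 1
    return {zone: failures[zone] / totals[zone] for zone in totals}


def outlier_zones(rates: Dict[str, float], threshold: float = 0.05) -> List[str]:
    """List zones whose UNAVAILABLE rate exceeds the (assumed) threshold."""
    return sorted(zone for zone, rate in rates.items() if rate > threshold)
```

If a single zone stands out, the problem is more likely localized. Elevated rates across every zone point toward a shared dependency or a client-side cause.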

Retry Safety and Traffic Control

- Apply idempotency-aware retry rules (example: read calls retry automatically, while mutating calls require an idempotency key).
- Throttle recovery traffic to avoid thundering herd effects (example: a worker fleet resumes with randomized startup delays).
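Both rules above are small enough to sketch directly. The `Request` type, the `READ_METHODS` set, and the helper names are hypothetical and exist only for illustration. How an idempotency key is actually transmitted and honored depends on the target API.

```python
import random
import uuid
from dataclasses import dataclass, replace
from typing import Optional

# Assumption: these methods are read-only in the target API.
READ_METHODS = frozenset({"GET", "HEAD"})


@dataclass(frozen=True)
class Request:
    """Hypothetical request descriptor used only for this sketch."""
    method: str
    idempotency_key: Optional[str] = None


def is_retryable(req: Request) -> bool:
    """Reads retry freely; mutations retry only with an idempotency key."""
    if req.method.upper() in READ_METHODS:
        return True
    return req.idempotency_key is not None


def with_idempotency_key(req: Request) -> Request:
    """Attach a fresh key once, before the first attempt, and reuse it."""
    return replace(req, idempotency_key=str(uuid.uuid4()))


def startup_delay(max_delay_s: float, rng: random.Random) -> float:
    """Randomized delay so recovering workers do not resume in lockstep."""
    return rng.uniform(0.0, max_delay_s)
```

The key must be generated once per logical operation and reused across retries. Generating a new key on each attempt would defeat deduplication on the server side.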

How to Verify the Fix

- Replay representative workloads and confirm the UNAVAILABLE rate returns within SLO thresholds.
- Validate that retries now converge without causing secondary latency or error spikes.
- Confirm multi-region or failover paths serve traffic successfully during simulated disruption.

How to Prevent Recurrence

- Standardize retry and timeout policy across clients, gateways, and background workers.
- Implement circuit breaking, backpressure, and graceful degradation for dependency outages.
- Continuously test failover and recovery playbooks with synthetic disruption drills.
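As one illustration of the circuit-breaking item above, here is a minimal sketch of a consecutive-failure circuit breaker. The class name, thresholds, and half-open behavior are assumptions chosen for brevity. The sketch is not thread-safe, and production libraries usually add rolling windows and per-error classification.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    """Minimal consecutive-failure breaker (illustrative sketch)."""

    def __init__(
        self,
        failure_threshold: int = 5,      # assumed default
        reset_timeout_s: float = 30.0,   # assumed default
        clock: Callable[[], float] = time.monotonic,
    ):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, op: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = op()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures,
            # (re)opens the circuit and restarts the timeout.
            if (self._opened_at is not None
                    or self._failures >= self.failure_threshold):
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers can serve a cached or degraded response instead of waiting on a dependency that is known to be failing.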

Pro Tip

- Cap maximum concurrent retries per caller cohort to prevent retry storms during partial outages.
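One possible way to enforce such a cap is an in-process counter keyed by cohort, sketched below. `RetryLimiter` and the notion of a cohort string are hypothetical. A fleet-wide cap would need shared state, which this sketch does not attempt.

```python
import threading
from collections import defaultdict
from typing import DefaultDict


class RetryLimiter:
    """Caps in-flight retries per caller cohort (illustrative sketch)."""

    def __init__(self, max_concurrent_retries: int):
        self._max = max_concurrent_retries
        self._lock = threading.Lock()
        self._in_flight: DefaultDict[str, int] = defaultdict(int)

    def try_acquire(self, cohort: str) -> bool:
        """Reserve a retry slot; return False if the cohort is at its cap."""
        with self._lock:
            if self._in_flight[cohort] >= self._max:
                return False
            self._in_flight[cohort] += 1
            return True

    def release(self, cohort: str) -> None:
        """Free a slot after the retry finishes, whatever its outcome."""
        with self._lock:
            if self._in_flight[cohort] > 0:
                self._in_flight[cohort] -= 1
```

A caller would check `try_acquire` before each retry, give up or surface the error when it returns `False`, and call `release` in a `finally` block once the retry completes.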

Decision Support

Compare Guide

429 Too Many Requests vs 503 Service Unavailable

Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.

Compare Guide

500 Internal Server Error vs 502 Bad Gateway: Root Cause

Debug 500 vs 502 faster: use 500 for origin failures and 502 for invalid upstream responses at gateways, then route incidents to the right team.

Playbook

Availability and Dependency Playbook (500 / 503 / ServiceUnavailable)

Use this playbook to separate origin-side 500 failures from temporary 503 dependency or capacity outages, then apply safe retry and escalation paths.

Playbook

API Timeout Playbook (502 / 504 / DEADLINE_EXCEEDED)

Use this playbook to separate invalid upstream responses from upstream wait expiration and deadline exhaustion, and apply timeout budgets, safe retries, and circuit-breaker controls safely.

Official References

Provider Context

This guidance is specific to GCP services. Always validate implementation details against official provider documentation before deploying to production.
