Unavailable
AWS Unavailable means the service is temporarily unable to process requests. In AWS APIs, this error is returned with HTTP status 503.
Last reviewed: February 12, 2026 | Editorial standard: source-backed technical guidance
What Does Unavailable Mean?
When Unavailable is returned, the service is temporarily unable to process requests; operations fail until endpoint health recovers and retry or failover logic stabilizes traffic.
Common Causes
- Service-side maintenance, incident, or transient control-plane disruption.
- Regional dependency degradation causing intermittent request failures.
- Burst traffic combined with aggressive retries, which amplifies transient instability.
- Client-side networking or DNS path issues that mimic service unavailability.
How to Fix Unavailable
1. Retry with bounded exponential backoff, jitter, and strict retry budgets.
2. Reduce non-critical traffic while preserving critical workflows.
3. Check AWS Health and service-specific dashboards for ongoing incidents.
4. Fail over to an alternate region or path where the architecture supports it.
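Step 1 can be sketched as a retry loop with capped exponential backoff, full jitter, and a shared retry budget. This is a minimal illustration, not an AWS SDK API: `call_with_backoff` and the `retry_budget` dict are hypothetical names. For AWS SDK clients, prefer the SDK's built-in retry configuration (e.g. botocore's `Config(retries={"mode": "adaptive", "max_attempts": 5})`) over hand-rolled loops.

```python
import random
import time

def call_with_backoff(operation, max_attempts=5, base=0.2, cap=10.0,
                      retry_budget=None, sleep=time.sleep):
    """Retry `operation` on failure with capped exponential backoff and
    full jitter. `retry_budget`, if given, is a mutable counter dict
    shared across calls so clients stop retrying once the budget is spent."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            if retry_budget is not None:
                if retry_budget["remaining"] <= 0:
                    raise  # budget exhausted: stop amplifying the outage
                retry_budget["remaining"] -= 1
            # Full jitter: sleep uniformly in [0, min(cap, base * 2**attempt)]
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The `sleep` parameter is injected so the loop can be tested without real delays; the budget keeps a fleet of clients from turning one 503 burst into a retry storm.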
Step-by-Step Diagnosis for Unavailable
1. Map failures by region, endpoint, and operation to isolate the blast radius.
2. Correlate outages with deploy windows, traffic spikes, and network changes.
3. Inspect retry telemetry to detect retry storms or missing backpressure.
4. Validate client DNS/TLS/network health to separate service failures from transport failures.
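Step 3 reduces to a simple telemetry check: flag the time windows in which retries, rather than first attempts, dominate request volume. A minimal sketch, assuming your telemetry can be aggregated into per-window (first_attempts, retries) counts; the function name and data shape are illustrative.

```python
def retry_storm_windows(windows, threshold=0.5):
    """Flag time windows where retries dominate request volume.
    `windows` maps a window label to (first_attempts, retries); a window
    is flagged when retries exceed `threshold` of total requests."""
    flagged = []
    for label, (first, retries) in windows.items():
        total = first + retries
        if total and retries / total > threshold:
            flagged.append(label)
    return flagged
```

A healthy fleet shows retries as a small fraction of traffic; windows where retries exceed half of all requests usually indicate missing jitter or backpressure.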
Availability Checks
- Correlate failure windows with AWS Health events and service dashboards (example: temporary server-side failure periods produce clustered 503 responses).
- Inspect endpoint reachability, DNS resolution, and TLS handshake integrity from callers (example: transport-layer instability mimics service outages).
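The first check is an interval-overlap problem: a failure window is explained by a health event only if the two time ranges intersect. A minimal sketch with hypothetical function names; timestamps are ordinary `datetime` pairs.

```python
from datetime import datetime

def overlaps(failure_window, health_event):
    """Return True when a failure window overlaps a health event.
    Both arguments are (start, end) tuples of datetimes."""
    f_start, f_end = failure_window
    h_start, h_end = health_event
    return f_start <= h_end and h_start <= f_end

def correlate(failure_windows, health_events):
    """Map each failure window to the health events it overlaps."""
    return {fw: [he for he in health_events if overlaps(fw, he)]
            for fw in failure_windows}
```

Failure windows that overlap no reported event point toward client-side causes (DNS, TLS, routing) rather than a service incident.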
Resilience Path Validation
- Audit retry behavior for jitter and bounded budgets (example: synchronized retries create secondary waves during partial recovery).
- Verify failover and graceful-degradation paths for critical operations (example: fallback region or queue buffering keeps essential workflows alive).
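The synchronized-retry effect in the first point is easy to demonstrate: after a shared failure, clients without jitter all retry at the same instant, while full jitter spreads them across the backoff window. A small simulation sketch (names are illustrative):

```python
import random

def retry_times(n_clients, attempt, base=0.5, jitter=True, rng=None):
    """Compute when `n_clients` clients that failed simultaneously will
    retry after `attempt` prior failures. Without jitter every client
    retries at the same instant; full jitter spreads retries uniformly
    over the backoff window."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    delay = base * 2 ** attempt
    if not jitter:
        return [delay] * n_clients
    return [rng.uniform(0, delay) for _ in range(n_clients)]
```

The no-jitter case is what produces "secondary waves": every client hammers the recovering service in the same instant, re-triggering the outage.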
How to Verify the Fix
- Confirm success rates and latency return to normal baselines.
- Verify retries normalize and no longer dominate request volume.
- Run controlled failover tests to validate recovery automation.
How to Prevent Recurrence
- Build multi-region resilience for critical paths and state replication where feasible.
- Apply circuit breakers, adaptive throttling, and backpressure across clients.
- Practice incident and failover drills with production-like traffic patterns.
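The circuit-breaker pattern mentioned above can be sketched as a small state machine: closed while calls succeed, open (failing fast) after a run of consecutive failures, and half-open after a cooldown to let one trial call probe recovery. This is a minimal illustration with hypothetical names, not a production implementation.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after `failure_threshold` consecutive
    failures, rejects calls while open, and allows one trial call through
    after `reset_timeout` seconds (half-open)."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock  # injectable for testing
        self.failures = 0
        self.opened_at = None

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit
        return result
```

Failing fast while the breaker is open is what gives a degraded service breathing room to recover instead of absorbing a constant retry load.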
Pro Tip
- Define error-budget-aware retry caps so clients stop amplifying outages once service health drops below a threshold.
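An error-budget-aware retry cap can be sketched as a gate on the recently observed success rate: retries are permitted only while health stays above a threshold. A minimal illustration under assumed names (`ErrorBudget`, `may_retry` are hypothetical, and a production version would use a sliding window rather than all-time counts).

```python
class ErrorBudget:
    """Track observed success rate and gate retries on it: once service
    health drops below `min_success_rate`, stop retrying so clients do
    not amplify the outage."""

    def __init__(self, min_success_rate=0.8):
        self.min_success_rate = min_success_rate
        self.successes = 0
        self.failures = 0

    def record(self, ok):
        """Record the outcome of one request attempt."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1

    def may_retry(self):
        """Allow retries only while the success rate meets the threshold."""
        total = self.successes + self.failures
        if total == 0:
            return True  # no data yet: allow retries
        return self.successes / total >= self.min_success_rate
```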
Decision Support
Compare Guide
429 Too Many Requests vs 503 Service Unavailable
Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.
Compare Guide
500 Internal Server Error vs 502 Bad Gateway: Root Cause
Debug 500 vs 502 faster: use 500 for origin failures and 502 for invalid upstream responses at gateways, then route incidents to the right team.
Playbook
API Timeout Playbook (502 / 504 / DEADLINE_EXCEEDED)
Use this playbook to separate invalid upstream responses (502) from upstream wait expiration (504) and deadline exhaustion (DEADLINE_EXCEEDED), and to apply timeout budgets, safe retries, and circuit-breaker controls.
Playbook
Availability and Dependency Playbook (500 / 503 / ServiceUnavailable)
Use this playbook to separate origin-side 500 failures from temporary 503 dependency or capacity outages, then apply safe retry and escalation paths.
Official References
Provider Context
This guidance is specific to AWS services. Always validate implementation details against official provider documentation before deploying to production.