429 Too Many Requests vs 503 Service Unavailable

Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.

Last reviewed: March 3, 2026 | Editorial standard: source-backed comparison guidance

Quick Decision

- Return 429 when a specific client, tenant, token, or route exceeds an enforced request-rate or concurrency policy.
- Return 503 when the service is broadly unable to handle requests due to overload, maintenance, or dependency instability.
- `Retry-After` can appear on both, but 429 implies caller-level pressure while 503 implies service-level degradation.
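The decision above can be sketched as a small function. This is only an illustration: `RequestContext`, its two flags, and `choose_status` are hypothetical names, and letting the service-wide signal take precedence over a per-caller limit is an assumption drawn from the misclassification guidance in this article, not a rule from any specification.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    # Hypothetical inputs: a limiter verdict for this caller's bucket and
    # a global health flag (overload, maintenance, dependency failure).
    caller_over_limit: bool
    service_degraded: bool


def choose_status(ctx: RequestContext) -> int:
    """Return 503, 429, or 200 (meaning: proceed with the request)."""
    # Assumption: a service-wide problem wins, so a throttled caller
    # still sees the outage signal during an incident.
    if ctx.service_degraded:
        return 503
    if ctx.caller_over_limit:
        return 429
    return 200
```

Keeping the two inputs separate also makes it easier to log which condition triggered the rejection.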

Key Differences

- 429 is primarily an admission-control/rate-governance signal, often scoped by identity or quota dimension.
- 503 is an availability signal indicating temporary inability to serve requests at platform/service level.
- 429 remediation focuses on pacing, backoff, and quota policy; 503 remediation focuses on capacity recovery and dependency health.
- Misclassifying 503 as 429 can hide platform incidents, while misclassifying 429 as 503 can hide abusive/bursty clients.

When To Use 429 Too Many Requests

- A rate limiter or quota guard rejects a specific principal, API key, tenant, or route budget.
- Unaffected callers continue succeeding while one cohort breaches its policy window.
- The service is otherwise healthy and is intentionally enforcing fairness/protection controls.

When To Use 503 Service Unavailable

- The service cannot process requests broadly because of maintenance, saturation, or cascading dependency failure.
- Availability degradation affects many callers, not just one policy bucket.
- Readiness/health probes and system telemetry indicate temporary service instability.

Step-by-Step Diagnosis

1. Segment errors by principal/tenant/route to verify whether failures are isolated to specific throttling buckets.
2. Check global service health signals (CPU, queue depth, saturation, upstream dependency state) for broad unavailability.
3. Inspect limiter decisions and quota counters around the incident window to confirm intentional throttling behavior.
4. Validate `Retry-After` values and client retry compliance to avoid synchronized retry amplification.
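Step 1 can be approximated with a quick concentration check over error logs. The sketch below assumes each rejected request has already been reduced to a tenant identifier; the function names and the 0.8 threshold are illustrative choices, not recommended values.

```python
from collections import Counter


def error_concentration(error_tenants):
    """Return (tenant, share) for the tenant with the most errors."""
    counts = Counter(error_tenants)
    tenant, n = counts.most_common(1)[0]
    return tenant, n / len(error_tenants)


def looks_isolated(error_tenants, threshold=0.8):
    """True when one tenant accounts for most errors (hypothetical threshold)."""
    if not error_tenants:
        return False
    _, share = error_concentration(error_tenants)
    return share >= threshold
```

A high share for one tenant points toward throttling (429 territory); errors spread evenly across tenants point toward a service-level problem that steps 2 and 3 should confirm.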

Common Misclassifications

- Returning 503 for single-tenant/requester overuse, which hides quota/rate policy root cause.
- Returning 429 during true platform outages, which misleads clients into self-throttling instead of outage handling.
- Ignoring `Retry-After` guidance and causing synchronized retries that worsen outage conditions.

Response Examples

429 for principal-scoped throttling

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60

{
  "error": "too_many_requests",
  "message": "Rate limit exceeded for api_key_123.",
  "limit_scope": "api_key"
}

503 for temporary service-level unavailability

HTTP/1.1 503 Service Unavailable
Content-Type: application/json
Retry-After: 120

{
  "error": "service_unavailable",
  "message": "Service is temporarily unavailable due to maintenance.",
  "status_page": "https://status.example.com"
}
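Both examples send `Retry-After` as a number of seconds, but the header may also carry an HTTP-date, so clients usually need to handle both forms. The following is a minimal client-side sketch using only the Python standard library; `retry_after_seconds` is a hypothetical helper name, and returning `None` for unparseable values is an assumption about how a client might want to fall back.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value to seconds; None if unparseable."""
    value = value.strip()
    # Form 1: delay in whole seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```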

Pro Tip

- Create separate SLO alerts for 429 and 503; ownership should route 429 to traffic-governance controls and 503 to reliability/platform incident response.

Frequently Asked Questions

If traffic is high, should I always return 503?

No. Use 429 when a caller cohort exceeds policy limits. Use 503 when the service is broadly unavailable independent of one caller identity.

Should both 429 and 503 include `Retry-After`?

They can. Include `Retry-After` when you can provide a meaningful retry window, and ensure clients treat it as the minimum time to wait before retrying, not as a deadline to retry within.

Are automatic retries safe after 429 or 503?

Only with bounded, jittered retries and idempotent operations. Unbounded retries on either status can amplify load and prolong incidents.
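A bounded, jittered retry loop might look like the sketch below. It assumes a caller-supplied `send()` that returns a `(status, retry_after_seconds)` pair and that the operation is idempotent; the function names, attempt limit, base delay, and cap are all illustrative. Jitter is added on top of any `Retry-After` value so that clients given the same delay do not all retry at the same instant.

```python
import random
import time

RETRYABLE = {429, 503}


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Server-provided minimum wait plus full-jitter exponential backoff."""
    jitter = rng() * min(cap, base * (2 ** attempt))
    return (retry_after or 0.0) + jitter


def call_with_retries(send, max_attempts=4, sleep=time.sleep):
    """Call send() up to max_attempts times; return the last status seen."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, retry_after = send()
        # Stop on any non-retryable status or when the attempt budget is spent.
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status
        sleep(backoff_delay(attempt, retry_after))
```

Passing `sleep` and `rng` as parameters keeps the loop testable without real waiting.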