429 - Too Many Requests
HTTP 429 Too Many Requests means a limiter rejected this request because the caller exceeded a rate, quota, or burst policy for the active window.
Last reviewed: April 12, 2026 | Source-backed guidance under our editorial policy
What Does Too Many Requests Mean?
Treat 429 as a pacing signal, not a generic outage. The useful tuple is limit dimension, current burst, and retry policy: which bucket is exhausted, what traffic caused it, and how clients react after the throttle.
Common Causes
- Limiter keys by source IP or API key only, so many users behind one NAT or one integration credential exhaust a shared budget.
- Clients retry with a fixed delay or no jitter, creating synchronized retry storms that arrive faster than the bucket refills.
- Batch jobs and interactive traffic share the same quota bucket, causing predictable top-of-hour spikes and sustained 429s.
- Webhook replay, mobile offline sync, or worker fan-out floods the same route faster than the token bucket can refill.
- Client-side throttling is missing, so every worker independently assumes quota is available and amplifies burst traffic.
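To make the refill-versus-arrival relationship behind these causes concrete, here is a minimal token bucket sketch. The class name `TokenBucket`, its parameters, and the injected clock are illustrative assumptions, not any specific provider's limiter.

```javascript
// Minimal token bucket sketch (illustrative; real limiters differ in detail).
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock so behavior can be tested without sleeping
    this.tokens = capacity;
    this.lastRefillMs = now();
  }

  // Returns true if the request is admitted, false if it would be answered with 429.
  tryConsume() {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage: a burst of 5 requests against a bucket of 3 that refills 1 token per second.
let fakeNowMs = 0;
const bucket = new TokenBucket({ capacity: 3, refillPerSecond: 1, now: () => fakeNowMs });
const burst = [1, 2, 3, 4, 5].map(() => bucket.tryConsume());
fakeNowMs += 1000; // one second later, one token has refilled
const afterRefill = bucket.tryConsume();
```

In this sketch the first three requests of the burst are admitted and the last two are rejected; one more request is admitted only after a second has passed. Any client that sends faster than `refillPerSecond` for long enough will see the same pattern.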
How to Fix Too Many Requests
1. Capture Retry-After, rate-limit headers, request dimension, and route before changing retry logic.
2. Immediately reduce concurrency and switch clients to bounded exponential backoff with jitter and strict retry budgets.
3. Separate background and interactive traffic if they share one principal, key, or tenant budget.
4. Replay with capped concurrency and verify the same route stays below the limit for the active window.
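The "capped concurrency" in the steps above can be sketched as a small worker-pool helper. The function name `runWithConcurrency` and the task shape (an array of functions returning promises) are assumptions made for illustration.

```javascript
// Run async tasks with at most `limit` in flight at once (hypothetical helper).
// `tasks` is an array of zero-argument functions that each return a promise.
async function runWithConcurrency(tasks, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array(tasks.length);
  let nextIndex = 0;

  // Each worker pulls the next unclaimed task until none remain.
  async function worker() {
    while (nextIndex < tasks.length) {
      const index = nextIndex;
      nextIndex += 1;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A replay could then wrap each request in a function and call `runWithConcurrency(requests, 4)`, lowering the limit until the route stops returning 429. The helper only bounds parallelism; it does not pace requests over time, so it complements backoff rather than replacing it.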
Step-by-Step Diagnosis for Too Many Requests
1. Capture throttling by dimension: principal, API key, IP, tenant, route, worker pool, and time window.
2. Inspect Retry-After plus any RateLimit-* or provider-specific headers to identify reset semantics and effective bucket size.
3. Trace retry logic for synchronized retries, missing jitter, unbounded retries, and top-of-minute worker fan-out.
4. Differentiate edge-level throttling from origin-level throttling so the correct limiter owner debugs the incident.
5. Retest with capped concurrency, client-side pacing, and queue-based smoothing for background jobs.
6. Verify whether one noisy tenant or one shared credential is starving unrelated traffic inside the same quota dimension.
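For the header inspection step, a small parser helps because Retry-After may carry either a number of seconds or an HTTP-date. The sketch below is one possible shape; the function names are hypothetical, and treating RateLimit-Reset as a number of seconds is an assumption (some providers send an epoch timestamp in similarly named headers).

```javascript
// Parse Retry-After, which may be delta-seconds ("30") or an HTTP-date.
// Returns a delay in milliseconds, or null if the header is missing or unparseable.
function parseRetryAfterMs(value, nowMs = Date.now()) {
  if (value == null || value.trim() === '') return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Collect throttle-related headers from any object with a get(name) method,
// such as the Headers object on a fetch response.
function readThrottleHeaders(headers, nowMs = Date.now()) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    if (raw == null || raw.trim() === '') return null;
    const parsed = Number(raw);
    return Number.isFinite(parsed) ? parsed : null;
  };
  return {
    retryAfterMs: parseRetryAfterMs(headers.get('retry-after'), nowMs),
    limit: toNumber('ratelimit-limit'),
    remaining: toNumber('ratelimit-remaining'),
    resetSeconds: toNumber('ratelimit-reset'), // assumed to be seconds until reset
  };
}
```

Logging the returned object next to the route and principal for each 429 gives the per-dimension view that the first diagnosis step asks for.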
Seen in Production
- API responds with Retry-After: 30 and RateLimit-Remaining: 0 after a batch job starts 200 workers at the top of the hour.
- One customer behind a corporate NAT exhausts a shared per-IP limit, while the same requests succeed from residential or mobile networks.
- A transient dependency timeout causes every client to retry at exactly one second, creating a second-wave 429 storm after the original blip is gone.
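The synchronized one-second retry in the last item is what jitter is meant to break up. Below is a minimal sketch of a randomized exponential backoff, a variant often called full jitter; the function name, defaults, and injectable `random` parameter are illustrative assumptions.

```javascript
// Full-jitter backoff sketch: pick a random delay in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the behavior can be checked deterministically.
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceilingMs = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceilingMs);
}

// Ten clients retrying after the same failure draw independent delays
// instead of all firing at exactly one second.
const delays = Array.from({ length: 10 }, () => backoffDelayMs(1));
```

Because each client draws its own delay below the growing ceiling, recovery traffic spreads across the window rather than arriving in a single wave. When the server sends Retry-After, a client would typically wait at least that long and add the jittered delay on top.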
Retry Budget and Limit Dimension
- Identify the real saturated bucket first: per-IP, per-user, per-tenant, per-route, per-token, or global origin limiter.
- Compare the server limiter dimension to the client pacing dimension, because client-side throttling keyed differently from the server often fails to prevent bursts.
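A retry budget can be sketched as a credit counter: first attempts earn credit and each retry spends it, so retries stay a bounded fraction of normal traffic. The class below is a hypothetical illustration; its name, option names, and default values are assumptions, and one instance would be kept per saturated bucket (for example, per tenant) so the client side is keyed like the server.

```javascript
// Retry budget sketch: at most one retry per `requestsPerRetry` first attempts.
// Integer credits are used to avoid floating-point drift.
class RetryBudget {
  constructor({ requestsPerRetry = 10, maxBankedRetries = 5 } = {}) {
    this.costPerRetry = requestsPerRetry;
    this.maxCredits = requestsPerRetry * maxBankedRetries; // cap on saved-up retries
    this.credits = 0;
  }

  // Call once for every first attempt; each earns one credit.
  recordRequest() {
    this.credits = Math.min(this.maxCredits, this.credits + 1);
  }

  // Call before every retry; returns false when the budget is spent.
  tryRetry() {
    if (this.credits < this.costPerRetry) return false;
    this.credits -= this.costPerRetry;
    return true;
  }
}

// Usage: 25 first attempts fund two retries; the third is refused.
const budget = new RetryBudget({ requestsPerRetry: 10 });
for (let i = 0; i < 25; i += 1) budget.recordRequest();
const decisions = [budget.tryRetry(), budget.tryRetry(), budget.tryRetry()];
```

When `tryRetry()` returns false, the caller would surface the 429 or queue the work instead of retrying, which keeps a throttled fleet from multiplying its own load.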
Wrong Fix to Avoid
- Do not simply increase retry count when 429 appears; that usually intensifies the throttle event.
- Do not raise every quota globally until you know whether one route, one tenant, or one buggy worker pool is the real source of the burst.
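A quick worst-case calculation shows why a higher retry count makes throttling worse. The helper name is hypothetical and the model is deliberately simple: it assumes every attempt is rejected, so each client sends its first request plus every allowed retry.

```javascript
// Worst case: every attempt is throttled, so each client sends 1 + maxRetries requests.
function worstCaseAttempts(clients, maxRetries) {
  return clients * (1 + maxRetries);
}

const withThreeRetries = worstCaseAttempts(200, 3); // 200 * 4 = 800
const withEightRetries = worstCaseAttempts(200, 8); // 200 * 9 = 1800
```

Under this assumption, raising the cap from 3 to 8 retries for 200 workers more than doubles the load sent to an already exhausted bucket.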
Implementation Examples
GET /v1/reports/heavy HTTP/1.1
Host: api.example.com

HTTP/1.1 429 Too Many Requests
Retry-After: 30
RateLimit-Limit: 120
RateLimit-Remaining: 0
RateLimit-Reset: 30
Content-Type: application/json

{
  "error": "Too Many Requests",
  "bucket": "tenant:acme:reports",
  "requestId": "req_31dc90"
}

// Top-level await requires an ES module; otherwise wrap the loop in an async function.
const baseDelayMs = 1000;
const maxRetries = 5;
for (let attempt = 0; attempt < maxRetries; attempt += 1) {
  const response = await fetch('https://api.example.com/v1/reports/heavy');
  if (response.status !== 429) break;
  // Retry-After may be an HTTP-date instead of seconds; Number() then yields NaN,
  // so fall back to 0 and let the local backoff decide the wait.
  const retryAfterSeconds = Number(response.headers.get('retry-after') ?? 0);
  const retryAfter = Number.isFinite(retryAfterSeconds) ? retryAfterSeconds * 1000 : 0;
  const jitter = Math.floor(Math.random() * 250);
  const backoff = Math.min(baseDelayMs * 2 ** attempt, 30000);
  // Wait for the longer of the server hint and the local backoff, plus jitter.
  await new Promise((resolve) => setTimeout(resolve, Math.max(retryAfter, backoff) + jitter));
}

Seen in Production
Background workers saturate a shared tenant budget
Frequency: common
Example: Batch export spins up many parallel requests and consumes the same tenant-level budget used by the UI, producing sustained 429s.
Fix: Queue bulk work with adaptive concurrency and respect Retry-After across every worker sharing that tenant budget.
Retry storm amplifies a brief dependency blip
Frequency: common
Example: Clients retry with fixed one-second delay after a transient network incident, causing a second-wave 429 storm after the original error clears.
Fix: Use exponential backoff with jitter and shared retry budgets so recovery traffic does not synchronize.
Shared NAT or integration key makes unrelated users throttle together
Frequency: medium
Example: Many users behind one egress IP or one partner API key hit the same per-principal bucket and all receive 429 during busy windows.
Fix: Re-key the limiter to a fairer principal where possible or separate traffic classes before increasing quotas.
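Re-keying the limiter, as the last fix suggests, could look like the sketch below: prefer the most specific principal available and include the traffic class so bulk and interactive work land in separate buckets. The function name, field names, and key format are assumptions for illustration, not a real limiter's API.

```javascript
// Build a limiter key from the most specific principal available (hypothetical helper).
// Falls back from user to API key to source IP, and always includes the traffic class.
function limiterKey({ userId, apiKeyId, ip, trafficClass = 'interactive' }) {
  let principal;
  if (userId) principal = `user:${userId}`;
  else if (apiKeyId) principal = `key:${apiKeyId}`;
  else principal = `ip:${ip}`;
  return `${principal}:${trafficClass}`;
}

// Two users behind the same NAT address no longer share a bucket.
const keyA = limiterKey({ userId: 'u1', ip: '203.0.113.7' });
const keyB = limiterKey({ userId: 'u2', ip: '203.0.113.7' });
// Batch work from the same user is isolated from interactive work.
const keyBatch = limiterKey({ userId: 'u1', ip: '203.0.113.7', trafficClass: 'batch' });
```

Per-IP limiting still applies as the last fallback for anonymous traffic, where no better principal exists.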
Debugging Tools
- Rate-limit header telemetry (Retry-After, RateLimit-*, provider-specific resets)
- Per-principal and per-route request-rate dashboards
- Client retry trace logs with attempt counts and jitter timings
- Load-test or replay tools for burst simulation
- Queue depth and worker concurrency dashboards
How to Verify the Fix
- Run a controlled replay or load test and confirm the target traffic profile no longer produces 429 on the same route.
- Validate that clients honor Retry-After, cap retries, and spread retries with jitter instead of synchronized bursts.
- Monitor request rate, limiter headroom, and queue depth to confirm the system keeps a steady margin below enforcement.
- Check that background traffic is isolated enough that interactive traffic is not starved during spikes.
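The client-behavior check above can be partly automated against retry trace logs. The sketch below assumes a hypothetical log shape in which each entry records the attempt number, how long the client waited before that attempt, and the Retry-After value (in milliseconds) from the previous response; real logs will need mapping into this shape.

```javascript
// Audit one request's retry trace against the retry policy (hypothetical log shape).
// Each entry: { attempt, waitedMs, retryAfterMs }; attempt 0 is the first try.
function auditRetryTrace(trace, { maxRetries }) {
  const problems = [];
  const retries = trace.filter((entry) => entry.attempt > 0);
  if (retries.length > maxRetries) {
    problems.push(`retries ${retries.length} exceed cap ${maxRetries}`);
  }
  for (const entry of retries) {
    if (entry.retryAfterMs != null && entry.waitedMs < entry.retryAfterMs) {
      problems.push(`attempt ${entry.attempt} waited ${entry.waitedMs}ms, less than Retry-After ${entry.retryAfterMs}ms`);
    }
  }
  return problems;
}

// Usage: the second retry ignored a 30-second Retry-After.
const findings = auditRetryTrace(
  [
    { attempt: 0, waitedMs: 0, retryAfterMs: null },
    { attempt: 1, waitedMs: 30120, retryAfterMs: 30000 },
    { attempt: 2, waitedMs: 1000, retryAfterMs: 30000 },
  ],
  { maxRetries: 5 },
);
```

An empty result means the trace respected both the retry cap and every Retry-After hint. Checking that delays are spread out across clients would need traces from many requests, which this sketch does not cover.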
How to Prevent Recurrence
- Standardize retry behavior with bounded exponential backoff, jitter, idempotency, and explicit retry budgets.
- Alert before quota exhaustion by tracking the same limiter dimensions the server enforces.
- Use queues, adaptive concurrency, and traffic class isolation to smooth bulk traffic away from user-facing flows.
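Alerting before exhaustion can start from the limit and remaining values the server already returns. The helper below is a sketch under assumptions: the function name, the input shape, and the 20% warning threshold are all illustrative choices to tune per bucket.

```javascript
// Compute remaining headroom for one bucket and decide whether to warn (hypothetical helper).
// `limit` and `remaining` would come from RateLimit-Limit and RateLimit-Remaining.
function limiterHeadroom({ limit, remaining }, warnBelowRatio = 0.2) {
  if (!(limit > 0) || !(remaining >= 0)) {
    return { ratio: null, shouldWarn: false }; // headers missing or malformed
  }
  const ratio = Math.min(remaining, limit) / limit;
  return { ratio, shouldWarn: ratio < warnBelowRatio };
}

// Usage: 12 of 120 requests left in the window is below the 20% threshold.
const nearLimit = limiterHeadroom({ limit: 120, remaining: 12 });
const comfortable = limiterHeadroom({ limit: 120, remaining: 60 });
```

Emitting this ratio as a metric tagged with the bucket name lets an alert fire while requests are still succeeding, rather than after 429s begin.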
Pro Tip
- Implement distributed client-side pacing keyed by the same principal dimension as the server limiter so one noisy worker pool cannot amplify bursts across the fleet.
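The keying idea in this tip can be sketched with a pacer that hands out send slots per limiter key. This version is in-process only, and the class and method names are hypothetical; a truly distributed pacer would keep the slot table in a store shared by all workers, which is outside the scope of this sketch.

```javascript
// Keyed pacer sketch: spaces requests for the same key at least minIntervalMs apart.
class KeyedPacer {
  constructor({ minIntervalMs, now = () => Date.now() }) {
    this.minIntervalMs = minIntervalMs;
    this.now = now; // injectable clock for testing
    this.nextSlotMs = new Map(); // key -> earliest time the next request may be sent
  }

  // Reserve the next send slot for this key; returns how long the caller should wait.
  reserve(key) {
    const nowMs = this.now();
    const slotMs = Math.max(nowMs, this.nextSlotMs.get(key) ?? 0);
    this.nextSlotMs.set(key, slotMs + this.minIntervalMs);
    return slotMs - nowMs;
  }
}

// Usage: three requests for one tenant bucket are spaced 500 ms apart,
// while a different bucket is not delayed.
const pacer = new KeyedPacer({ minIntervalMs: 500, now: () => 1000 });
const waits = [
  pacer.reserve('tenant:acme:reports'),
  pacer.reserve('tenant:acme:reports'),
  pacer.reserve('tenant:acme:reports'),
];
const otherWait = pacer.reserve('tenant:other:reports');
```

A worker would sleep for the returned number of milliseconds before sending. Using the same key the server uses for its bucket is the point of the tip: pacing per worker or per process would still let the fleet as a whole exceed the shared limit.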
Decision Support
Compare Guide
429 Too Many Requests vs 503 Service Unavailable
Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.
Playbook
Rate Limit Recovery Playbook (429 / ThrottlingException / RESOURCE_EXHAUSTED)
Use this playbook to separate transient throttling from hard quota exhaustion and apply retry, traffic-shaping, and quota-capacity fixes safely.
Provider Context
This guidance is specific to HTTP services. Always validate implementation details against official provider documentation before deploying to production.