ThrottlingException
AWS ThrottlingException means the service rejected a request because the request rate or concurrency exceeded allowed limits. Depending on the service and protocol, it is commonly returned as HTTP 400 or HTTP 429.
Last reviewed: April 30, 2026 | Source-backed guidance under our editorial policy
This page is part of the Error Reference library.
What Does ThrottlingException Mean?
The service is actively protecting itself from request pressure, so calls are rate-limited until client demand, retry behavior, and quota headroom return to a sustainable level.
Common Causes
- Burst concurrency exceeds per-operation or per-account request quotas.
- Retry logic amplifies load because backoff and jitter are missing or retries fire too aggressively.
- Traffic concentrates on hot resources or partitions, saturating localized limits.
- Planned traffic growth outpaced approved quota increases or regional capacity settings.
How to Fix ThrottlingException
1. Implement exponential backoff with full jitter and cap total retry attempts.
2. Limit client concurrency at the caller to smooth request bursts.
3. Honor service guidance such as Retry-After headers when provided.
4. Request quota increases for sustained demand above current service limits.
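The first and third steps above can be sketched in a few lines of Python. This is a minimal illustration, not an SDK implementation: `ThrottledError`, `full_jitter_delay`, and `call_with_backoff` are hypothetical names, and the `base`, `cap`, and `max_attempts` defaults are assumptions to tune per service. AWS SDKs ship their own retry handling, which is usually the better starting point.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by a client."""

    def __init__(self, retry_after=None):
        super().__init__("throttled")
        # Seconds the service asked the caller to wait, if it said so.
        self.retry_after = retry_after


def full_jitter_delay(attempt, base=0.1, cap=20.0, rng=random.random):
    """Draw a delay uniformly from [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(operation, max_attempts=5, base=0.1, cap=20.0,
                      sleep=time.sleep):
    """Call operation(), retrying on ThrottledError up to max_attempts tries."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted, surface the error
            delay = full_jitter_delay(attempt, base, cap)
            if err.retry_after is not None:
                # Never retry sooner than the service asked for.
                delay = max(delay, err.retry_after)
            sleep(delay)
```

Capping total attempts matters as much as the jitter: an unbounded retry loop keeps adding load exactly when the service is asking for less.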
Step-by-Step Diagnosis for ThrottlingException
1. Inspect CloudWatch metrics for throttled requests, latency, and burst concurrency.
2. Break down failures by API action, region, and principal to isolate bottlenecks.
3. Trace retry fan-out in clients and queues to identify self-induced traffic storms.
4. Correlate throttling spikes with deploys, backfills, and autoscaling transitions.
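As a rough illustration of the breakdown in step 2, the sketch below counts throttled calls per (action, region, principal) from a list of already-parsed event records. The record shape and field names (`error_code`, `action`, `region`, `principal`) are assumptions made for this example, not a real AWS log schema; map them to whatever your audit or client logs actually contain.

```python
from collections import Counter


def throttle_breakdown(events, keys=("action", "region", "principal")):
    """Count throttled events per key tuple, most frequent first.

    `events` is an iterable of dicts using hypothetical field names.
    """
    counts = Counter()
    for event in events:
        if event.get("error_code") == "ThrottlingException":
            counts[tuple(event.get(k) for k in keys)] += 1
    return counts.most_common()


# Usage with made-up records:
sample = [
    {"error_code": "ThrottlingException", "action": "GetItem",
     "region": "us-east-1", "principal": "worker-role"},
    {"error_code": "ThrottlingException", "action": "GetItem",
     "region": "us-east-1", "principal": "worker-role"},
    {"error_code": "ThrottlingException", "action": "PutItem",
     "region": "eu-west-1", "principal": "api-role"},
    {"error_code": None, "action": "GetItem",
     "region": "us-east-1", "principal": "worker-role"},
]
top = throttle_breakdown(sample)
```

If one tuple dominates the output, the bottleneck is likely a single caller or operation rather than account-wide demand.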
Demand and Burst Profiling
- Profile request burst shape by operation and principal (example: one queue consumer shard spikes GetItem at 10x baseline).
- Inspect partition-level hot spots and uneven key distribution (example: DynamoDB traffic concentrates on a small key range).
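One simple way to check for the uneven key distribution described above is to measure each key's share of sampled traffic. The helper below is a sketch under assumptions: `hot_keys` is a hypothetical name, the input is a plain list of partition keys sampled from request logs, and the 20% threshold is an arbitrary starting point.

```python
from collections import Counter


def hot_keys(partition_keys, share_threshold=0.2):
    """Return (key, share) pairs whose share of requests meets the threshold."""
    counts = Counter(partition_keys)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (key, n / total)
        for key, n in counts.most_common()
        if n / total >= share_threshold
    ]


# Usage: eight of ten sampled requests hit the same key.
sampled = ["tenant-a"] * 8 + ["tenant-b", "tenant-c"]
skewed = hot_keys(sampled)
```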
Retry and Backpressure Controls
- Audit retry fan-out and jitter quality in every client path (example: synchronized retries at 1s intervals create periodic throttle waves).
- Verify caller-side concurrency guards and queue drain limits (example: an autoscaler doubles workers without a per-worker API token bucket).
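The per-worker token bucket mentioned above can be sketched as follows. This is a minimal, single-threaded illustration: the `TokenBucket` class and its `rate` and `capacity` parameters are hypothetical names, and a production limiter would also need locking and some coordination across workers.

```python
import time


class TokenBucket:
    """Allow up to `capacity` calls in a burst, refilled at `rate` tokens/second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so tests can control time
        self.tokens = capacity        # start with a full bucket
        self.updated = clock()

    def try_acquire(self, cost=1.0):
        """Take `cost` tokens if available; return whether the call may proceed."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A worker would call `try_acquire()` before each API request and wait or requeue the work when it returns `False`, which keeps a sudden doubling of workers from translating directly into a doubling of request rate.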
Seen in Production
Batch job scales worker count too quickly after queue spike
Frequency: common
Example: Hundreds of workers call the same API operation simultaneously and trigger immediate throttling.
Fix: Add worker ramp-up controls, per-operation rate limits, and jittered retries.
Regional failover doubles traffic into one AWS region
Frequency: rare
Example: Disaster-recovery test shifts production load to a region with lower pre-approved quota headroom.
Fix: Pre-approve failover quotas and validate throttling behavior in game-day load tests.
Debugging Tools
- CloudWatch throttling metrics
- Service Quotas console
- Distributed trace retry analysis
- Load-test harness
How to Verify the Fix
- Confirm throttled request count drops and success rate stabilizes under expected load.
- Validate p95/p99 latency recovers without introducing queue backlogs.
- Re-run load tests to ensure request patterns stay within known quota headroom.
How to Prevent Recurrence
- Use adaptive rate limiting and token-bucket controls in all high-volume clients.
- Continuously monitor quota headroom and automatically open increase requests before saturation.
- Design retry-safe, idempotent write paths to avoid duplicate side effects under throttling.
Pro Tip
- Reserve a fixed percentage of quota headroom for incident traffic so failover and backfill events do not immediately saturate limits.
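The arithmetic behind that reservation is simple enough to automate. The sketch below assumes you already know the quota and the observed peak rate in the same unit (for example, requests per second); `quota_headroom` and the 25% reserve are illustrative choices, not AWS recommendations.

```python
def quota_headroom(peak_rate, quota, reserve_fraction=0.25):
    """Compare observed peak demand against the quota minus a fixed reserve."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    usable = quota * (1 - reserve_fraction)  # capacity left for normal traffic
    return {
        "usable": usable,
        "headroom": usable - peak_rate,      # negative means the reserve is being used
        "needs_increase": peak_rate > usable,
    }


# Usage: a 1000 req/s quota with a 25% reserve leaves 750 req/s for normal load.
status = quota_headroom(peak_rate=800, quota=1000)
```

Here `status["needs_increase"]` is true because the 800 req/s peak already eats into the reserved capacity, which is the signal to request an increase before an incident does.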
Decision Support
Compare Guide
AWS ThrottlingException vs GCP RESOURCE_EXHAUSTED
Compare AWS ThrottlingException and GCP RESOURCE_EXHAUSTED to separate rate limiting from quota/resource exhaustion and choose the remediation path.
Playbook
Rate Limit Recovery Playbook (429 / ThrottlingException / RESOURCE_EXHAUSTED)
Use this playbook to separate transient throttling from hard quota exhaustion and apply retry, traffic-shaping, and quota-capacity fixes safely.
Provider Context
This guidance is specific to AWS services. Always validate implementation details against official provider documentation before deploying to production.