ThrottlingException
AWS returns a ThrottlingException when a client's request rate or concurrency exceeds a service's allowed limits. Depending on the service and protocol, this commonly surfaces as HTTP 400 or HTTP 429.
Last reviewed: February 12, 2026 | Editorial standard: source-backed technical guidance
What Does Throttling Exception Mean?
The service is actively protecting itself from request pressure, so calls are rate-limited until client demand, retry behavior, and quota headroom return to a sustainable level.
Common Causes
- Burst concurrency exceeds per-operation or per-account request quotas.
- Retry logic amplifies load because backoff and jitter are missing or retries are too aggressive.
- Traffic concentrates on hot resources or partitions, saturating localized limits.
- Planned traffic growth outpaced approved quota increases or regional capacity settings.
How to Fix Throttling Exception
1. Implement exponential backoff with full jitter and cap total retry attempts.
2. Throttle client concurrency at the caller to smooth request bursts.
3. Honor service guidance such as Retry-After headers when provided.
4. Request quota increases for sustained demand above current service limits.
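Steps 1 and 3 above can be sketched as a small retry wrapper. This is a minimal illustration, not an SDK feature: `ThrottledError` and its `retry_after` attribute are hypothetical stand-ins for whatever throttling error and server hint your client surfaces.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical throttling error; real SDK errors carry their own codes."""

    def __init__(self, retry_after=None):
        super().__init__("request was throttled")
        self.retry_after = retry_after  # optional server hint, in seconds


def call_with_backoff(fn, max_attempts=5, base=0.2, cap=10.0, sleep=time.sleep):
    """Retry fn() with exponential backoff and full jitter, capping attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted; surface the throttle to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # honor the service's explicit hint
            else:
                # full jitter: uniform over [0, min(cap, base * 2^attempt)]
                delay = random.uniform(0, min(cap, base * 2 ** attempt))
            sleep(delay)
```

Injecting `sleep` keeps the wrapper testable and lets callers route waits through their own scheduler; in production you would typically lean on your SDK's built-in retry configuration instead of hand-rolling this.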
Step-by-Step Diagnosis for Throttling Exception
1. Inspect CloudWatch metrics for throttled requests, latency, and burst concurrency.
2. Break down failures by API action, region, and principal to isolate bottlenecks.
3. Trace retry fan-out in clients and queues to identify self-induced traffic storms.
4. Correlate throttling spikes with deploys, backfills, and autoscaling transitions.
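The breakdown in step 2 can be sketched as a simple aggregation over structured throttle events. The field names (`action`, `region`, `principal`) are assumptions; map them to whatever your log or metric schema actually emits.

```python
from collections import Counter


def throttle_hotspots(events, top=3):
    """Count throttled requests per (action, region, principal) tuple.

    events: iterable of dicts with 'action', 'region', 'principal' keys
            (an assumed schema for illustration).
    Returns the `top` most-throttled tuples with their counts.
    """
    counts = Counter(
        (e["action"], e["region"], e["principal"]) for e in events
    )
    return counts.most_common(top)
```

Ranking by this tuple quickly shows whether throttling is concentrated in one caller or spread across the fleet, which changes whether the fix is client-side shaping or a quota increase.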
Demand and Burst Profiling
- Profile request burst shape by operation and principal (example: one queue consumer shard spikes `GetItem` at 10x baseline).
- Inspect partition-level hot spots and uneven key distribution (example: DynamoDB traffic concentrates on a small key range).
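The uneven-key-distribution check above can be approximated by measuring what share of sampled requests the busiest keys absorb. This is an illustrative client-side heuristic over your own request logs, not an AWS API; the function name and inputs are assumptions.

```python
from collections import Counter


def hot_key_share(keys, top_n=1):
    """Fraction of all sampled requests that hit the top_n busiest keys."""
    if not keys:
        return 0.0
    counts = Counter(keys)
    hottest = sum(n for _, n in counts.most_common(top_n))
    return hottest / len(keys)
```

A share close to 1.0 for a single key suggests a hot partition: localized limits will throttle long before the account-level quota is exhausted, so the remedy is key-distribution design rather than a quota increase.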
Retry and Backpressure Controls
- Audit retry fan-out and jitter quality in every client path (example: synchronized retries at 1s intervals create periodic throttle waves).
- Verify caller-side concurrency guards and queue drain limits (example: autoscaler doubles workers without per-worker API token bucket).
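A per-worker guard like the token bucket mentioned above can be sketched as follows; the rate and capacity values are illustrative, and the injectable clock exists only to keep the sketch testable.

```python
import time


class TokenBucket:
    """Simple token bucket: allow() returns True if a request may proceed."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens refilled per second (steady rate)
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self):
        # Refill proportionally to elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Gating each worker's API calls through `allow()` keeps aggregate demand bounded even when an autoscaler doubles the worker count, because the per-worker budget is explicit rather than implied by worker count.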
How to Verify the Fix
- Confirm throttled request count drops and success rate stabilizes under expected load.
- Validate p95/p99 latency recovers without introducing queue backlogs.
- Re-run load tests to ensure request patterns stay within known quota headroom.
How to Prevent Recurrence
- Use adaptive rate limiting and token-bucket controls in all high-volume clients.
- Continuously monitor quota headroom and auto-open increase requests before saturation.
- Design retry-safe idempotent write paths to avoid duplicate side effects under throttling.
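The retry-safe write path in the last point can be sketched with a client-supplied idempotency key, so a request retried after a throttle does not apply its side effect twice. The in-memory store and key scheme here are assumptions; a real implementation would use a durable store with expiry.

```python
def idempotent_write(store, key, apply_fn):
    """Apply apply_fn() at most once per idempotency key.

    On a retry with the same key, replay the cached result instead of
    re-running the side effect.
    """
    if key in store:
        return store[key]  # retry path: original outcome, no new side effect
    result = apply_fn()
    store[key] = result
    return result
```

With this shape, the backoff-and-retry layer can safely re-send a throttled write without risking duplicate charges, records, or messages.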
Pro Tip
- Reserve a fixed percentage of quota headroom for incident traffic so failover and backfill events do not immediately saturate limits.
Decision Support
- Compare Guide: AWS ThrottlingException vs GCP RESOURCE_EXHAUSTED. Separates rate limiting from quota/resource exhaustion so you can choose the right remediation path.
- Compare Guide: 429 Too Many Requests vs 503 Service Unavailable. Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.
- Playbook: Rate Limit Recovery Playbook (429 / ThrottlingException / RESOURCE_EXHAUSTED). Separates transient throttling from hard quota exhaustion and applies retry, traffic-shaping, and quota-capacity fixes safely.
Provider Context
This guidance is specific to AWS services. Always validate implementation details against official provider documentation before deploying to production.