AWS

RequestLimitExceeded

AWS RequestLimitExceeded means an AWS API request-rate or account-level request quota was exceeded for the target service, operation, region, or account context.

Last reviewed: April 30, 2026 | Source-backed guidance under our editorial policy

Start Here

Use the closest compare guide, playbook, or adjacent error page to narrow the decision faster before you start changing production systems.

What Does Request Limit Exceeded Mean?

AWS accepted the request far enough to classify it, then blocked execution because the caller is exceeding a request-rate control. This is different from resource-count quota errors and different from DynamoDB table throughput errors. The useful boundary is attribution: which service, operation, account, region, principal, retry path, and scheduled workload is consuming the request budget.

Common Causes

- High-concurrency clients or Lambda fan-out submit the same AWS API operation faster than the regional/account limit allows.
- Retry logic amplifies load because callers retry immediately or synchronize around the same backoff interval.
- Health checks, inventory collectors, or IaC verification loops poll control-plane APIs such as describe/list operations too aggressively.
- Scheduled backup, deployment, or cleanup jobs start at the same minute and consume shared request budget.
- Traffic shifts or regional failover move workload into an account/region with lower approved quota headroom.

How to Fix Request Limit Exceeded

1. Identify the exact service, API operation, region, account, and principal producing RequestLimitExceeded.
2. Enable standard or adaptive AWS SDK retry mode with capped exponential backoff and full jitter.
3. Add caller-side concurrency limits or token buckets per operation before retry traffic reaches AWS.
4. Cache stable metadata and replace frequent describe/list polling with event-driven or scheduled refreshes.
5. Stagger batch, backup, inventory, and IaC jobs; request quota increases only after measured sustained demand is clear.
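
The backoff in step 2 can be sketched as a small delay function. This is a minimal illustration of capped exponential backoff with full jitter, not the SDK's internal implementation; the function name, the 100 ms base, the 20 s cap, and the injectable random source are all assumptions chosen for the example.

```javascript
// Capped exponential backoff with full jitter (illustrative sketch).
// baseMs, capMs, and the injectable `random` are assumptions, not AWS defaults.
function fullJitterDelayMs(attempt, baseMs = 100, capMs = 20000, random = Math.random) {
  // Exponential ceiling for this retry (attempt 0 is the first retry), capped at capMs.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick uniformly in [0, ceiling) so callers do not retry in lockstep.
  return Math.floor(random() * ceiling);
}

// Print one sampled delay for each of the first five retries.
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`retry ${attempt + 1}: sleep ${fullJitterDelayMs(attempt)} ms`);
}
```

Because every delay is drawn from the whole range rather than clustered near the ceiling, callers that were throttled at the same moment tend to spread out on the next attempt.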

Step-by-Step Diagnosis for Request Limit Exceeded

1. Capture request ID, service endpoint, operation name, region, account ID, principal ARN, retry attempt, and SDK retry mode from failing calls.
2. Use CloudTrail, service metrics, and application traces to rank callers by request volume during the failure window.
3. Correlate spikes with autoscaling, Lambda cold starts, deployment waves, scheduled jobs, backfills, or regional failover.
4. Separate request-rate throttling from table throughput, hard resource quota, and backend availability errors.
5. Replay under controlled concurrency after adding backoff and caller-side limits to confirm the request budget recovers.
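
Step 2 amounts to a group-and-count over request records. The sketch below assumes the records have already been exported and reduced to a hypothetical `{ principal, operation }` shape (for example from CloudTrail's `userIdentity.arn` and `eventName` fields); the function name, the sample ARNs, and the account ID are illustrative only.

```javascript
// Rank (principal, operation) pairs by request count, busiest first.
// The record shape { principal, operation } is a hypothetical, pre-extracted form.
function rankCallers(records) {
  const counts = new Map();
  for (const { principal, operation } of records) {
    // A JSON-encoded pair is an unambiguous composite key.
    const key = JSON.stringify([principal, operation]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, count]) => {
      const [principal, operation] = JSON.parse(key);
      return { principal, operation, count };
    })
    .sort((a, b) => b.count - a.count);
}

// Illustrative records for a throttle window (placeholder account ID).
const sampleRecords = [
  { principal: "arn:aws:iam::111122223333:role/inventory", operation: "DescribeTable" },
  { principal: "arn:aws:iam::111122223333:role/inventory", operation: "DescribeTable" },
  { principal: "arn:aws:iam::111122223333:role/api", operation: "DescribeTable" },
];
console.log(rankCallers(sampleRecords));
```

Ranking by the pair rather than by principal alone keeps a caller that spreads load across several operations from hiding the one operation that is actually throttled.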

Seen in Production

- A Lambda fleet calls DescribeTable on every invocation and hits regional control-plane request limits during scale-out.
- A nightly backup script starts CreateBackup across hundreds of resources at the same second.
- A CI pipeline runs parallel Terraform plans that all verify the same AWS resources with describe/list calls.
- An incident failover doubles API traffic into one region where quota headroom was sized only for baseline traffic.
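
For the nightly backup case, one way to avoid a same-second start is to give each job its own slot inside a start window and add jitter within the slot. The sketch below is one possible approach under assumed names and numbers (`staggeredOffsetsMs`, a ten-minute window); it only computes start offsets and leaves scheduling to the caller.

```javascript
// Spread jobCount job starts across windowMs: one slot per job, jitter inside the slot.
// The function name and the injectable `random` are illustrative assumptions.
function staggeredOffsetsMs(jobCount, windowMs, random = Math.random) {
  const slotMs = windowMs / jobCount;
  return Array.from({ length: jobCount }, (_, i) =>
    Math.floor(i * slotMs + random() * slotMs)
  );
}

// Example: start offsets for 5 backups spread over a 10-minute window.
console.log(staggeredOffsetsMs(5, 10 * 60 * 1000));
```

Each offset can then be passed to a timer or a queue delay so that requests arrive at a roughly even rate instead of in a single burst.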

Request Budget Attribution

- Break down throttles by service, operation, account, region, and principal before changing quotas.
- Identify whether traffic comes from user requests, worker backfills, health checks, inventory collectors, or IaC tooling.

Retry and Concurrency Discipline

- Inspect retry mode, max attempts, jitter quality, and retry budget across every caller.
- Add per-operation token buckets so one noisy workflow cannot consume the entire shared request budget.
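
A per-operation token bucket can be sketched in a few lines. This is a minimal in-process illustration, assuming a single Node.js process and made-up rates; the class name, `bucketFor`, and the default rate and burst values are hypothetical and not derived from any AWS limit.

```javascript
// Minimal token bucket: refills at ratePerSec, holds at most `burst` tokens.
// `now` is injectable so the behavior can be checked with a fake clock.
class TokenBucket {
  constructor(ratePerSec, burst, now = Date.now) {
    this.ratePerSec = ratePerSec;
    this.burst = burst;
    this.now = now;
    this.tokens = burst;
    this.lastRefillMs = now();
  }

  // Returns true and spends one token if the call may proceed, otherwise false.
  tryTake() {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per operation name, so a noisy operation cannot starve the others.
const buckets = new Map();
function bucketFor(operation, ratePerSec = 5, burst = 10) {
  if (!buckets.has(operation)) {
    buckets.set(operation, new TokenBucket(ratePerSec, burst));
  }
  return buckets.get(operation);
}

// Example gate before an API call (rates are placeholders, not AWS limits).
if (bucketFor("DescribeTable").tryTake()) {
  console.log("send DescribeTable");
} else {
  console.log("defer DescribeTable");
}
```

A bucket held in process memory limits only that process; a fleet of callers would need the per-process rate sized to a share of the overall budget, or a shared limiter.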

Decision Shortcut: Rate vs Capacity vs Quota

- If API request rate is too high, stay on RequestLimitExceeded and shape caller demand.
- If DynamoDB read/write capacity on a table is exhausted, inspect ProvisionedThroughputExceededException.
- If a hard resource quota is exhausted, inspect ServiceQuotaExceededException or LimitExceededException.
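
The shortcut above can be encoded as a small lookup so that logs and alerts carry the category alongside the raw code. The category labels and the function name below are assumptions made for this sketch; the error codes are the ones listed above.

```javascript
// Map an AWS error code to the decision category used on this page.
// Category labels ("request-rate", etc.) are illustrative, not AWS terminology.
function classifyAwsErrorCode(code) {
  switch (code) {
    case "RequestLimitExceeded":
      return "request-rate"; // shape caller demand
    case "ProvisionedThroughputExceededException":
      return "table-capacity"; // inspect DynamoDB read/write capacity
    case "ServiceQuotaExceededException":
    case "LimitExceededException":
      return "resource-quota"; // inspect hard resource quotas
    default:
      return "other";
  }
}

console.log(classifyAwsErrorCode("RequestLimitExceeded"));
```

In a caller this would typically receive the code from the caught error object; which property holds the code depends on the SDK and version in use, so treat that wiring as something to confirm against the SDK documentation.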

Wrong Fix to Avoid

- Do not increase retry counts without adding jitter and total retry budgets.
- Do not request quota increases before identifying the caller and operation that consumes the budget.
- Do not poll stable metadata in every request path when cached or event-driven state is sufficient.

Implementation Examples

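Cache metadata outside the hot Lambda path (JavaScript)

import { DynamoDBClient, DescribeTableCommand } from "@aws-sdk/client-dynamodb";

// Create the client and cache OUTSIDE the handler so warm invocations reuse them
const ddbClient = new DynamoDBClient({});
const metadataCache = new Map();
const CACHE_TTL_MS = 300000; // 5 minutes

export const handler = async (event) => {
  const tableName = "Orders";
  let metadata = metadataCache.get(tableName);

  // Call DescribeTable only on a cold cache or after the TTL has expired
  if (!metadata || (Date.now() - metadata.ts > CACHE_TTL_MS)) {
    const data = await ddbClient.send(new DescribeTableCommand({ TableName: tableName }));
    metadata = { data: data.Table, ts: Date.now() };
    metadataCache.set(tableName, metadata);
  }

  return metadata.data;
};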

Find noisy callers in CloudTrail during the throttle window (bash)

# Count DescribeTable calls per caller in the window, busiest first
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DescribeTable \
  --start-time 2026-04-30T07:25:00Z \
  --end-time 2026-04-30T07:45:00Z \
  --query 'Events[].Username' \
  --output text | tr '\t' '\n' | sort | uniq -c | sort -rn

Enable adaptive retry mode for AWS SDK clients (bash)

export AWS_RETRY_MODE=adaptive
export AWS_MAX_ATTEMPTS=5

Incident Timeline

07:31 UTC

Caller concurrency or polling ramps up

Signal: Autoscaling, scheduled jobs, deployment tooling, or retry fan-out increases calls to one AWS API operation.

Why it matters: The request budget starts being consumed by a specific caller and operation.

07:32 UTC

AWS returns RequestLimitExceeded

Signal: Requests fail even though payloads and credentials are valid.

Why it matters: The next action is request attribution and demand shaping, not payload debugging.

07:36 UTC

Retries amplify the pressure

Signal: Clients without jitter retry together and create waves of additional throttled requests.

Why it matters: Retry behavior becomes part of the incident if it is not budgeted.

07:48 UTC

Caller-side rate limits restore headroom

Signal: Concurrency is capped, metadata calls are cached, and non-critical jobs are staggered.

Why it matters: Sustained quota increases can be evaluated after the request storm is controlled.

Seen in Production

Health checks flood control-plane describe APIs

Frequency: common

Example: A fleet repeatedly calls DescribeTable or equivalent describe/list APIs on every health check.

Fix: Cache metadata and move liveness checks to lightweight data-plane or local health signals.

Failover doubles API traffic into one region

Frequency: medium

Example: Workers from two regions drain into the same AWS region after failover and exceed approved request-rate headroom.

Fix: Apply per-region token buckets and request quota increases in advance for failover scenarios.

Wrong Fix vs Better Fix

More retries vs retry budget

Wrong fix: Increase max retry attempts globally.

Better fix: Use capped exponential backoff with full jitter and a per-operation retry budget.

Why this is better: Unbounded extra retries only consume the shared request budget faster.
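
A retry budget can be sketched as a token pool that successful calls refill slowly and retries drain. The class name, the 0.1 refill ratio, and the pool size of 10 below are illustrative assumptions, not values taken from any AWS SDK.

```javascript
// Retry budget: each retry spends one token; each success earns back `ratio` tokens.
// With ratio 0.1, sustained retries are limited to roughly 10% of successful calls.
class RetryBudget {
  constructor(ratio = 0.1, maxTokens = 10) {
    this.ratio = ratio;
    this.maxTokens = maxTokens;
    this.tokens = maxTokens;
  }

  // Call after every successful request.
  recordSuccess() {
    this.tokens = Math.min(this.maxTokens, this.tokens + this.ratio);
  }

  // Call before every retry; a false result means fail fast instead of retrying.
  tryRetry() {
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One budget per operation keeps a throttled operation from draining the others.
const describeTableBudget = new RetryBudget();
console.log(describeTableBudget.tryRetry());
```

When the pool is empty the caller stops retrying until successes return, which is the property that keeps a throttling event from turning into a retry storm.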

Quota request first vs caller attribution first

Wrong fix: Open a quota increase request before knowing which operation is noisy.

Better fix: Rank callers and operations, remove accidental polling, then request quota for sustained legitimate demand.

Why this is better: Many RequestLimitExceeded incidents are self-inflicted by polling or retry storms.

Poll every request vs cache metadata

Wrong fix: Call describe/list APIs on every Lambda invocation or health check.

Better fix: Cache metadata with TTL, load it outside hot handlers, or refresh from events.

Why this is better: Control-plane metadata usually changes slowly and should not consume request budget on every user path.

Debugging Tools

- CloudTrail lookup by event name and principal
- CloudWatch throttled request metrics
- AWS SDK retry-mode logs
- X-Ray or distributed traces for duplicate metadata calls
- Service Quotas usage and limit views

How to Verify the Fix

- Confirm RequestLimitExceeded drops for the affected operation under representative traffic.
- Validate p95/p99 latency and success rate recover without creating queue backlogs.
- Verify retry attempts per successful operation fall after jitter, caching, and concurrency limits ship.
- Run a controlled load test that includes scheduled jobs and failover traffic against known quota headroom.
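
The "retry attempts per successful operation" check can be computed from per-call samples. The sketch below assumes a hypothetical sample shape `{ attempts, succeeded }` collected from client logs or traces; the function name and the numbers are illustrative, not measured data.

```javascript
// Average number of retries spent per successful operation.
// Sample shape { attempts, succeeded } is a hypothetical log-derived record;
// attempts counts the first try, so retries = attempts - 1.
function retriesPerSuccess(samples) {
  let retries = 0;
  let successes = 0;
  for (const { attempts, succeeded } of samples) {
    retries += Math.max(0, attempts - 1);
    if (succeeded) successes += 1;
  }
  // No successes means the ratio is undefined; report null rather than a number.
  return successes === 0 ? null : retries / successes;
}

// Illustrative comparison of two windows (made-up samples).
const beforeFix = [
  { attempts: 3, succeeded: true },
  { attempts: 4, succeeded: true },
  { attempts: 5, succeeded: false },
];
const afterFix = [
  { attempts: 1, succeeded: true },
  { attempts: 2, succeeded: true },
];
console.log(retriesPerSuccess(beforeFix), retriesPerSuccess(afterFix));
```

Retries from failed operations are counted in the numerator on purpose, since they consume request budget whether or not the call eventually succeeds.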

How to Prevent Recurrence

- Set per-service and per-operation request budgets in clients, workers, and deployment tooling.
- Cache stable AWS metadata and centralize inventory snapshots instead of polling from every service.
- Stagger maintenance, backup, and IaC workflows with queues or schedules that respect quota headroom.
- Monitor throttles by operation/principal and alert before retry amplification affects user traffic.

Pro Tip

- Reserve emergency request-rate headroom by throttling non-critical background jobs whenever request utilization crosses a defined threshold.
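
One way to express this tip is an admission check that background work must pass before it issues an API call. The sketch assumes utilization is tracked elsewhere as observed request rate divided by the request budget; the function name, the priority labels, and the 0.7 threshold are hypothetical.

```javascript
// Admit critical work always; admit background work only below the threshold.
// `utilization` is assumed to be observed request rate / request budget (0..1).
function shouldAdmit(priority, utilization, threshold = 0.7) {
  if (priority === "critical") return true;
  return utilization < threshold;
}

// Example: at 85% utilization, user traffic proceeds and a backfill is deferred.
console.log(shouldAdmit("critical", 0.85));
console.log(shouldAdmit("background", 0.85));
```

The gap between the threshold and full utilization is the headroom held back for critical callers during a spike.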

Provider Context

This guidance is specific to AWS services. Always validate implementation details against official provider documentation before deploying to production.

