TooManyRequestsException - Too Many Requests
AWS TooManyRequestsException is a Lambda-specific throttling error indicating that the function has reached its concurrency limit at either the function or account level. The request is rejected immediately without execution to protect the regional infrastructure.
Last reviewed: March 15, 2026 | Editorial standard: source-backed technical guidance
What Does Too Many Requests Mean?
This exception is Lambda's "Air Traffic Control" signal. It triggers when the number of concurrent executions exceeds the allowed quota (default 1,000 per region). Unlike a 504 Timeout, the code never runs, saving you from compute costs but failing the request. It can be caused by account-wide exhaustion (unreserved pool) or reaching a specific function's Reserved Concurrency cap.
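The gatekeeping described above can be pictured as a counting semaphore over in-flight executions. This toy model (not Lambda's actual implementation) shows why a throttled request is rejected instantly rather than queued:

```python
class ConcurrencyGate:
    """Toy model of Lambda's concurrency check: admit or reject, never queue."""

    def __init__(self, limit):
        self.limit = limit          # e.g. the 1,000 default regional quota
        self.in_flight = 0

    def try_acquire(self):
        # Rejection is instant: the function code never runs, so no compute cost.
        if self.in_flight >= self.limit:
            return False            # caller sees TooManyRequestsException (HTTP 429)
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1


gate = ConcurrencyGate(limit=2)
print([gate.try_acquire() for _ in range(3)])
# → [True, True, False]: the third concurrent request is throttled
```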
Common Causes
- Account-Level Exhaustion: A single "runaway" Lambda function consumes all 1,000 regional slots, throttling every other function in your account.
- Reserved Concurrency Ceiling: The function has a hard cap (e.g., 50) that is too low for sudden traffic bursts, even if the account has free capacity.
- Burst Limit Violations: Scaling up too fast (e.g., from 0 to 3,000 in seconds) exceeds the per-minute burst increase limit of the region.
- Upstream Over-Fanout: SQS or Kinesis triggers spawning Lambda instances faster than the function is allowed to scale.
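For the SQS over-fanout case, the event source mapping supports a maximum-concurrency setting. A hedged boto3 sketch follows; the mapping UUID and the value passed in are placeholders, and the boto3 import is kept inside the function because the call requires AWS credentials:

```python
def cap_sqs_trigger_concurrency(mapping_uuid, max_concurrency):
    """Cap how many Lambda instances an SQS event source mapping may run at once."""
    # SQS maximum concurrency cannot be set below 2.
    if max_concurrency < 2:
        raise ValueError("SQS maximum concurrency cannot be lower than 2")
    import boto3  # imported lazily: only needed when actually calling AWS
    client = boto3.client("lambda")
    return client.update_event_source_mapping(
        UUID=mapping_uuid,
        ScalingConfig={"MaximumConcurrency": max_concurrency},
    )
```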
How to Fix Too Many Requests
1. Increase Regional Quota: Use AWS Service Quotas to request a limit increase if your baseline concurrency is consistently above 80% of the limit.
2. Adjust Reserved Concurrency: Ensure critical functions have enough reserved capacity to handle spikes without being throttled by "noisy neighbors."
3. Implement Jittered Backoff: Ensure calling services (API Gateway, SDKs) don't retry instantly, which creates a "retry storm" that sustains the throttle.
4. Limit SQS Batch Size: Reduce the concurrency of the Event Source Mapping to smooth out the processing rate.
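Step 3's jittered backoff can be sketched as a standalone helper. The base delay and cap below are illustrative values, not AWS recommendations:

```python
import random


def backoff_delay_ms(attempt, base_ms=100, cap_ms=5000):
    """Full-jitter exponential backoff: random delay in [0, min(cap, base * 2^attempt)]."""
    ceiling = min(cap_ms, base_ms * (2 ** attempt))
    return random.uniform(0, ceiling)

# Delays grow geometrically until capped, and the randomness de-synchronizes
# retrying clients, which prevents the "retry storm" described above.
```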
Step-by-Step Diagnosis for Too Many Requests
1. Check CloudWatch Metrics: Look for Throttles and ConcurrentExecutions at both the function and account level.
2. Verify "UnreservedConcurrentExecutions": If this is near zero, your account is exhausted; find the "noisy neighbor" function.
3. Identify Burst Throttling: If ConcurrentExecutions is below the limit but Throttles is high, you are scaling faster than the regional burst limit allows.
4. Check Usage Plans: Ensure API Gateway isn't the source of the 429 error before it even reaches Lambda.
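The distinction drawn in steps 2 and 3 can be captured as a small triage helper. The logic below is an illustrative heuristic, not an official AWS diagnostic:

```python
def classify_throttle(throttles, concurrent_executions, account_limit):
    """Rough triage of where a Lambda throttle is coming from, per the steps above."""
    if throttles == 0:
        return "healthy"
    if concurrent_executions >= account_limit:
        return "account-level exhaustion"    # step 2: unreserved pool near zero
    return "burst-limit throttling"          # step 3: throttled while below the quota


print(classify_throttle(throttles=120, concurrent_executions=400, account_limit=1000))
# → burst-limit throttling
```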
Concurrency Type Comparison
- Reserved Concurrency: Guarantees capacity but caps the function. Use to protect critical paths.
- Provisioned Concurrency: Pre-warms instances. Use to eliminate cold starts AND throttling during predictable spikes.
- Unreserved Pool: The shared regional bucket. Risky for production-critical functions due to noisy neighbor effects.
Burst vs. Account Limits
- Burst Limit: Limits how fast you can scale (varies by region, e.g., 500-3,000).
- Account Limit: Limits how much you can scale in total (default 1,000).
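Under the classic burst model described above (an initial regional burst, then a fixed per-minute ramp), the scale-up ceiling at any point is easy to compute. The figures below are illustrative defaults, not a guarantee for your region:

```python
def allowed_concurrency(minutes, initial_burst=3000, account_limit=10000, ramp_per_min=500):
    """Scale-up ceiling under the classic burst model: an initial burst,
    then a fixed per-minute increase, capped by the account limit."""
    return min(account_limit, initial_burst + ramp_per_min * minutes)


print(allowed_concurrency(0))   # → 3000 (initial regional burst)
print(allowed_concurrency(4))   # → 5000 (after four minutes of ramp-up)
```

Requests arriving above this ceiling are throttled even though the account limit has not been reached, which is exactly the "burst vs. account" distinction this section draws.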
Implementation Examples

```javascript
// Lambda throttles are transient; retrying with jitter is key.
import { ConfiguredRetryStrategy } from "@smithy/util-retry";

const config = {
  maxAttempts: 5,
  retryStrategy: new ConfiguredRetryStrategy(5, (attempt) => {
    // Exponential backoff with up to 50 ms of jitter, capped at 5 seconds.
    return Math.min(100 * Math.pow(2, attempt) + Math.random() * 50, 5000);
  }),
};
```

```python
import boto3

# Check whether reserved concurrency is set on the function
client = boto3.client('lambda')
response = client.get_function_concurrency(FunctionName='my-function')
print(f"Reserved Concurrency: {response.get('ReservedConcurrentExecutions')}")
```

How to Verify the Fix
- Monitor the Throttles metric in CloudWatch; it should return to a steady zero.
- Verify that ConcurrentExecutions peaks remain safely below your new Service Quota limits.
- Test with an upstream load generator to ensure exponential backoff logic is effectively smoothing the traffic.
How to Prevent Recurrence
- Set CloudWatch Alarms: Trigger alerts when concurrency hits 70% of the regional or reserved limit.
- Use Provisioned Concurrency: For marketing campaigns or scheduled events where traffic spikes are known in advance.
- Leaky Bucket Pattern: Use API Gateway Throttling/Usage Plans to reject excess traffic at the edge before it hits Lambda.
- Pro-tip: Never set Reserved Concurrency to 0 unless you want to "kill-switch" a function during an incident.
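The 70% alarm from the first prevention item can be sketched with boto3. The alarm name is a placeholder, and the boto3 import stays inside the function since creating the alarm requires AWS credentials:

```python
def alarm_threshold(limit, fraction=0.7):
    """Concurrency level at which the alarm should fire (default 70% of the limit)."""
    return int(limit * fraction)


def create_concurrency_alarm(limit=1000):
    import boto3  # lazy import: only needed when actually creating the alarm
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(
        AlarmName="lambda-concurrency-70pct",  # placeholder name
        Namespace="AWS/Lambda",
        MetricName="ConcurrentExecutions",
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=3,
        Threshold=alarm_threshold(limit),
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=[],  # add an SNS topic ARN here to be notified
    )


print(alarm_threshold(1000))  # → 700
```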
Decision Support
- Compare Guide: 429 Too Many Requests vs 503 Service Unavailable. Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.
- Compare Guide: AWS ThrottlingException vs GCP RESOURCE_EXHAUSTED. Compare the two to separate rate limiting from quota/resource exhaustion and choose the remediation path.
- Playbook: Rate Limit Recovery Playbook (429 / ThrottlingException / RESOURCE_EXHAUSTED). Use this playbook to separate transient throttling from hard quota exhaustion and apply retry, traffic-shaping, and quota-capacity fixes safely.
Official References
Provider Context
This guidance is specific to AWS services. Always validate implementation details against official provider documentation before deploying to production.