Use this playbook to separate transient throttling from hard quota exhaustion and apply retry, traffic-shaping, and quota-capacity fixes safely.
Last reviewed: February 24, 2026|Editorial standard: source-backed operational guidance
429 and ThrottlingException usually signal rate pacing. RESOURCE_EXHAUSTED often signals that a defined quota or rate bucket ran out at a specific scope.
Unbounded or synchronized retries can amplify load and trigger more throttling. Scope mismatch, shared-tenant contention, or hard quota ceilings can also keep failures active.
Use exponential backoff with jitter for retryable operations. Retry only idempotent actions unless you enforce idempotency keys for writes. Cap total attempts per request path to prevent retry storms. Stop retries when error class and quota signals show no short-window recovery path.
Request a quota increase when metrics show a sustained hard ceiling on a critical quota dimension under normal traffic. Reshape traffic first when burst throttling drives failures during short spikes or synchronized retry waves. Use pacing, concurrency caps, batching, and scheduling to restore headroom without quota changes. Request more quota only after traffic shaping fails to keep utilization below enforcement thresholds.