SubscriptionRequestsThrottled
ARM returns `SubscriptionRequestsThrottled` when a subscription- or tenant-level request limit is exceeded and the control plane begins throttling further requests.
Last reviewed: February 12, 2026 | Editorial standard: source-backed technical guidance
What Does Subscription Requests Throttled Mean?
Control-plane calls are rate-limited, so deployments and automation slow down or fail until request volume returns below throttling thresholds.
Common Causes
- High parallel deployment activity saturates subscription-level ARM request buckets.
- Retry storms ignore server pacing and amplify control-plane pressure.
- Multiple service principals hit the same throttled operation family simultaneously.
- Metrics-heavy polling patterns consume disproportionate ARM request capacity.
How to Fix Subscription Requests Throttled
1. Honor `Retry-After` exactly and pause retries until the server-advised window expires.
2. Apply exponential backoff with jitter and strict retry budgets in all callers.
3. Reduce concurrent deployment workers and batch high-volume management reads.
4. Move high-frequency metrics collection to batch APIs where available.
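The first two fixes can be sketched as a single delay calculation: a server-advised `Retry-After` always wins, and only in its absence does the client fall back to capped exponential backoff with full jitter. This is an illustrative helper, not an Azure SDK API; the function name and defaults are assumptions.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before the next retry.

    If the server sent Retry-After, honor it exactly; otherwise use
    exponential backoff with full jitter, capped at `cap` seconds.
    `base` and `cap` are illustrative defaults, not ARM-mandated values.
    """
    if retry_after is not None:
        return float(retry_after)  # server pacing takes precedence
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

Full jitter (sampling uniformly from zero up to the cap) breaks up synchronized retry waves from parallel workers, which is exactly the retry-storm pattern listed under common causes.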
Step-by-Step Diagnosis for Subscription Requests Throttled
1. Capture failing responses, including 429 status, `Retry-After`, and request correlation IDs.
2. Identify which operation type and scope are throttled (subscription, tenant, or provider-specific).
3. Measure caller concurrency, retry fan-out, and burst windows during throttling events.
4. Retest with lower concurrency and compliant backoff to confirm throttling recovery.
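Step 1 amounts to pulling a few headers off every throttled response before it is discarded. A minimal sketch, assuming headers arrive as a dict; the header names follow ARM conventions (`Retry-After`, `x-ms-correlation-request-id`, `x-ms-ratelimit-remaining-subscription-reads`), but verify them against your HTTP client and the ARM throttling documentation.

```python
def extract_throttle_evidence(status, headers):
    """Collect the fields worth logging from a throttled ARM response.

    Returns None for non-429 responses; otherwise a dict of the
    evidence needed for scope attribution and support escalation.
    """
    if status != 429:
        return None
    return {
        "retry_after": headers.get("Retry-After"),
        "correlation_id": headers.get("x-ms-correlation-request-id"),
        "remaining_reads": headers.get(
            "x-ms-ratelimit-remaining-subscription-reads"
        ),
    }
```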
Throttle Scope Attribution
- Distinguish subscription-wide throttling from provider-specific throttling (example: `Microsoft.Compute` operations throttled while unrelated provider calls still succeed).
- Correlate throttled requests with operation types and principal identities (example: read-heavy inventory job drains tokens before deployment window).
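Scope attribution can often be done offline from request logs by counting 429s per resource-provider namespace. A sketch under the assumption that each log event carries the request URL and status, and that ARM URLs use the standard `/providers/<Namespace>/` path segment; events with no provider segment are bucketed as subscription-scope.

```python
from collections import Counter


def throttle_by_provider(events):
    """Count 429 events per resource-provider namespace.

    `events` is an iterable of dicts with 'status' and 'url' keys
    (a hypothetical log shape; adapt to your logging pipeline).
    """
    counts = Counter()
    for event in events:
        if event.get("status") != 429:
            continue
        parts = event["url"].split("/providers/")
        namespace = parts[1].split("/")[0] if len(parts) > 1 else "<subscription-scope>"
        counts[namespace] += 1
    return counts
```

If one namespace dominates the counts, the throttle is provider-specific; an even spread across namespaces points at subscription-wide limits.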
Client Retry and Concurrency Shaping
- Audit retry logic for missing jitter or unbounded retries (example: synchronized workers retry at the same second and recreate bursts).
- Trace rollout orchestration fan-out (example: one release launches parallel resource-group deploys beyond safe request budget).
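Fan-out shaping usually reduces to a single choke point that every control-plane call passes through. A minimal sketch using a semaphore; `max_inflight` is a hypothetical per-pipeline budget to be tuned against observed throttling, not a documented ARM limit.

```python
import threading


class DeploymentGate:
    """Cap concurrent control-plane calls with a semaphore.

    Route every deploy/management call through run() so total
    in-flight requests never exceed max_inflight.
    """

    def __init__(self, max_inflight=4):
        self._sem = threading.Semaphore(max_inflight)

    def run(self, fn, *args, **kwargs):
        with self._sem:  # blocks when the budget is fully in use
            return fn(*args, **kwargs)
```

Placing the gate in shared middleware, rather than per-worker, is what prevents one release from launching parallel resource-group deploys beyond the budget.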
How to Verify the Fix
- Repeat throttled workflows and confirm 429 responses drop to expected baseline levels.
- Verify end-to-end deployment latency stabilizes after backoff and concurrency adjustments.
- Track ARM metrics to ensure sustained operation below throttling thresholds.
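"Expected baseline levels" is easiest to verify as a simple ratio computed over a sample window of response statuses; compare the ratio before and after the backoff and concurrency changes. An illustrative helper, not part of any SDK:

```python
def throttle_rate(statuses):
    """Fraction of requests in a sample window that returned 429."""
    if not statuses:
        return 0.0
    return statuses.count(429) / len(statuses)
```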
How to Prevent Recurrence
- Define per-environment request budgets and enforce them in deployment orchestrators.
- Use centralized retry middleware with jitter, cap, and circuit-breaker behavior.
- Schedule bursty control-plane jobs outside critical deployment windows.
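A per-environment request budget is commonly enforced as a token bucket in the orchestrator: requests spend tokens, and tokens refill at a steady rate. This is a minimal sketch; `capacity` and `refill_per_sec` are assumptions to be sized from the request volumes and limits you actually observe on the subscription.

```python
import time


class RequestBudget:
    """Token bucket enforcing a per-environment request budget."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def try_acquire(self, n=1):
        """Spend n tokens if available; otherwise refuse (caller defers)."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A refused acquire is the orchestrator's cue to defer the job rather than fire it and absorb a 429, which keeps bursty work out of critical deployment windows by construction.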
Pro Tip
- Stagger pipelines by operation class (read/list vs write/deploy) so token buckets refill predictably instead of competing in the same burst window.
Decision Support
- Compare Guide: 429 Too Many Requests vs 503 Service Unavailable. Use 429 for caller-specific throttling and 503 for service-wide outages, so retry behavior, escalation paths, and incident ownership stay correct.
- Compare Guide: AWS ThrottlingException vs GCP RESOURCE_EXHAUSTED. Compare AWS ThrottlingException and GCP RESOURCE_EXHAUSTED to separate rate limiting from quota/resource exhaustion and choose the remediation path.
- Playbook: Rate Limit Recovery Playbook (429 / ThrottlingException / RESOURCE_EXHAUSTED). Use this playbook to separate transient throttling from hard quota exhaustion and apply retry, traffic-shaping, and quota-capacity fixes safely.
Provider Context
This guidance is specific to Azure services. Always validate implementation details against official provider documentation before deploying to production.