Rate limits

What you get per tier, how the headers tell you where you stand, and how to back off.

Limits are enforced per project, measured over a sliding 60-second window, with separate buckets for run creation and read traffic. Hitting a limit returns 429 with a Retry-After header — honor it; retrying early extends the window.
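
To stay under a limit instead of running into it, a client can mirror the sliding window locally. The sketch below is only an illustration of the idea: the `SlidingWindow` class and its `tryAcquire` method are hypothetical names, not part of any SDK, and the server's actual accounting may differ.

```typescript
// Client-side sliding-window counter (illustrative; not the server's implementation).
class SlidingWindow {
  private stamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if it fits in the window ending at `now`.
  tryAcquire(now: number = Date.now()): boolean {
    // Drop timestamps that have aged out of the window.
    for (;;) {
      const oldest = this.stamps[0];
      if (oldest === undefined || now - oldest < this.windowMs) break;
      this.stamps.shift();
    }
    if (this.stamps.length >= this.limit) return false;
    this.stamps.push(now);
    return true;
  }
}

// Example: a Free-tier project gets 10 run creations per minute.
const runBucket = new SlidingWindow(10);
if (!runBucket.tryAcquire()) {
  // Over the local budget: wait before creating another run.
}
```

Because run creation and reads are counted in separate buckets, a client using this approach would keep one instance per bucket.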

Limits by tier

tier	runs / min	reads / min	embeddings / min	concurrent deep runs
Free	10	60	300	1
Pro	60	600	3,000	4
Scale	300	3,000	20,000	16
Enterprise	custom	custom	custom	custom

Response headers

header	meaning
X-RateLimit-Limit	Bucket size for this endpoint class.
X-RateLimit-Remaining	Requests left in the current window.
X-RateLimit-Reset	Unix timestamp, in seconds, at which the window refills.
Retry-After	On 429 only — seconds to wait before retrying.
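
As a rough sketch of how a raw HTTP client might read these headers, the helper below pulls the three `X-RateLimit-*` values off a response. The names `readRateLimit`, `RateLimitState`, and `HeaderSource` are made up for this example; `HeaderSource` is a minimal structural type that a `fetch` response's `headers` object satisfies.

```typescript
// Anything with a get(name) lookup, such as the `headers` of a fetch Response.
type HeaderSource = { get(name: string): string | null };

interface RateLimitState {
  limit: number | undefined;     // X-RateLimit-Limit
  remaining: number | undefined; // X-RateLimit-Remaining
  reset: number | undefined;     // X-RateLimit-Reset, passed through uninterpreted
}

// Convert a header value to a number, or undefined if missing or malformed.
function toNumber(value: string | null): number | undefined {
  if (value === null || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

function readRateLimit(headers: HeaderSource): RateLimitState {
  return {
    limit: toNumber(headers.get("X-RateLimit-Limit")),
    remaining: toNumber(headers.get("X-RateLimit-Remaining")),
    reset: toNumber(headers.get("X-RateLimit-Reset")),
  };
}
```

A caller could check `remaining` after each response and slow down as it approaches zero, rather than waiting for a 429.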

Backoff that behaves

Use exponential backoff with jitter, cap at 60 seconds, and treat Retry-After as a floor, not a suggestion. The SDKs do this automatically; raw HTTP integrations should copy the pattern.

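async function withBackoff<T>(fn: () => Promise<T>): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const e = err as { status?: number; retryAfter?: number };
      // Only retry rate-limit errors, and give up after five retries.
      if (e.status !== 429 || attempt >= 5) throw err;
      // Cap the exponential term at 60 s, but never wait less than Retry-After.
      const backoff = Math.min(2 ** attempt, 60);
      const base = Math.max(e.retryAfter ?? 0, backoff);
      // Add up to 1 s of jitter so clients do not retry in lockstep.
      await new Promise((r) => setTimeout(r, (base + Math.random()) * 1000));
    }
  }
}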
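
For raw HTTP integrations, `withBackoff` needs errors that carry `status` and `retryAfter`. One possible way to produce them is sketched below; `HttpError`, `parseRetryAfter`, and `throwIfNotOk` are hypothetical helpers written for this example, not SDK exports. The page documents Retry-After as seconds; the HTTP-date branch is only a defensive fallback, since HTTP in general also allows that form.

```typescript
// Parse Retry-After into seconds. A numeric value is used as-is; an HTTP-date
// (also permitted by HTTP) is converted to a delay relative to `nowMs`.
function parseRetryAfter(value: string | null, nowMs: number = Date.now()): number | undefined {
  if (value === null || value.trim() === "") return undefined;
  const seconds = Number(value);
  if (Number.isFinite(seconds)) return Math.max(0, seconds);
  const dateMs = Date.parse(value);
  return Number.isNaN(dateMs) ? undefined : Math.max(0, (dateMs - nowMs) / 1000);
}

// Hypothetical error shape matching what withBackoff reads: { status, retryAfter }.
class HttpError extends Error {
  status: number;
  retryAfter: number | undefined;
  constructor(status: number, retryAfter?: number) {
    super(`HTTP ${status}`);
    this.status = status;
    this.retryAfter = retryAfter;
  }
}

// Minimal structural view of a fetch Response, so no DOM types are required.
type ResponseLike = {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
};

// Throw an HttpError for any non-2xx response, attaching Retry-After if present.
function throwIfNotOk(res: ResponseLike): void {
  if (res.ok) return;
  throw new HttpError(res.status, parseRetryAfter(res.headers.get("Retry-After")));
}

// Usage with the withBackoff helper above (`url` is a placeholder):
//   const data = await withBackoff(async () => {
//     const res = await fetch(url);
//     throwIfNotOk(res);
//     return res.json();
//   });
```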
