Rate Limits

Rate limits protect the shared SaaS infrastructure. Self-hosted FOSS deployments have a single operator-tunable cap.

Headers

Every rate-limited response includes:

X-RateLimit-Limit:      60
X-RateLimit-Remaining:  37
X-RateLimit-Reset:      2026-04-24T17:43:00Z
Retry-After:            12

On a 429, Retry-After is the number of seconds to wait before sending the next request to the same endpoint. Clients must honour it; hammering through 429s will extend the cooldown or trigger the abuse limiter.
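As a rough illustration, a client might read these headers into a small structure before deciding how long to wait. This is a minimal sketch: `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, not part of any yipyap client library, and the sketch assumes the header values follow the formats in the example above (integers, plus an ISO 8601 UTC timestamp for the reset time).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_at: datetime
    retry_after_s: Optional[int]  # None when the header is absent


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Parse the rate-limit headers from a plain mapping of header values."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    # The example reset value ends in "Z"; swap it for an explicit offset so
    # datetime.fromisoformat accepts it on Python versions before 3.11.
    reset_at = datetime.fromisoformat(h["x-ratelimit-reset"].replace("Z", "+00:00"))
    retry_after = h.get("retry-after")
    return RateLimitInfo(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=reset_at,
        retry_after_s=int(retry_after) if retry_after is not None else None,
    )


info = parse_rate_limit_headers({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "37",
    "X-RateLimit-Reset": "2026-04-24T17:43:00Z",
    "Retry-After": "12",
})
print(info.remaining, info.reset_at.isoformat(), info.retry_after_s)
```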

Per-plan caps (SaaS)

These caps are per-org, not per-key. They apply across all keys the org has issued.

Endpoint family	Free	Pro	Enterprise
`GET /monitors`, `/alerts`, catalog reads	60/min	300/min	1200/min
`POST /events` (inbound events)	60/min	600/min	3000/min
`POST /heartbeat/{monitorID}`	1/s per monitor	1/s per monitor	1/s per monitor
`POST /cloudevents/ingest/{token}`	not available	300/min	1200/min
`GET /cloudevents/poll`	not available	30/min	60/min
`GET /cloudevents/stream`	not available	5 concurrent	10 concurrent
Reply-audit & admin reads	60/min	300/min	1200/min
Write endpoints (create/update/delete)	60/min	120/min	600/min

Knative Eventing endpoints are plan-gated: Free tier receives 402 plan_gate_blocked, not 429.
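Because a 402 will not clear by waiting, it helps to keep it out of the retry path. The sketch below shows one way a client could separate the two cases; `Action` and `classify_status` are hypothetical names, and the sketch looks only at the HTTP status code.

```python
from enum import Enum


class Action(Enum):
    WAIT_AND_RETRY = "wait_and_retry"        # 429: honour Retry-After, then retry
    PLAN_GATED = "plan_gated"                # 402: retrying will not help
    NOT_RATE_RELATED = "not_rate_related"    # anything else: handle elsewhere


def classify_status(status: int) -> Action:
    """Map an HTTP status code to how the rate-limit layer should react."""
    if status == 429:
        return Action.WAIT_AND_RETRY
    if status == 402:
        return Action.PLAN_GATED
    return Action.NOT_RATE_RELATED


print(classify_status(402).value)
```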

Per-plan caps (FOSS / self-hosted)

A single limit applies (default 60/min, configurable via YIPYAP_RATE_LIMIT_RPM). Operators tuning for load should size this to the expected integration volume, not the number of users.

Reply-dispatch caps (Knative Eventing)

Replies that flow back into yipyap (sinks acknowledging, escalating, routing, etc.) are capped separately by type:

Reply type	Pro	Enterprise	FOSS
`run.yipyap.reply.alert.claimed.v1`	60/min	120/min	120/min
`run.yipyap.reply.alert.acknowledged.v1`	30/min	60/min	60/min
`run.yipyap.reply.alert.suppressed.v1`	30/min	60/min	60/min
`run.yipyap.reply.alert.escalated.v1`	10/min	30/min	30/min
`run.yipyap.reply.alert.route.v1`	10/min	30/min	30/min
`run.yipyap.reply.monitor.deregister.v1`	5/min	15/min	15/min

See Knative Eventing → Bidirectional Alerts for the full catalog.

Backoff strategy

Respect Retry-After. Always.
Add jitter. After the Retry-After wait, sleep an additional random 0-500 ms before retrying.
Cap retries. Three attempts per request is plenty. A persistent 429 indicates a bug on the calling side, not bad luck.
Batch where possible. The CloudEvents batched-ingest endpoint accepts up to 256 events in one request; use it for bulk work instead of looping the single-event endpoint.
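The four rules above can be sketched in a few lines. This is an illustrative outline, not a reference client: `send_with_backoff` and `chunked` are hypothetical helpers, and `send` stands in for whatever function performs the HTTP request and returns the status code together with the Retry-After value in seconds.

```python
import random
import time
from typing import Callable, Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")

MAX_ATTEMPTS = 3    # "Three attempts per request is plenty."
MAX_JITTER_S = 0.5  # extra random sleep of 0-500 ms
MAX_BATCH = 256     # batched-ingest limit per request

# Hypothetical request function: returns (status_code, retry_after_seconds).
SendFn = Callable[[], Tuple[int, float]]


def send_with_backoff(
    send: SendFn,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[float, float], float] = random.uniform,
) -> int:
    """Call send() up to MAX_ATTEMPTS times and return the final status code.

    After each 429 (except the last attempt) it waits for Retry-After plus
    a random jitter before trying again.
    """
    status = 0
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt < MAX_ATTEMPTS:
            sleep(retry_after + jitter(0.0, MAX_JITTER_S))
    return status  # still 429: give up and surface the error to the caller


def chunked(events: Sequence[T], size: int = MAX_BATCH) -> Iterator[Sequence[T]]:
    """Split events into slices of at most `size` items for batched ingest."""
    for start in range(0, len(events), size):
        yield events[start:start + size]
```

Passing `sleep` and `jitter` as parameters keeps the helper easy to test without real delays. A caller doing bulk work would iterate over `chunked(events)` and wrap each batch request in `send_with_backoff`.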

Concurrency

Long-lived endpoints (/cloudevents/stream) use a concurrency semaphore, not a rate counter. Opening more concurrent streams than your plan permits returns 429 rate_limited immediately; existing streams are unaffected.
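A client can avoid tripping this limit by mirroring the semaphore locally. The following is a minimal sketch using Python's standard `threading.BoundedSemaphore`; `StreamSlots` is a hypothetical helper, and the slot count must be set by hand to match your plan (for example 5 on Pro).

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class StreamSlots:
    """Client-side guard that caps how many streams are open at once."""

    def __init__(self, max_streams: int) -> None:
        self._slots = threading.BoundedSemaphore(max_streams)

    @contextmanager
    def open(self, blocking: bool = True) -> Iterator[None]:
        # With blocking=False this fails fast instead of waiting for a slot.
        if not self._slots.acquire(blocking=blocking):
            raise RuntimeError("no free stream slot")
        try:
            yield  # the caller opens and consumes the stream here
        finally:
            self._slots.release()


slots = StreamSlots(max_streams=5)
with slots.open():
    pass  # connect to /cloudevents/stream and read events here
```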

When limits change

We announce cap changes in the console changelog. Caps only ever increase under a given tier; we do not tighten existing tiers. New capabilities may land on higher tiers exclusively.