Rate Limits
Rate limits protect the shared SaaS infrastructure. Self-hosted FOSS deployments have a single operator-tunable cap.
Headers
Every rate-limited response includes:

    X-RateLimit-Limit: 60
    X-RateLimit-Remaining: 37
    X-RateLimit-Reset: 2026-04-24T17:43:00Z
    Retry-After: 12

On a 429, Retry-After is the number of seconds to wait before sending the next request to the same endpoint. Clients must honour it; hammering through 429s will lengthen the cooldown or trigger the abuse limiter.
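As an illustration of reading these headers on the client side, here is a minimal Python sketch. The `RateLimitInfo` type and `parse_rate_limit_headers` function are hypothetical names, not part of any yipyap SDK, and the sketch assumes the header values always follow the formats shown in the example above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitInfo:
    """Hypothetical container for the four headers listed above."""
    limit: Optional[int]
    remaining: Optional[int]
    reset_at: Optional[datetime]
    retry_after_s: Optional[int]


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Parse the rate-limit headers; any header that is absent becomes None."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        return int(value) if value is not None else None

    reset_raw = lowered.get("x-ratelimit-reset")
    reset_at = None
    if reset_raw is not None:
        # Assumption: the reset time is an ISO 8601 UTC timestamp ending in "Z",
        # as in the example. Swap "Z" for "+00:00" so older Pythons accept it.
        reset_at = datetime.fromisoformat(reset_raw.replace("Z", "+00:00"))

    return RateLimitInfo(
        limit=as_int("x-ratelimit-limit"),
        remaining=as_int("x-ratelimit-remaining"),
        reset_at=reset_at,
        retry_after_s=as_int("retry-after"),
    )


info = parse_rate_limit_headers({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "37",
    "X-RateLimit-Reset": "2026-04-24T17:43:00Z",
    "Retry-After": "12",
})
```

Checking `info.remaining` before a burst of requests lets a client slow down before it ever sees a 429.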
Per-plan caps (SaaS)
These caps are per-org, not per-key. They apply across all keys the org has issued.
| Endpoint family | Free | Pro | Enterprise |
|---|---|---|---|
| GET /monitors, /alerts, catalog reads | 60/min | 300/min | 1200/min |
| POST /events (inbound events) | 60/min | 600/min | 3000/min |
| POST /heartbeat/{monitorID} | 1/s per monitor | 1/s per monitor | 1/s per monitor |
| POST /cloudevents/ingest/{token} | not available | 300/min | 1200/min |
| GET /cloudevents/poll | not available | 30/min | 60/min |
| GET /cloudevents/stream | not available | 5 concurrent | 10 concurrent |
| Reply-audit & admin reads | 60/min | 300/min | 1200/min |
| Write endpoints (create/update/delete) | 60/min | 120/min | 600/min |
Knative Eventing endpoints are plan-gated: Free tier receives 402 plan_gate_blocked, not 429.
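Because a 402 and a 429 call for different reactions, a client may want to separate them explicitly. The sketch below is one hypothetical way to do that in Python. `Action` and `classify_response` are illustrative names, and the sketch looks only at the status code, since the shape of the error body is not described here.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    WAIT_AND_RETRY = "wait_and_retry"      # 429: honour Retry-After, then retry
    UPGRADE_REQUIRED = "upgrade_required"  # 402: plan gate, retrying cannot help
    FAIL = "fail"


def classify_response(status: int) -> Action:
    """Map an HTTP status code to what the caller should do next."""
    if 200 <= status < 300:
        return Action.PROCEED
    if status == 429:
        return Action.WAIT_AND_RETRY
    if status == 402:
        return Action.UPGRADE_REQUIRED
    return Action.FAIL


# A Free-tier call to a plan-gated endpoint should not enter the retry loop.
assert classify_response(402) is Action.UPGRADE_REQUIRED
assert classify_response(429) is Action.WAIT_AND_RETRY
```

Keeping 402 out of the retry path avoids spending request budget on calls that cannot succeed on the current plan.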
Per-plan caps (FOSS / self-hosted)
A single limit applies (default 60/min, configurable via YIPYAP_RATE_LIMIT_RPM). Operators tuning for load should raise it in line with the expected integration volume, not the number of users.
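One rough way to size the cap from integration volume is sketched below. The integration names, the per-minute peaks, and the 1.5x headroom factor are all made-up assumptions for illustration; only the variable name YIPYAP_RATE_LIMIT_RPM and the default of 60 come from this page.

```python
import math
from typing import Mapping

DEFAULT_RPM = 60  # documented default


def suggested_rpm(peak_requests_per_min: Mapping[str, int],
                  headroom: float = 1.5) -> int:
    """Sum each integration's peak request rate and add headroom.

    The headroom factor is an arbitrary illustrative choice, not a
    yipyap recommendation. The result never drops below the default.
    """
    total = sum(peak_requests_per_min.values())
    return max(DEFAULT_RPM, math.ceil(total * headroom))


# Hypothetical integrations and their peak request rates per minute.
peaks = {"ci-webhooks": 40, "heartbeat-fleet": 90, "dashboards": 20}
print(f"YIPYAP_RATE_LIMIT_RPM={suggested_rpm(peaks)}")
```

How the resulting value is supplied depends on how the deployment is configured.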
Reply-dispatch caps (Knative Eventing)
Replies that flow back into yipyap (sinks acknowledging, escalating, routing, etc.) are capped separately by type:
| Reply type | Pro | Enterprise | FOSS |
|---|---|---|---|
| run.yipyap.reply.alert.claimed.v1 | 60/min | 120/min | 120/min |
| run.yipyap.reply.alert.acknowledged.v1 | 30/min | 60/min | 60/min |
| run.yipyap.reply.alert.suppressed.v1 | 30/min | 60/min | 60/min |
| run.yipyap.reply.alert.escalated.v1 | 10/min | 30/min | 30/min |
| run.yipyap.reply.alert.route.v1 | 10/min | 30/min | 30/min |
| run.yipyap.reply.monitor.deregister.v1 | 5/min | 15/min | 15/min |
See Knative Eventing → Bidirectional Alerts for the full catalog.
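A sink that emits replies can keep itself under these caps with a small client-side guard. The following Python sketch uses a sliding 60-second window per reply type, with the Pro column of the table above as its limits. `ReplyLimiter` is a hypothetical helper, and the server's own windowing algorithm is not documented here, so treat this as a conservative local approximation rather than a mirror of server behaviour.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Mapping

# Pro-tier caps per minute, copied from the table above.
PRO_REPLY_CAPS_PER_MIN = {
    "run.yipyap.reply.alert.claimed.v1": 60,
    "run.yipyap.reply.alert.acknowledged.v1": 30,
    "run.yipyap.reply.alert.suppressed.v1": 30,
    "run.yipyap.reply.alert.escalated.v1": 10,
    "run.yipyap.reply.alert.route.v1": 10,
    "run.yipyap.reply.monitor.deregister.v1": 5,
}


class ReplyLimiter:
    """Sliding-window limiter keyed by reply type (client-side sketch)."""

    def __init__(self, caps_per_min: Mapping[str, int],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._caps = dict(caps_per_min)
        self._clock = clock
        self._sent: Dict[str, Deque[float]] = {t: deque() for t in self._caps}

    def try_acquire(self, reply_type: str) -> bool:
        """Return True and record a send if the type is under its cap."""
        cap = self._caps[reply_type]  # raises KeyError for an unknown type
        now = self._clock()
        window = self._sent[reply_type]
        # Forget sends that are at least 60 seconds old.
        while window and now - window[0] >= 60.0:
            window.popleft()
        if len(window) >= cap:
            return False
        window.append(now)
        return True


limiter = ReplyLimiter(PRO_REPLY_CAPS_PER_MIN)
if limiter.try_acquire("run.yipyap.reply.alert.escalated.v1"):
    pass  # safe to dispatch the reply here
```

Since the caps differ by type, a burst of claims does not eat into the much smaller budget for deregistrations.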
Backoff strategy
- Respect Retry-After. Always.
- Add jitter. After the Retry-After wait, sleep an additional random 0-500 ms before retrying.
- Cap retries. Three attempts per request is plenty. A persistent 429 indicates an upstream bug, not bad luck.
- Batch where possible. The CloudEvents batched-ingest endpoint accepts up to 256 events in one request; use it for bulk work instead of looping the single-event endpoint.
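The four rules above can be combined into a small retry helper. The Python sketch below is one possible shape, not a reference client. `send_with_backoff` and `chunked` are hypothetical names, the `send` callable stands in for whatever HTTP call you make and is assumed to return a `(status, headers)` pair with a `Retry-After` key, and the 1-second fallback when that header is missing is an assumption.

```python
import random
import time
from typing import Callable, Iterator, Mapping, Sequence, Tuple, TypeVar

T = TypeVar("T")
Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)

MAX_ATTEMPTS = 3    # "Three attempts per request is plenty."
MAX_JITTER_S = 0.5  # extra random sleep of 0-500 ms
BATCH_SIZE = 256    # documented batched-ingest limit


def send_with_backoff(send: Callable[[], Response],
                      sleep: Callable[[float], None] = time.sleep,
                      jitter: Callable[[float, float], float] = random.uniform,
                      ) -> Response:
    """Call send(); on a 429, wait Retry-After plus jitter and try again.

    Gives up after MAX_ATTEMPTS and returns the last response, so a
    persistent 429 is surfaced to the caller instead of being retried forever.
    """
    status, headers = send()
    for _ in range(MAX_ATTEMPTS - 1):
        if status != 429:
            break
        # Assumption: fall back to 1 second if Retry-After is missing.
        retry_after = float(headers.get("Retry-After", "1"))
        sleep(retry_after + jitter(0.0, MAX_JITTER_S))
        status, headers = send()
    return status, headers


def chunked(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Split items into slices of at most `size` for batched ingest."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For bulk work, wrap one batched request per chunk in `send_with_backoff` rather than one request per event. The `sleep` and `jitter` parameters exist only so the helper can be tested without real delays.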
Concurrency
Long-lived endpoints (/cloudevents/stream) use a concurrency semaphore, not a rate counter. Opening more concurrent streams than your plan permits returns 429 rate_limited immediately; existing streams are unaffected.
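A client can mirror this behaviour locally so that it never asks for a stream it cannot get. The sketch below uses Python's `threading.BoundedSemaphore` with a non-blocking acquire. `StreamSlots` is a hypothetical helper, and the value 5 is the Pro-tier cap from the table earlier on this page. Because the cap is per-org, a guard like this only helps when one process owns all of the org's streams.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class StreamSlots:
    """Fail-fast local guard for concurrent /cloudevents/stream connections."""

    def __init__(self, max_streams: int) -> None:
        self._slots = threading.BoundedSemaphore(max_streams)

    @contextmanager
    def slot(self) -> Iterator[None]:
        # Non-blocking acquire: refuse immediately instead of queueing,
        # which is how the server treats an over-cap stream request.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("stream cap reached; close a stream first")
        try:
            yield
        finally:
            # Free the slot when the stream ends, even if it ended in an error.
            self._slots.release()


slots = StreamSlots(max_streams=5)  # Pro tier: 5 concurrent streams

with slots.slot():
    pass  # open and consume the stream inside this block
```

Existing holders are unaffected when a sixth caller is refused, which matches the server-side description above.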
When limits change
We announce cap changes in the console changelog. Caps only ever increase under a given tier; we do not tighten existing tiers. New capabilities may land on higher tiers exclusively.