Rate Limits

Rate limits protect the shared SaaS infrastructure. Self-hosted FOSS deployments have a single operator-tunable cap.

Every rate-limited response includes:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 37
X-RateLimit-Reset: 2026-04-24T17:43:00Z
Retry-After: 12

On a 429, Retry-After is the number of seconds to wait before sending the next request to the same endpoint. Clients must honour it; continuing to send requests through 429s will extend the cooldown further or trigger the abuse limiter.
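As a rough illustration of reading these headers on the client side, the sketch below parses the four values shown above. The `RateLimitInfo` type and `parse_rate_limit_headers` function are hypothetical names, not part of any yipyap SDK, and the code assumes the headers arrive in a plain mapping with exactly the casing shown.

```python
from datetime import datetime
from typing import Mapping, NamedTuple, Optional


class RateLimitInfo(NamedTuple):
    """Hypothetical container for the rate-limit headers."""
    limit: int
    remaining: int
    reset_at: datetime
    retry_after_s: Optional[int]


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    reset_raw = headers["X-RateLimit-Reset"].replace("Z", "+00:00")
    retry_raw = headers.get("Retry-After")
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=datetime.fromisoformat(reset_raw),
        retry_after_s=int(retry_raw) if retry_raw is not None else None,
    )


info = parse_rate_limit_headers({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "37",
    "X-RateLimit-Reset": "2026-04-24T17:43:00Z",
    "Retry-After": "12",
})
```

Most HTTP client libraries expose headers through a case-insensitive mapping, in which case the exact casing assumption can be dropped.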

The caps below are per-org, not per-key: they apply across all keys the org has issued.

| Endpoint family | Free | Pro | Enterprise |
|---|---|---|---|
| GET /monitors, /alerts, catalog reads | 60/min | 300/min | 1200/min |
| POST /events (inbound events) | 60/min | 600/min | 3000/min |
| POST /heartbeat/{monitorID} | 1/s per monitor | 1/s per monitor | 1/s per monitor |
| POST /cloudevents/ingest/{token} | not available | 300/min | 1200/min |
| GET /cloudevents/poll | not available | 30/min | 60/min |
| GET /cloudevents/stream | not available | 5 concurrent | 10 concurrent |
| Reply-audit & admin reads | 60/min | 300/min | 1200/min |
| Write endpoints (create/update/delete) | 60/min | 120/min | 600/min |

Knative Eventing endpoints are plan-gated: Free tier receives 402 plan_gate_blocked, not 429.
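Because a 402 and a 429 call for different reactions, a client may want to branch on the status code before deciding whether to retry. The sketch below is one possible mapping; the `Action` enum and `action_for_status` function are hypothetical and only cover the two statuses described here plus generic success and failure.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"
    RETRY_LATER = "retry_later"      # 429: wait for Retry-After, then retry
    UPGRADE_PLAN = "upgrade_plan"    # 402 plan_gate_blocked: retrying will not help
    FAIL = "fail"


def action_for_status(status: int) -> Action:
    """Map an HTTP status code to what the caller should do next."""
    if 200 <= status < 300:
        return Action.PROCEED
    if status == 429:
        return Action.RETRY_LATER
    if status == 402:
        return Action.UPGRADE_PLAN
    return Action.FAIL
```

Treating 402 as non-retryable keeps Free-tier clients from burning requests against an endpoint their plan cannot reach.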

On self-hosted FOSS deployments, a single limit applies (default 60/min, configurable via YIPYAP_RATE_LIMIT_RPM). Operators tuning for load should size this to the expected integration volume, not the number of users.
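One way to size the cap from integration volume is to add up the expected requests per minute of each integration and apply some headroom. The sketch below does that arithmetic; the integration names, their volumes, and the 1.5x headroom factor are all illustrative assumptions, not recommendations from yipyap.

```python
import math
from typing import Mapping


def suggested_rpm(integration_rpm: Mapping[str, float], headroom: float = 1.5) -> int:
    """Sum per-integration request rates and pad with headroom (assumed factor)."""
    return math.ceil(sum(integration_rpm.values()) * headroom)


# Hypothetical integrations and their expected requests per minute.
rpm = suggested_rpm({
    "ci-webhooks": 40,
    "log-shipper": 120,
    "cron-heartbeats": 20,
})
print(f"YIPYAP_RATE_LIMIT_RPM={rpm}")
```

The number of human users does not appear in the calculation at all, which is the point: request volume comes from integrations.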

Replies that flow back into yipyap (sinks acknowledging, escalating, routing, etc.) are capped separately by type:

| Reply type | Pro | Enterprise | FOSS |
|---|---|---|---|
| run.yipyap.reply.alert.claimed.v1 | 60/min | 120/min | 120/min |
| run.yipyap.reply.alert.acknowledged.v1 | 30/min | 60/min | 60/min |
| run.yipyap.reply.alert.suppressed.v1 | 30/min | 60/min | 60/min |
| run.yipyap.reply.alert.escalated.v1 | 10/min | 30/min | 30/min |
| run.yipyap.reply.alert.route.v1 | 10/min | 30/min | 30/min |
| run.yipyap.reply.monitor.deregister.v1 | 5/min | 15/min | 15/min |

See Knative Eventing → Bidirectional Alerts for the full catalog.

  1. Respect Retry-After. Always.
  2. Add jitter. After the Retry-After wait, sleep an additional random 0-500 ms before retrying.
  3. Cap retries. Three attempts per request is plenty. A persistent 429 indicates an upstream bug, not bad luck.
  4. Batch where possible. The CloudEvents batched-ingest endpoint accepts up to 256 events in one request; use it for bulk work instead of looping the single-event endpoint.
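The four rules above can be sketched in a few lines. This is a minimal illustration, not a reference client: `send_with_backoff` and `chunked` are hypothetical helpers, the `send` callable stands in for whatever HTTP call you make and is assumed to return a `(status, headers)` pair, and the 1-second fallback when Retry-After is missing is an assumption.

```python
import random
import time
from typing import Callable, Iterator, List, Mapping, Sequence, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)

MAX_ATTEMPTS = 3      # rule 3: cap retries
MAX_JITTER_S = 0.5    # rule 2: 0-500 ms of extra random delay
BATCH_SIZE = 256      # rule 4: batched-ingest limit per request


def send_with_backoff(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[float, float], float] = random.uniform,
) -> Response:
    """Call send() up to MAX_ATTEMPTS times, waiting out each 429."""
    response = send()
    for _ in range(MAX_ATTEMPTS - 1):
        status, headers = response
        if status != 429:
            break
        # Rule 1: respect Retry-After (fallback of 1 s is an assumption).
        wait_s = float(headers.get("Retry-After", "1"))
        sleep(wait_s + rng(0.0, MAX_JITTER_S))
        response = send()
    return response


def chunked(events: Sequence, size: int = BATCH_SIZE) -> Iterator[List]:
    """Split events into lists of at most `size` items for batched ingest."""
    for start in range(0, len(events), size):
        yield list(events[start:start + size])
```

If the final attempt still returns 429, the caller gets that response back unchanged and can surface it as an error rather than retrying indefinitely. The `sleep` and `rng` parameters exist only so the behaviour can be tested without real delays.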

Long-lived endpoints (/cloudevents/stream) use a concurrency semaphore, not a rate counter. Opening more concurrent streams than your plan permits returns 429 rate_limited immediately; existing streams are unaffected.
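A client can mirror this behaviour locally so it never asks for more streams than its plan allows. The sketch below uses a non-blocking semaphore; `StreamSlots` is a hypothetical helper, and the caller is expected to pass in its own plan's stream cap (for example 5 on Pro, per the table above).

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class StreamSlots:
    """Client-side guard that refuses to exceed a fixed number of open streams."""

    def __init__(self, max_streams: int) -> None:
        self._slots = threading.BoundedSemaphore(max_streams)

    @contextmanager
    def slot(self) -> Iterator[None]:
        # Non-blocking acquire: fail fast instead of queueing, the same way
        # the server rejects an extra stream immediately.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("stream cap reached; another stream would get 429")
        try:
            yield
        finally:
            self._slots.release()
```

Wrapping each stream's lifetime in `with slots.slot():` releases the slot when the stream closes, including when it closes because of an exception.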

We announce cap changes in the console changelog. Caps only ever increase under a given tier; we do not tighten existing tiers. New capabilities may land on higher tiers exclusively.