Budgets and Circuit Breakers
cf-monitor prevents runaway costs with three layers of protection: per-invocation limits, daily/monthly budgets, and circuit breakers.
Per-invocation limits (Layer 1)
The first line of defence. These limits are enforced synchronously — the moment a binding operation exceeds the limit, a RequestBudgetExceededError is thrown. No waiting for a cron, no eventual consistency. The runaway loop stops on the first request.
Default limits
| Metric | Default limit | What it protects |
|---|---|---|
| d1Writes | 1,000 | Prevents infinite INSERT loops |
| d1Reads | 5,000 | Prevents unbounded SELECT scans |
| kvWrites | 200 | KV writes cost 10x reads ($5/M) |
| kvReads | 1,000 | Prevents KV read floods |
| aiRequests | 50 | AI calls are expensive |
| r2ClassA | 100 | R2 mutations (put, delete) |
| queueMessages | 500 | Prevents message storms |
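To make the synchronous enforcement concrete, here is a minimal sketch of a per-invocation counter. It is not cf-monitor's implementation: `InvocationBudget` and its `record` method are hypothetical names, the error class is a local stand-in for the library's RequestBudgetExceededError, and it assumes that "exceeds" means the count goes strictly above the limit.

```typescript
// Local stand-in for the library's error class (its real shape is not shown here).
class RequestBudgetExceededError extends Error {
  readonly metric: string;
  readonly limit: number;

  constructor(metric: string, limit: number) {
    super(`${metric} exceeded per-invocation limit of ${limit}`);
    this.name = 'RequestBudgetExceededError';
    this.metric = metric;
    this.limit = limit;
  }
}

type Limits = Record<string, number>;

// Default limits copied from the table above.
const DEFAULT_LIMITS: Limits = {
  d1Writes: 1000,
  d1Reads: 5000,
  kvWrites: 200,
  kvReads: 1000,
  aiRequests: 50,
  r2ClassA: 100,
  queueMessages: 500,
};

// Hypothetical counter: create one instance per request.
class InvocationBudget {
  private readonly counts = new Map<string, number>();
  private readonly limits: Limits;

  constructor(limits: Limits = DEFAULT_LIMITS) {
    this.limits = limits;
  }

  // Call once per binding operation. Throws as soon as the count passes the limit,
  // so a runaway loop stops inside the request that caused it.
  record(metric: string): void {
    const next = (this.counts.get(metric) ?? 0) + 1;
    this.counts.set(metric, next);
    const limit = this.limits[metric];
    if (limit !== undefined && next > limit) {
      throw new RequestBudgetExceededError(metric, limit);
    }
  }
}
```

Because the check runs inline with each operation, no cron or KV read is involved in this layer.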
Custom limits
import { monitor } from '@littlebearapps/cf-monitor';

export default monitor({
  limits: {
    d1Writes: 500, // Tighter than default
    aiRequests: 10, // Very conservative for AI
  },
  fetch: handler,
});
Handling the error
When a limit is exceeded, RequestBudgetExceededError is thrown. By default, monitor() catches it and returns a 500 response. You can customise this with onError:
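// Assumes the error class is exported from the same package as monitor().
import { monitor, RequestBudgetExceededError } from '@littlebearapps/cf-monitor';

monitor({
  limits: { d1Writes: 500 },
  onError: (error, handler) => {
    if (error instanceof RequestBudgetExceededError) {
      return new Response('Operation too large', { status: 429 });
    }
  },
  fetch: handler,
});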
Daily budgets (Layer 2)
The hourly cron (0 * * * *) checks accumulated daily usage against configured budget limits.
How it works
- Each monitor() invocation accumulates metrics in KV (budget:usage:daily:{feature}:{date})
- The hourly budget-check cron reads these counters and compares against limits
- Alerts fire at configurable thresholds:
| Threshold | Action |
|---|---|
| 70% | Slack warning (deduplicated for 1 hour) |
| 90% | Slack critical warning (deduplicated for 1 hour) |
| 100% | Circuit breaker trips — feature returns 503 until TTL expires |
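The threshold table can be read as a simple classification of usage against a limit. The sketch below is illustrative only: `budgetAction` is a hypothetical name, the thresholds are hard-coded to the values in the table, and it assumes each threshold is inclusive (exactly 70% already warns).

```typescript
type BudgetAction = 'ok' | 'warning' | 'critical' | 'trip';

// Maps accumulated usage to the action from the table above.
function budgetAction(used: number, limit: number): BudgetAction {
  if (limit <= 0) return 'ok'; // assumption: a non-positive limit means "no budget set"
  const ratio = used / limit;
  if (ratio >= 1.0) return 'trip'; // circuit breaker trips
  if (ratio >= 0.9) return 'critical'; // Slack critical warning
  if (ratio >= 0.7) return 'warning'; // Slack warning
  return 'ok';
}
```

With a daily d1_writes budget of 50,000, this would warn at 35,000 writes, escalate at 45,000, and trip at 50,000.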
Configuration
Set budgets in cf-monitor.yaml:
budgets:
  daily:
    d1_writes: 50000
    kv_writes: 10000
Then push the config to KV:
npx cf-monitor config sync
Auto-seeding (plan-aware)
If not configured, cf-monitor auto-seeds defaults based on your detected CF plan:
- Workers Paid: ~80% of monthly included / 30 days (e.g. d1_writes: 1,333,333/day)
- Workers Free: Much lower limits (e.g. d1_writes: 10,000/day)
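The Workers Paid example can be checked with a little arithmetic. In this sketch, `autoSeedDailyBudget` is a hypothetical helper, and the monthly included amount of 50,000,000 D1 writes is inferred from the example figure (1,333,333 × 30 / 0.8), not taken from cf-monitor's source.

```typescript
// Daily budget = fraction of the monthly included amount, spread over a fixed number of days.
function autoSeedDailyBudget(monthlyIncluded: number, fraction = 0.8, days = 30): number {
  return Math.floor((monthlyIncluded * fraction) / days);
}

// Assumed monthly included D1 writes, inferred from the example above.
const paidD1WritesPerDay = autoSeedDailyBudget(50000000);
console.log(paidD1WritesPerDay); // 1333333
```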
Plan detection uses the CF Subscriptions API. If your token lacks Account Settings: Read permission, it defaults to Paid plan limits (safe, conservative).
The auto-seeding runs during the first hourly budget check when no budget:config:* keys exist in KV. It discovers active features from usage data, writes per-feature configs with 25-hour TTL, and creates an __account__ fallback that applies to any feature without its own config. A seed flag (24-hour TTL) prevents re-seeding every hour.
If you run npx cf-monitor config sync with your own budgets, they take permanent precedence over auto-seeded defaults.
Monthly budgets (Layer 2b)
Monthly budgets work identically to daily but use a budget:usage:monthly:{feature}:{key} counter and budget:config:monthly:{feature} KV keys. Monthly alerts are deduplicated for 24 hours.
Billing period alignment
Monthly budgets track usage against your actual CF billing period (e.g. 2nd to 2nd), not calendar months. This prevents the ~2 day misalignment at period boundaries that could cause under- or over-counting.
The billing period is automatically detected from the CF Subscriptions API and cached in KV for 32 days. Monthly KV keys use the billing period start date (YYYY-MM-DD format, e.g. 2026-03-02) instead of calendar month (YYYY-MM).
If billing period detection is unavailable (token lacks permissions), monthly budgets fall back to calendar month boundaries (previous behaviour). During the transition from v0.2.x, both key formats are checked and summed — no data is lost.
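One way to picture the two key formats is a helper that derives the key suffix from the current date and an optional billing day. This is a sketch under assumptions: `monthlyKeySuffix` is a hypothetical name, dates are treated as UTC, and the billing day is assumed to be between 1 and 28 so it exists in every month.

```typescript
function pad2(n: number): string {
  return String(n).padStart(2, '0');
}

// Returns the billing period start (YYYY-MM-DD) when the billing day is known,
// otherwise falls back to the calendar month (YYYY-MM).
function monthlyKeySuffix(now: Date, billingDay?: number): string {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth(); // 0-based
  if (billingDay === undefined) {
    return `${year}-${pad2(month + 1)}`;
  }
  // Before the billing day, the current period started in the previous month.
  const startMonth = now.getUTCDate() >= billingDay ? month : month - 1;
  // Date.UTC normalises month -1 to December of the previous year.
  const start = new Date(Date.UTC(year, startMonth, billingDay));
  return start.toISOString().slice(0, 10);
}
```

For a billing period that runs from the 2nd to the 2nd, 15 March 2026 maps to 2026-03-02, while 1 March 2026 still belongs to the period that started on 2026-02-02.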
Circuit breakers (Layer 3)
Circuit breakers (CBs) are the “big red button”. When a budget is exceeded, the feature’s CB trips and all subsequent requests return 503 until the TTL expires.
Three levels
| Level | KV Key | Scope | Use case |
|---|---|---|---|
| Feature | cb:v1:feature:{featureId} | Single feature/route | Budget exceeded for one endpoint |
| Account | cb:v1:account | Entire account | Account-wide emergency |
| Global | cb:v1:global | Everything | Last resort kill switch |
Check order: global > account > feature. If global is tripped, nothing runs.
CB states
| Value | Meaning |
|---|---|
| STOP | Feature is blocked; requests return 503 |
| GO | Feature is explicitly allowed (reset with short TTL) |
| Not set | Feature is allowed (normal state) |
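The check order and states can be combined into a small lookup. The sketch below uses a plain `Map` in place of KV and a hypothetical `blockedLevel` function. It assumes that all three levels store STOP when tripped, which the tables above confirm only for feature-level breakers.

```typescript
type CbLevel = 'global' | 'account' | 'feature';

// Returns the first tripped level in check order (global, account, feature),
// or null when the request may proceed.
function blockedLevel(store: Map<string, string>, featureId: string): CbLevel | null {
  const checks: Array<[CbLevel, string]> = [
    ['global', 'cb:v1:global'],
    ['account', 'cb:v1:account'],
    ['feature', `cb:v1:feature:${featureId}`],
  ];
  for (const [level, key] of checks) {
    // GO and "not set" both mean allowed; only STOP blocks.
    if (store.get(key) === 'STOP') return level;
  }
  return null;
}
```

Checking global first means a tripped global breaker short-circuits before any account or feature key is consulted.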
Auto-reset
Circuit breakers reset automatically when their KV TTL expires (default: 1 hour). This prevents a temporary spike from permanently disabling a feature.
Fast propagation
When a CB is reset, cf-monitor writes 'GO' with a 60-second TTL instead of deleting the key. This forces KV cache invalidation across Cloudflare’s edge network, which is faster than waiting for a delete to propagate (up to 60 seconds of eventual consistency).
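As an illustration of the reset behaviour, the sketch below writes GO with a 60-second TTL rather than deleting the key. `resetFeatureBreaker` and the `KvLike` interface are hypothetical; the interface mirrors the shape of the Workers KV `put(key, value, { expirationTtl })` call so a real namespace binding could be passed in.

```typescript
// Minimal subset of a KV namespace needed for the reset.
interface KvLike {
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Overwrites the breaker with GO instead of deleting it; the key then expires on its own.
async function resetFeatureBreaker(kv: KvLike, featureId: string): Promise<void> {
  await kv.put(`cb:v1:feature:${featureId}`, 'GO', { expirationTtl: 60 });
}
```

After the 60 seconds elapse the key disappears, which is the "Not set" state from the table above.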
Custom CB response
monitor({
  onCircuitBreaker: (err) => {
    // err.featureId: which feature was blocked
    // err.level: 'feature', 'account', or 'global'
    // err.reason: why it was tripped
    return new Response('Service temporarily unavailable', { status: 503 });
  },
  fetch: handler,
});
Cost spike detection (Layer 4)
The 15-minute cron (*/15 * * * *) compares current hourly costs against a 24-hour baseline. If any metric exceeds the configured threshold (default: 200%), a Slack alert is sent.
This catches anomalies that fall within budget limits but are still unusual — like a worker suddenly doing 10x more D1 reads than normal.
Configure the threshold in cf-monitor.yaml:
monitoring:
  spike_threshold: 2.0 # 200% of baseline (default)
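The comparison itself is a one-liner. In this sketch, `isCostSpike` is a hypothetical name, the baseline is assumed to be an hourly figure derived from the previous 24 hours, and a zero baseline is assumed never to alert (how cf-monitor handles that case is not documented here).

```typescript
// True when the current hourly value is above threshold × baseline.
function isCostSpike(currentHourly: number, baselineHourly: number, threshold = 2.0): boolean {
  if (baselineHourly <= 0) return false; // assumption: nothing to compare against
  return currentHourly > baselineHourly * threshold;
}
```

A worker that normally does 1,000 D1 reads per hour and suddenly does 10,000 would alert at the default threshold even if its daily budget is nowhere near exhausted.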
Synthetic health checks (Validation layer)
Every hour, cf-monitor runs a synthetic health check that validates the entire CB pipeline:
- Trip a test circuit breaker (platform:test:synthetic-cb)
- Verify it blocks (reads STOP)
- Reset the circuit breaker
- Verify it passes (reads GO or null)
If any step fails, the CB pipeline is broken, and you find out before a real budget event depends on it.
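The four steps can be sketched against a simple synchronous store. This is not the actual health check: `runSyntheticCheck` and the `Store` type are hypothetical, the key is assumed to follow the cb:v1:feature:{featureId} pattern, and real KV access would be asynchronous and eventually consistent.

```typescript
type Store = {
  get(key: string): string | null;
  put(key: string, value: string): void;
};

// Assumed key: the feature-level pattern applied to the synthetic feature ID.
const SYNTHETIC_KEY = 'cb:v1:feature:platform:test:synthetic-cb';

function runSyntheticCheck(store: Store): { ok: boolean; failedStep?: string } {
  // 1. Trip the test breaker.
  store.put(SYNTHETIC_KEY, 'STOP');
  // 2. Verify it blocks.
  if (store.get(SYNTHETIC_KEY) !== 'STOP') {
    return { ok: false, failedStep: 'verify-blocked' };
  }
  // 3. Reset the breaker.
  store.put(SYNTHETIC_KEY, 'GO');
  // 4. Verify it passes (GO or not set).
  const after = store.get(SYNTHETIC_KEY);
  if (after !== 'GO' && after !== null) {
    return { ok: false, failedStep: 'verify-passes' };
  }
  return { ok: true };
}
```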
Admin endpoints
For testing and emergency control:
| Endpoint | Purpose |
|---|---|
| POST /admin/cb/trip | Trip a feature CB: { "featureId": "...", "ttlSeconds": 300 } |
| POST /admin/cb/reset | Reset a feature CB: { "featureId": "..." } |
| POST /admin/cb/account | Set account CB: { "status": "paused" } or { "status": "clear" } |
| POST /admin/cron/budget-check | Manually trigger budget enforcement |
| POST /admin/cron/cost-spike | Manually trigger cost spike detection |
| POST /admin/cron/synthetic-health | Manually trigger CB health check |
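As a usage sketch, the helper below builds the request for the trip endpoint from the table. `tripFeatureRequest` and the base URL are hypothetical, and any authentication the admin endpoints require is omitted because it is not described here.

```typescript
interface AdminCall {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Builds a POST /admin/cb/trip request with the body shape shown in the table.
function tripFeatureRequest(baseUrl: string, featureId: string, ttlSeconds: number): AdminCall {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/admin/cb/trip`,
    init: {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ featureId, ttlSeconds }),
    },
  };
}

// Hypothetical host; pass the result to fetch(call.url, call.init) to send it.
const call = tripFeatureRequest('https://monitor.example.com', 'checkout', 300);
```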