Troubleshooting
Common issues and their solutions when using cf-monitor.
Monitor worker not receiving tail events
Symptoms: No errors appearing in GET /errors, no fingerprints in KV.
Causes and fixes:
- Missing tail_consumers: check that your worker’s wrangler config includes "tail_consumers": [{ "service": "cf-monitor" }]. Run npx cf-monitor wire to verify.
- Propagation delay: after you deploy a new worker or change tail_consumers, Cloudflare takes 30-60 seconds to activate the tail binding. Wait a minute and test again.
- Monitor worker not deployed: run npx cf-monitor status to check whether the monitor worker is healthy.
- Worker name mismatch: tail_consumers references the monitor worker by name. Ensure it matches the name field in the monitor worker’s wrangler config (default: cf-monitor).
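
The first and last checks above can be approximated in code. The following is a minimal sketch, assuming the wrangler config has already been parsed into an object; `hasMonitorTailConsumer` and the two interfaces are hypothetical names for illustration and are not part of cf-monitor.

```typescript
// Hypothetical helper, not part of cf-monitor: checks a parsed wrangler
// config for a tail consumer that points at the monitor worker.
interface TailConsumer {
  service: string;
}

interface WranglerConfig {
  name?: string;
  tail_consumers?: TailConsumer[];
}

function hasMonitorTailConsumer(
  config: WranglerConfig,
  monitorName: string = "cf-monitor", // default monitor worker name, per the list above
): boolean {
  const consumers = config.tail_consumers ?? [];
  // The service value must match the monitor worker's name exactly.
  return consumers.some((consumer) => consumer.service === monitorName);
}

const wiredConfig: WranglerConfig = {
  name: "my-worker",
  tail_consumers: [{ service: "cf-monitor" }],
};
const mismatchedConfig: WranglerConfig = {
  name: "my-worker",
  tail_consumers: [{ service: "cf-monitor-prod" }],
};

console.log(hasMonitorTailConsumer(wiredConfig)); // true
console.log(hasMonitorTailConsumer(mismatchedConfig)); // false
console.log(hasMonitorTailConsumer({ name: "my-worker" })); // false
```

A mismatch like the second case is silent at deploy time, which is why comparing the two names directly is worth doing before waiting out the propagation delay.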
No metrics in Analytics Engine
Symptoms: npx cf-monitor status works but AE SQL queries return no data.
Causes and fixes:
- AE write propagation: Analytics Engine writes take 30-90 seconds to become queryable. This is a platform limitation, not a bug.
- Missing CF_MONITOR_AE binding: check that your worker’s wrangler config includes the analytics_engine_datasets binding. The binding name must be CF_MONITOR_AE.
- No traffic: AE data is only written when your worker handles requests. Hit your worker and wait 60 seconds.
- Zero metrics: if all binding operations return zero (e.g. no D1 calls), the SDK skips the AE write to save cost. This is by design.
Circuit breaker won’t reset
Symptoms: Worker returns 503 even after waiting for TTL to expire.
Causes and fixes:
- KV edge propagation: KV TTL expiration can take up to 60 seconds to propagate across Cloudflare’s edge. Wait a full minute after the expected expiry.
- Manual reset: force a reset via the admin endpoint:

      curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cb/reset \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"featureId": "your-feature-id"}'

- Monthly budget also tripped: daily budgets reset via TTL, but if the monthly budget is also exceeded, the CB will be re-tripped on the next hourly check. Increase the monthly budget or wait for the month to roll over.
- Account-level CB: check whether the account CB is active:

      curl https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/status

  Clear it with:

      curl -X POST .../admin/cb/account \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"status":"clear"}'

CLI init fails
Symptoms: npx cf-monitor init errors out.
Causes and fixes:
- Missing API token: set CLOUDFLARE_API_TOKEN in your environment or pass --api-token.
- Wrong account ID: the account ID is a 32-character hex string. Find it in the Cloudflare dashboard under Account Home > Account ID (right sidebar).
- Insufficient permissions: the API token needs Workers KV Storage (Edit), Account Analytics (Read), and Workers Scripts (Edit).
- Network issues: the CLI makes API calls to api.cloudflare.com. Ensure you’re not behind a proxy that blocks these.
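
A quick format check can rule out the wrong-account-ID cause before running init again. This is a minimal sketch based only on the "32-character hex string" description above; `looksLikeAccountId` is a hypothetical helper, and accepting upper-case hex digits is an assumption.

```typescript
// 32 hexadecimal characters, per the description above.
// Assumption: upper-case digits are accepted as well as lower-case.
const ACCOUNT_ID_PATTERN = /^[0-9a-f]{32}$/i;

// Hypothetical helper: checks the shape of the ID only, not that the account exists.
function looksLikeAccountId(value: string): boolean {
  return ACCOUNT_ID_PATTERN.test(value.trim());
}

console.log(looksLikeAccountId("0123456789abcdef0123456789abcdef")); // true
console.log(looksLikeAccountId("my-account-name")); // false
console.log(looksLikeAccountId("0123456789abcdef")); // false (only 16 characters)
```

A value that passes this check can still belong to a different account, so it narrows the problem rather than confirming the ID is correct.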
Worker name shows as ‘worker’
Symptoms: feature IDs all start with worker: instead of your actual worker name.
Causes and fixes:
- WORKER_NAME not set: run npx cf-monitor wire --apply to automatically inject WORKER_NAME from your wrangler config’s name field.
- Manual fix: add to your wrangler config:

      { "vars": { "WORKER_NAME": "my-worker-name" } }

- SDK override: set workerName in the monitor config:

      monitor({ workerName: 'my-worker', fetch: handler });

Detection chain: config.workerName > env.WORKER_NAME > env.name > 'worker'
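
The detection chain above reads as a simple first-match fallback. The sketch below illustrates that order; it is not cf-monitor's actual implementation. `resolveWorkerName` and both interfaces are hypothetical, and treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface MonitorConfig {
  workerName?: string;
}

interface WorkerEnv {
  WORKER_NAME?: string;
  name?: string;
}

// First non-empty value wins, in the order given by the detection chain.
// Assumption: an empty string counts as "not set".
function resolveWorkerName(config: MonitorConfig, env: WorkerEnv): string {
  return config.workerName || env.WORKER_NAME || env.name || "worker";
}

console.log(resolveWorkerName({ workerName: "api" }, { WORKER_NAME: "from-env" })); // "api"
console.log(resolveWorkerName({}, { WORKER_NAME: "from-env", name: "fallback" })); // "from-env"
console.log(resolveWorkerName({}, {})); // "worker"
```

The last case is the one that produces feature IDs starting with worker: when none of the three sources is set.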
GitHub issues not being created
Symptoms: Errors are captured (visible in GET /errors) but no GitHub issues are created in the repo.
Causes and fixes:
- Missing GITHUB_TOKEN secret: set it via npx cf-monitor secret set GITHUB_TOKEN. This must be a GitHub PAT with repo or issues:write scope.
- Missing GITHUB_REPO var or config: check that either:
  - GITHUB_REPO is set in .cf-monitor/wrangler.jsonc vars, or
  - CF_MONITOR_CONFIG is set (automatically embedded since v0.3.6 when --github-repo is passed to init or cf-monitor.yaml has github.repo configured).
- cf-monitor.yaml not re-embedded: if you added github.repo to cf-monitor.yaml after the initial deploy, run npx cf-monitor deploy to re-embed the config.
- Rate limited: cf-monitor limits issue creation to 10 issues per script per hour. Check GET /errors for rate limit entries.
- Deduplication: if the same error fingerprint already has a GitHub issue, cf-monitor won’t create a duplicate. Check the KV key err:fp:{fingerprint}.
Verify: Run npx cf-monitor status — the response shows whether GitHub is configured.
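
To see why a burst of errors can stop producing issues, it may help to picture the "10 issues per script per hour" limit as a counter per script and hour. The sketch below uses a fixed clock-hour window and an in-memory map. Both are assumptions for illustration: the source does not say how cf-monitor windows the limit or where it stores the counters, and `IssueRateLimiter` is a hypothetical name.

```typescript
const MAX_ISSUES_PER_HOUR = 10; // limit stated in the list above
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical fixed-window limiter keyed by script name and clock hour.
class IssueRateLimiter {
  private counts = new Map<string, number>();

  // Returns true if another issue may be created for this script right now.
  tryAcquire(scriptName: string, nowMs: number): boolean {
    const windowIndex = Math.floor(nowMs / HOUR_MS);
    const key = `${scriptName}:${windowIndex}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= MAX_ISSUES_PER_HOUR) {
      return false; // over the limit for this window
    }
    this.counts.set(key, used + 1);
    return true;
  }
}

const limiter = new IssueRateLimiter();
let allowed = 0;
for (let i = 0; i < 12; i++) {
  if (limiter.tryAcquire("my-worker", 0)) allowed++;
}
console.log(allowed); // 10
console.log(limiter.tryAcquire("my-worker", HOUR_MS)); // true (next window)
```

Under this model, errors beyond the tenth in a window are still captured but produce no issue until the window rolls over.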
Budget enforcement not working
Symptoms: usage accumulates in KV (budget:usage:daily:* keys) but no circuit breakers trip and no Slack warnings appear.
Causes and fixes:
- No budget config keys: check KV for budget:config:* keys. If there are none, the hourly budget-check cron will auto-seed defaults from PAID_PLAN_DAILY_BUDGETS on the next run. Trigger it manually:

      curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cron/budget-check \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

- Config-sync not run: if you set custom budgets in cf-monitor.yaml, push them to KV:

      npx cf-monitor config sync

- Seed flag active: auto-seeding is prevented for 24 hours after the last seed (to avoid hourly KV writes). If you need to re-seed immediately, delete the flag:

      wrangler kv key delete "budget:config:__seeded__" --namespace-id YOUR_KV_NAMESPACE_ID

- __account__ fallback: even without per-feature configs, the __account__ config applies to all features. If this is missing too, auto-seeding failed. Check wrangler tail cf-monitor for errors.
Budget warnings not appearing in Slack
Symptoms: budgets are being exceeded (CB trips visible) but no Slack messages.
Causes and fixes:
- SLACK_WEBHOOK_URL not set: run npx cf-monitor secret set SLACK_WEBHOOK_URL and paste your Slack incoming webhook URL.
- Deduplication: budget warnings are deduplicated for 1 hour (daily) or 24 hours (monthly). If you just resolved the issue and it triggered again, the alert may be suppressed.
- Test the payload: verify Slack payload formatting:

      curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/test/slack-dry-run \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"type":"budget-warning","featureId":"test","metric":"kv_reads","current":900,"limit":1000}'

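
The deduplication windows above explain most "missing" alerts. The following is a minimal sketch of the suppression check, assuming the time of the last sent warning is tracked per budget; `isSuppressed` and `BudgetPeriod` are hypothetical names, and only the two window lengths come from the text above.

```typescript
type BudgetPeriod = "daily" | "monthly";

// Window lengths taken from the list above: 1 hour (daily), 24 hours (monthly).
const DEDUP_WINDOW_MS: Record<BudgetPeriod, number> = {
  daily: 60 * 60 * 1000,
  monthly: 24 * 60 * 60 * 1000,
};

// Hypothetical check: a warning is suppressed while the previous one for the
// same budget is still inside its deduplication window.
function isSuppressed(
  period: BudgetPeriod,
  lastSentAtMs: number | undefined,
  nowMs: number,
): boolean {
  if (lastSentAtMs === undefined) {
    return false; // nothing sent yet, so nothing to deduplicate against
  }
  return nowMs - lastSentAtMs < DEDUP_WINDOW_MS[period];
}

const thirtyMinutes = 30 * 60 * 1000;
const twoHours = 2 * 60 * 60 * 1000;

console.log(isSuppressed("daily", 0, thirtyMinutes)); // true
console.log(isSuppressed("daily", 0, twoHours)); // false
console.log(isSuppressed("monthly", 0, twoHours)); // true
```

If a warning was sent shortly before the budget was exceeded again, waiting out the window is usually enough before looking at the webhook itself.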
GitHub issues not being created
Symptoms: errors are captured (fingerprints in KV) but no GitHub issues appear.
Causes and fixes:
- GITHUB_REPO or GITHUB_TOKEN not set: both are required. Run:

      npx cf-monitor secret set GITHUB_TOKEN

  And ensure github.repo is set in cf-monitor.yaml.
- Token permissions: the token needs the repo scope (classic PAT) or the issues: write permission (fine-grained PAT).
- Rate limit: max 10 issues per script per hour. If you’ve triggered many errors quickly, wait for the rate window to pass.
- Test the format: use the dry-run endpoint to see what would be created:

      curl -X POST .../admin/test/github-dry-run \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"scriptName":"my-worker","outcome":"exception","errorMessage":"test error"}'

Feature IDs are wrong or unexpected
Symptoms: budget keys and AE data use unexpected feature IDs.
Causes and fixes:
- Path normalisation: cf-monitor strips numeric segments (/users/123 becomes users) and UUIDs, and limits paths to 2 segments. This is intentional to prevent feature ID explosion.
- Explicit control: use the features map for routes that need specific IDs:

      monitor({
        features: {
          'POST /api/scan': 'scanner:social',
          'GET /api/users/:id': 'api:users',
        },
        fetch: handler,
      });

- Single bucket: for simple workers, use featureId to put everything in one budget:

      monitor({ featureId: 'my-worker:all', fetch: handler });

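
The path normalisation rule above can be sketched as follows, which may help predict which feature ID a given route lands in. This illustrates the described behaviour and is not cf-monitor's actual code. `normalisePath` is a hypothetical name, and two details are assumptions: dynamic segments are stripped before the 2-segment limit is applied, and the remaining segments are joined with "/".

```typescript
const NUMERIC_SEGMENT = /^\d+$/;
const UUID_SEGMENT =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical sketch: drop numeric and UUID segments, then keep at most
// `maxSegments` of what remains.
function normalisePath(pathname: string, maxSegments: number = 2): string {
  return pathname
    .split("/")
    .filter((segment) => segment.length > 0)
    .filter(
      (segment) =>
        !NUMERIC_SEGMENT.test(segment) && !UUID_SEGMENT.test(segment),
    )
    .slice(0, maxSegments)
    .join("/");
}

console.log(normalisePath("/users/123")); // "users"
console.log(
  normalisePath("/api/users/550e8400-e29b-41d4-a716-446655440000/posts"),
); // "api/users"
console.log(normalisePath("/api/v1/orders/42/items")); // "api/v1"
```

The third case shows why deeply nested routes can collapse into one bucket; the features map is the way to separate them.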
Usage data shows “No usage data collected yet”
Symptoms: npx cf-monitor usage or GET /usage returns no data.
Causes and fixes:
- First cron hasn’t run: account usage is collected hourly on the 0 * * * * schedule. Wait for the next hour, or trigger it manually:

      curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cron/collect-account-usage \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

- Missing CLOUDFLARE_API_TOKEN: the same API token used for worker discovery is used for GraphQL queries. Ensure it’s set as a secret on the cf-monitor worker.
- No services in use: if your account has zero D1, KV, R2, etc. activity in the last 24 hours, the usage snapshot will show empty services. This is correct behaviour.
- GraphQL API unavailable: the CF GraphQL Analytics API occasionally returns errors. Check the monitor worker’s logs for [cf-monitor:usage] messages.
Plan shows as “paid” when account is actually free
Symptoms: GET /status or npx cf-monitor status shows plan: "paid" on a Workers Free account.
Causes and fixes:
- Token lacks billing permission: plan detection requires the Account Settings: Read permission (#billing:read) on your API token. Without it, cf-monitor conservatively defaults to “paid” (which means higher budget limits: safe, but less protective for free accounts). Add the permission to your token for accurate detection.
- Cached result: the detected plan is cached in KV for 24 hours. If you recently upgraded or downgraded your plan, wait for the cache to expire or delete the config:plan KV key manually.
Debug endpoints
These endpoints are always available on the monitor worker for troubleshooting:
| Endpoint | What it tells you |
|---|---|
| GET /_health | Is the monitor worker running? |
| GET /status | Account health, plan, billing period, CB states, GitHub/Slack config |
| GET /errors | Recent error fingerprints and their GitHub issue URLs |
| GET /budgets | Active circuit breakers, billing period |
| GET /workers | Which workers have been discovered on the account |
| GET /plan | Detected plan type, billing period, days remaining, plan allowances |
| GET /usage | Account-wide per-service usage from CF GraphQL (approximate) |
| GET /self-health | Self-monitoring: stale crons, error counts, handler breakdown |
Admin endpoints returning 401
Symptoms: All POST /admin/* requests return {"error":"Unauthorized"}.
Causes and fixes:
- ADMIN_TOKEN not set: set the secret on the cf-monitor worker:

      openssl rand -hex 32   # Generate a token
      npx cf-monitor secret set ADMIN_TOKEN

- Missing Authorization header: admin requests require:

      curl -X POST .../admin/cron/budget-check \
        -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

- Wrong token: ensure the token in the header matches what was set via secret set. Tokens are case-sensitive.
- Missing “Bearer ” prefix: the header must be Authorization: Bearer <token>, not Authorization: <token>.

See Security — Admin endpoint authentication for details.
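
The last three causes above all come down to how the Authorization header is parsed and compared. The sketch below shows one plausible shape of that check, to illustrate why a missing prefix or a case difference fails. It is not the monitor worker's actual code, and `extractBearerToken` and `isAuthorized` are hypothetical names. A production check would normally use a constant-time comparison, which is omitted here for brevity.

```typescript
// Hypothetical parser: returns the token only if the "Bearer " prefix is present.
function extractBearerToken(headerValue: string | null): string | null {
  if (headerValue === null) {
    return null; // no Authorization header at all
  }
  const prefix = "Bearer ";
  if (!headerValue.startsWith(prefix)) {
    return null; // e.g. "Authorization: <token>" without the prefix
  }
  const token = headerValue.slice(prefix.length).trim();
  return token.length > 0 ? token : null;
}

// Exact, case-sensitive match against the configured admin token.
function isAuthorized(headerValue: string | null, adminToken: string): boolean {
  const token = extractBearerToken(headerValue);
  return token !== null && token === adminToken;
}

console.log(isAuthorized("Bearer abc123", "abc123")); // true
console.log(isAuthorized("abc123", "abc123")); // false (missing prefix)
console.log(isAuthorized("Bearer ABC123", "abc123")); // false (case differs)
console.log(isAuthorized(null, "abc123")); // false (no header)
```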
Self-monitoring shows stale crons
Symptoms: GET /self-health returns 503 with staleCrons listing one or more handlers.
Causes and fixes:
- Cron recently deployed: after the first deploy, it may take up to one cron interval (15 minutes or 1 hour) for every cron handler to run once. Wait for the next scheduled execution.
- Worker not running: check npx cf-monitor status and wrangler tail cf-monitor for errors.
- KV propagation: self-monitoring timestamps are stored in KV with a 48-hour TTL. Edge cache inconsistency may briefly show stale data.
- Actual failure: if a specific cron handler consistently appears stale, check wrangler tail cf-monitor for errors during that handler’s schedule. Common causes: API token expired, GitHub rate limit, Slack webhook revoked.