Troubleshooting

Common issues and their solutions when using cf-monitor.

Monitor worker not receiving tail events

Symptoms: No errors appearing in GET /errors, no fingerprints in KV.

Causes and fixes:

Missing tail_consumers — check your worker’s wrangler config includes "tail_consumers": [{ "service": "cf-monitor" }]. Run npx cf-monitor wire to verify.
Propagation delay — after deploying a new worker or changing tail_consumers, Cloudflare takes 30-60 seconds to activate the tail binding. Wait a minute and test again.
Monitor worker not deployed — run npx cf-monitor status to check if the monitor worker is healthy.
Worker name mismatch — tail_consumers references the monitor worker by name. Ensure it matches the name field in the monitor worker’s wrangler config (default: cf-monitor).
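
The first and last checks above can be expressed as a small helper. This is a minimal sketch, not part of cf-monitor: `hasTailConsumer` and the `WranglerConfig` shape are hypothetical names, and the function assumes you have already parsed your wrangler config into a plain object.

```typescript
// Minimal shape of the wrangler config fields this check reads (assumed, not exhaustive).
interface TailConsumer {
  service: string;
}

interface WranglerConfig {
  name?: string;
  tail_consumers?: TailConsumer[];
}

// Returns true when the config forwards tail events to the monitor worker.
// `monitorName` defaults to "cf-monitor", the default monitor worker name.
function hasTailConsumer(config: WranglerConfig, monitorName = "cf-monitor"): boolean {
  return (config.tail_consumers ?? []).some((c) => c.service === monitorName);
}

const wired: WranglerConfig = {
  name: "my-worker",
  tail_consumers: [{ service: "cf-monitor" }],
};
const unwired: WranglerConfig = { name: "my-worker" };

console.log(hasTailConsumer(wired)); // true
console.log(hasTailConsumer(unwired)); // false: tail_consumers is missing
console.log(hasTailConsumer(wired, "cf-monitor-staging")); // false: name mismatch
```

If the last call matches your situation, the monitor worker was deployed under a different name than the one `tail_consumers` references.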

No metrics in Analytics Engine

Symptoms: npx cf-monitor status works but AE SQL queries return no data.

Causes and fixes:

AE write propagation — Analytics Engine writes take 30-90 seconds to become queryable. This is a platform limitation, not a bug.
Missing CF_MONITOR_AE binding — check your worker’s wrangler config includes the analytics_engine_datasets binding. The binding name must be CF_MONITOR_AE.
No traffic — AE data is only written when your worker handles requests. Send a request to your worker, then allow up to 90 seconds for the write to become queryable.
Zero metrics — if all binding operations return zero (e.g. no D1 calls), the SDK skips the AE write to save cost. This is by design.

Circuit breaker won’t reset

Symptoms: Worker returns 503 even after waiting for the TTL to expire.

Causes and fixes:

KV edge propagation — KV TTL expiration can take up to 60 seconds to propagate across Cloudflare’s edge. Wait a full minute after expected expiry.

Manual reset — force a reset via the admin endpoint:

curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cb/reset \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"featureId": "your-feature-id"}'

Monthly budget also tripped — daily budgets reset via TTL, but if the monthly budget is also exceeded, the CB will be re-tripped on the next hourly check. Increase the monthly budget or wait for the month to roll over.

Account-level CB — check if the account CB is active:

curl https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/status

Clear it with:

curl -X POST .../admin/cb/account \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status":"clear"}'

CLI init fails

Symptoms: npx cf-monitor init errors out.

Causes and fixes:

Missing API token — set CLOUDFLARE_API_TOKEN in your environment or pass --api-token.
Wrong account ID — the account ID is a 32-character hex string. Find it in the Cloudflare dashboard under Account Home > Account ID (right sidebar).
Insufficient permissions — the API token needs: Workers KV Storage (Edit), Account Analytics (Read), Workers Scripts (Edit).
Network issues — the CLI makes API calls to api.cloudflare.com. Ensure you’re not behind a proxy that blocks these.
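
A quick format check can rule out a mistyped account ID before you retry. The sketch below only tests the "32-character hex string" rule stated above. `isValidAccountId` is a hypothetical helper, and accepting uppercase hex and surrounding whitespace is an assumption made for convenience.

```typescript
// Checks that a value looks like a Cloudflare account ID: exactly 32 hex characters.
// Surrounding whitespace is trimmed first, since IDs are often copy-pasted.
function isValidAccountId(id: string): boolean {
  return /^[0-9a-f]{32}$/i.test(id.trim());
}

console.log(isValidAccountId("0123456789abcdef0123456789abcdef")); // true
console.log(isValidAccountId("my-account")); // false: not hex
console.log(isValidAccountId("0123456789abcdef")); // false: only 16 characters
```

A value that passes this check can still be the wrong account. It only confirms the format.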

Worker name shows as ‘worker’

Symptoms: feature IDs all start with worker: instead of your actual worker name.

Causes and fixes:

WORKER_NAME not set — run npx cf-monitor wire --apply to automatically inject WORKER_NAME from your wrangler config’s name field.

Manual fix — add to your wrangler config:

{ "vars": { "WORKER_NAME": "my-worker-name" } }

SDK override — set workerName in the monitor config:

monitor({ workerName: 'my-worker', fetch: handler });

Detection chain: config.workerName > env.WORKER_NAME > env.name > 'worker'
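
The detection chain can be read as a simple fallback. The following is an illustrative sketch rather than the SDK's actual code: `resolveWorkerName` and both interfaces are hypothetical, and treating empty strings as "not set" is an assumption.

```typescript
// Only the fields relevant to name detection are modelled here.
interface MonitorConfig {
  workerName?: string;
}

interface WorkerEnv {
  WORKER_NAME?: string;
  name?: string;
}

// Mirrors the documented order:
// config.workerName > env.WORKER_NAME > env.name > 'worker'
function resolveWorkerName(config: MonitorConfig, env: WorkerEnv): string {
  return config.workerName || env.WORKER_NAME || env.name || "worker";
}

console.log(resolveWorkerName({ workerName: "api" }, { WORKER_NAME: "from-env" })); // "api"
console.log(resolveWorkerName({}, { WORKER_NAME: "from-env" })); // "from-env"
console.log(resolveWorkerName({}, {})); // "worker"
```

The last case is the one behind this symptom: nothing in the chain is set, so the name falls through to the 'worker' default.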

GitHub issues not being created

Symptoms: Errors are captured (visible in GET /errors) but no GitHub issues are created in the repo.

Causes and fixes:

Missing GITHUB_TOKEN secret — set it via npx cf-monitor secret set GITHUB_TOKEN. This must be a GitHub PAT: either a classic PAT with the repo scope or a fine-grained PAT with the issues: write permission.
Missing GITHUB_REPO var or config — check that either:
- GITHUB_REPO is set in .cf-monitor/wrangler.jsonc vars, OR
- CF_MONITOR_CONFIG is set (automatically embedded since v0.3.6 when --github-repo is passed to init or cf-monitor.yaml has github.repo configured)
cf-monitor.yaml not re-embedded — if you added github.repo to cf-monitor.yaml after initial deploy, run npx cf-monitor deploy to re-embed the config.
Rate limited — cf-monitor limits to 10 issues per script per hour. Check GET /errors for rate limit entries.
Deduplication — if the same error fingerprint already has a GitHub issue, cf-monitor won’t create a duplicate. Check KV key err:fp:{fingerprint}.

Verify: Run npx cf-monitor status — the response shows whether GitHub is configured.

Budget enforcement not working

Symptoms: usage accumulates in KV (budget:usage:daily:* keys) but no circuit breakers trip and no Slack warnings appear.

Causes and fixes:

No budget config keys — check KV for budget:config:* keys. If empty, the hourly budget-check cron will auto-seed defaults from PAID_PLAN_DAILY_BUDGETS on the next run. Trigger it manually:
```
curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cron/budget-check \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN"
```
Config-sync not run — if you set custom budgets in cf-monitor.yaml, push them to KV:
```
npx cf-monitor config sync
```
Seed flag active — auto-seeding is prevented for 24 hours after the last seed (to avoid hourly KV writes). If you need to re-seed immediately, delete the flag:
```
wrangler kv key delete "budget:config:__seeded__" --namespace-id YOUR_KV_NAMESPACE_ID
```
__account__ fallback — even without per-feature configs, the __account__ config applies to all features. If this is missing too, auto-seeding failed. Check wrangler tail cf-monitor for errors.
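
The fallback from per-feature config to the __account__ config can be pictured as a two-step lookup. This is a sketch under stated assumptions: the key layout `budget:config:{featureId}`, the `BudgetLimits` shape, and the `resolveBudget` helper are all illustrative, and a `Map` stands in for the KV namespace.

```typescript
// Metric name -> limit, e.g. { kv_reads: 1000 }. The shape is assumed for illustration.
type BudgetLimits = Record<string, number>;

// Assumed key for the account-wide config, following the budget:config:* pattern.
const ACCOUNT_KEY = "budget:config:__account__";

// Looks up the per-feature config first, then falls back to the account config.
// Returns undefined when neither exists (the "auto-seeding failed" case).
function resolveBudget(
  store: Map<string, BudgetLimits>,
  featureId: string,
): BudgetLimits | undefined {
  return store.get(`budget:config:${featureId}`) ?? store.get(ACCOUNT_KEY);
}

const store = new Map<string, BudgetLimits>([
  ["budget:config:scanner:social", { kv_reads: 1000 }],
  [ACCOUNT_KEY, { kv_reads: 5000 }],
]);

console.log(resolveBudget(store, "scanner:social")); // { kv_reads: 1000 }
console.log(resolveBudget(store, "api:users")); // { kv_reads: 5000 } (account fallback)
console.log(resolveBudget(new Map(), "api:users")); // undefined: nothing to enforce
```

When the lookup returns nothing, there is no limit to compare usage against, which matches the symptom of usage accumulating without any circuit breaker tripping.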

Budget warnings not appearing in Slack

Symptoms: budgets are being exceeded (CB trips visible) but no Slack messages.

Causes and fixes:

SLACK_WEBHOOK_URL not set — run npx cf-monitor secret set SLACK_WEBHOOK_URL and paste your Slack incoming webhook URL.
Deduplication — budget warnings are deduplicated for 1 hour (daily) or 24 hours (monthly). If you just resolved the issue and it triggered again, the alert may be suppressed.

Test the payload — verify Slack payload formatting:

curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/test/slack-dry-run \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"type":"budget-warning","featureId":"test","metric":"kv_reads","current":900,"limit":1000}'
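
To reason about whether a missing alert was suppressed by deduplication, it can help to model the windows explicitly. The sketch below uses the 1-hour (daily) and 24-hour (monthly) windows stated above. `isSuppressed` and its parameters are hypothetical, and how cf-monitor actually records the last-sent time is not covered here.

```typescript
type BudgetPeriod = "daily" | "monthly";

const HOUR_MS = 60 * 60 * 1000;

// Deduplication windows from the text: 1 hour for daily, 24 hours for monthly budgets.
const DEDUP_WINDOW_MS: Record<BudgetPeriod, number> = {
  daily: 1 * HOUR_MS,
  monthly: 24 * HOUR_MS,
};

// Returns true when an alert sent at `lastSentAt` (ms since epoch) still
// suppresses a new alert at `now`. No previous alert means no suppression.
function isSuppressed(
  period: BudgetPeriod,
  lastSentAt: number | undefined,
  now: number,
): boolean {
  if (lastSentAt === undefined) return false;
  return now - lastSentAt < DEDUP_WINDOW_MS[period];
}

const sentAt = Date.UTC(2025, 0, 1, 12, 0, 0);

console.log(isSuppressed("daily", sentAt, sentAt + 30 * 60 * 1000)); // true: within 1 hour
console.log(isSuppressed("daily", sentAt, sentAt + 2 * HOUR_MS)); // false: window passed
console.log(isSuppressed("monthly", sentAt, sentAt + 2 * HOUR_MS)); // true: within 24 hours
```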

GitHub issues not being created

Symptoms: errors are captured (fingerprints in KV) but no GitHub issues appear.

Causes and fixes:

GITHUB_REPO or GITHUB_TOKEN not set — both are required. Run:
```
npx cf-monitor secret set GITHUB_TOKEN
```
And ensure github.repo is set in cf-monitor.yaml.
Token permissions — the token needs repo scope (classic PAT) or issues: write permission (fine-grained PAT).
Rate limit — max 10 issues per script per hour. If you’ve triggered many errors quickly, wait for the rate window to pass.

Test the format — use the dry-run endpoint to see what would be created:

curl -X POST .../admin/test/github-dry-run \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"scriptName":"my-worker","outcome":"exception","errorMessage":"test error"}'

Feature IDs are wrong or unexpected

Symptoms: budget keys and AE data use unexpected feature IDs.

Causes and fixes:

Path normalisation — cf-monitor strips numeric segments (/users/123 becomes users), UUIDs, and limits paths to 2 segments. This is intentional to prevent feature ID explosion.

Explicit control — use the features map for routes that need specific IDs:

monitor({
  features: {
    'POST /api/scan': 'scanner:social',
    'GET /api/users/:id': 'api:users',
  },
  fetch: handler,
});

Single bucket — for simple workers, use featureId to put everything in one budget:
```
monitor({ featureId: 'my-worker:all', fetch: handler });
```
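
To predict which ID a route will map to, the path normalisation rules described above can be approximated as follows. This is a rough sketch, not cf-monitor's implementation: `normalisePath` is a hypothetical name, joining segments with ":" is an assumption based on the example IDs, and the order of operations (strip first, then limit to two segments) is also assumed.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const NUMERIC_RE = /^\d+$/;

// Drops empty, purely numeric, and UUID segments, then keeps at most
// `maxSegments` of what remains.
function normalisePath(pathname: string, maxSegments = 2): string {
  return pathname
    .split("/")
    .filter((s) => s.length > 0 && !NUMERIC_RE.test(s) && !UUID_RE.test(s))
    .slice(0, maxSegments)
    .join(":"); // assumed separator
}

console.log(normalisePath("/users/123")); // "users"
console.log(normalisePath("/api/users/550e8400-e29b-41d4-a716-446655440000/posts")); // "api:users"
console.log(normalisePath("/a/b/c/d")); // "a:b"
```

Routes that collapse to the same normalised path share a budget. If that is not what you want, the features map gives you explicit control.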

Usage data shows “No usage data collected yet”

Symptoms: npx cf-monitor usage or GET /usage returns no data.

Causes and fixes:

First cron hasn’t run — account usage is collected hourly on the 0 * * * * schedule. Wait for the next hour, or trigger manually:

curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/cron/collect-account-usage \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

Missing CLOUDFLARE_API_TOKEN — the same API token used for worker discovery is used for GraphQL queries. Ensure it’s set as a secret on the cf-monitor worker.
No services in use — if your account has zero D1, KV, R2, etc. activity in the last 24 hours, the usage snapshot will show empty services. This is correct behaviour.
GraphQL API unavailable — the CF GraphQL Analytics API occasionally returns errors. Check the monitor worker’s logs for [cf-monitor:usage] messages.

Plan shows as “paid” when account is actually free

Symptoms: GET /status or npx cf-monitor status shows plan: "paid" on a Workers Free account.

Causes and fixes:

Token lacks billing permission — plan detection requires the Account Settings: Read permission (#billing:read) on your API token. Without it, cf-monitor conservatively defaults to “paid” (which means higher budget limits — safe but less protective for free accounts). Add the permission to your token for accurate detection.
Cached result — the detected plan is cached in KV for 24 hours. If you recently upgraded/downgraded your plan, wait for cache expiry or delete the config:plan KV key manually.

Debug endpoints

These endpoints are always available on the monitor worker for troubleshooting:

| Endpoint | What it tells you |
| --- | --- |
| `GET /_health` | Is the monitor worker running? |
| `GET /status` | Account health, plan, billing period, CB states, GitHub/Slack config |
| `GET /errors` | Recent error fingerprints and their GitHub issue URLs |
| `GET /budgets` | Active circuit breakers, billing period |
| `GET /workers` | Which workers have been discovered on the account |
| `GET /plan` | Detected plan type, billing period, days remaining, plan allowances |
| `GET /usage` | Account-wide per-service usage from CF GraphQL (approximate) |
| `GET /self-health` | Self-monitoring: stale crons, error counts, handler breakdown |

Admin endpoints returning 401

Symptoms: All POST /admin/* requests return {"error":"Unauthorized"}.

Causes and fixes:

ADMIN_TOKEN not set — set the secret on the cf-monitor worker:

openssl rand -hex 32   # Generate a token
npx cf-monitor secret set ADMIN_TOKEN

Missing Authorization header — admin requests require:

curl -X POST .../admin/cron/budget-check \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN"

Wrong token — ensure the token in the header matches what was set via secret set. Tokens are case-sensitive.
Missing “Bearer ” prefix — the header must be Authorization: Bearer <token>, not Authorization: <token>.
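
The header rules above (Bearer prefix required, case-sensitive token) can be illustrated with a small check. This is a sketch of the general pattern, not the monitor worker's actual code: `extractBearerToken` and `isAuthorized` are hypothetical names, and a production check would typically use a constant-time comparison instead of `===`.

```typescript
// Extracts the token from an "Authorization: Bearer <token>" header value.
// Returns null when the header is missing or lacks the "Bearer " prefix.
function extractBearerToken(header: string | null | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer (.+)$/.exec(header);
  return match?.[1] ?? null;
}

// Compares the presented token with the configured ADMIN_TOKEN (case-sensitive).
function isAuthorized(header: string | null | undefined, adminToken: string): boolean {
  const token = extractBearerToken(header);
  return token !== null && token === adminToken;
}

console.log(isAuthorized("Bearer s3cret", "s3cret")); // true
console.log(isAuthorized("s3cret", "s3cret")); // false: missing "Bearer " prefix
console.log(isAuthorized("Bearer S3CRET", "s3cret")); // false: tokens are case-sensitive
console.log(isAuthorized(null, "s3cret")); // false: no Authorization header
```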

See Security — Admin endpoint authentication for details.

Self-monitoring shows stale crons

Symptoms: GET /self-health returns 503 with staleCrons listing one or more handlers.

Causes and fixes:

Cron recently deployed — after first deploy, it may take up to the cron interval (15 min or 1 hour) for all cron handlers to run once. Wait for the next scheduled execution.
Worker not running — check npx cf-monitor status and wrangler tail cf-monitor for errors.
KV propagation — self-monitoring timestamps are stored in KV with a 48-hour TTL. Edge cache inconsistency may briefly show stale data.
Actual failure — if a specific cron handler consistently appears stale, check wrangler tail cf-monitor for errors during that handler’s schedule. Common causes: API token expired, GitHub rate limit, Slack webhook revoked.
