Error Collection (CF Monitor)

cf-monitor captures errors from all workers on your account via Cloudflare's tail worker mechanism, deduplicates them, and optionally creates GitHub issues w...

cf-monitor captures errors from all workers on your account via Cloudflare’s tail worker mechanism, deduplicates them, and optionally creates GitHub issues with priority labels.

Prerequisites — minimum viable setup

To get errors flowing into GitHub issues, you need all four of the following. If any is missing, errors are still captured (visible via GET /errors) but no issues are created.

#	Requirement	How to satisfy
1	cf-monitor worker deployed on the same CF account as your workers	`npx cf-monitor init --account-id YOUR_ACCOUNT_ID` + `npx cf-monitor deploy`
2	Each monitored worker’s wrangler config has `"tail_consumers": [{ "service": "cf-monitor" }]`	`npx cf-monitor wire --apply` (this is the subscription that makes the tail stream reach cf-monitor)
3	`GITHUB_REPO` set (as an `init --github-repo` flag or in `cf-monitor.yaml` under `github.repo`)	`npx cf-monitor init --github-repo owner/repo` or edit `cf-monitor.yaml` and redeploy
4	`GITHUB_TOKEN` secret on the cf-monitor worker — PAT with `issues: write` (fine-grained) or `repo` (classic)	`npx cf-monitor secret set GITHUB_TOKEN`

Optional but recommended:

GITHUB_WEBHOOK_SECRET + a webhook on the target repo → enables bidirectional sync (close an issue on GitHub and cf-monitor stops re-creating it). See GitHub webhooks.
SLACK_WEBHOOK_URL → also get Slack alerts for P0/P1 errors.

How it works

Your worker throws an error (or a cron/queue handler fails)
Cloudflare delivers a tail event to cf-monitor’s tail() handler
cf-monitor extracts the error details, computes a fingerprint, and checks for duplicates
If it’s a new error, cf-monitor creates a GitHub issue (if configured) and writes the fingerprint to KV
If it’s a duplicate, the event is silently dropped

What gets captured

Error outcomes (from failed invocations)

Outcome	Priority	Description
`exception`	P1	Unhandled exception thrown by your code
`exceededCpu`	P0	Worker exceeded CPU time limit
`exceededMemory`	P0	Worker exceeded memory limit
`canceled`	P2	Request was cancelled (client disconnect)
`responseStreamDisconnected`	P3	Response stream terminated unexpectedly
`scriptNotFound`	P0	Worker script not found (deployment issue)

Soft errors (from successful invocations)

Log level	Priority	Description
`console.error()`	P2	Your code logged an error but returned a response
`console.warn()`	P4	Your code logged a warning — batched into daily digest

P0-P3 errors create individual GitHub issues immediately. P4 warnings are batched into a single daily digest issue at midnight UTC.

Practical consequence: any console.error() call in your worker code — even on an otherwise successful (outcome: ok) request — will produce an individual P2 GitHub issue. If you use console.error() for non-urgent logging, either downgrade those calls to console.warn() (which batches to the P4 daily digest) or route them elsewhere. A chatty console.error log can quickly hit the 10 issues/script/hour rate limit.

Fingerprinting

Every error is assigned a fingerprint — a stable 8-character hex hash based on:

fingerprint = FNV-hash( scriptName + ":" + outcome + ":" + normalise(message) )

Message normalisation

Before hashing, error messages are normalised to strip variable content:

UUIDs → <UUID> (e.g. abc123de-f456-7890-1234-56780cdef012)
JSON message extraction — if the message starts with {, the "message" field is extracted for fingerprinting (strips variable metadata like correlationId, duration_ms)
Hex IDs (8+ chars) → <ID> (catches correlationIds, MongoDB ObjectIds, etc.)
Numeric IDs (4+ digits) → <N>
Timestamps → <TS> (ISO 8601 format — processed before numeric IDs to prevent year stripping)
JSON small numbers (1-3 digits after :) → <N> (catches "attempt": 4, "totalAttempts": 2)
IP addresses → <IP>
Whitespace is collapsed and the message is truncated to 200 characters

This means the same logical error with different IDs, timestamps, or request-specific data produces the same fingerprint — so you get one GitHub issue, not thousands.

Deduplication layers

cf-monitor uses six layers to prevent duplicate issues:

Batch dedup — in-memory Set<string> tracks fingerprints within a single tail invocation. Eliminates same-batch duplicates with zero latency (no KV round-trip).
Fingerprint lookup — checks KV for err:fp:{hash}. If found, the error already has a GitHub issue. (90-day TTL)
Rate limit — max 10 issues per script per hour via err:rate:{script}:{hour}. Prevents issue floods from cascading failures.
Daily cap — max 50 issues per script per day via err:rate:{script}:daily:{date}. Hard ceiling on sustained floods.
Transient dedup — known transient errors (rate limits, timeouts, billing errors) get max 1 issue per category per day via err:transient:{script}:{category}:{date}. Works for both hard errors and soft errors (console.error()).
Lock — 60-second KV lock via err:lock:{fingerprint} prevents race conditions across concurrent tail invocations.

Transient error patterns

cf-monitor recognises 9 built-in transient patterns and limits them to one issue per category per day:

Pattern	Matches
`rate-limited`	”rate limit”, “429”, “too many requests”
`timeout`	”timeout”, “timed out”, “ETIMEDOUT”
`quota-exhausted`	”quota”, “exceeded limit”, “billing”
`connection-refused`	”ECONNREFUSED”, “connection refused”
`dns-failure`	”ENOTFOUND”, “DNS failed”, “getaddrinfo”
`service-unavailable`	”503”, “502”, canceled outcome, stream disconnected
`cf-internal`	”internal error” + “cloudflare”
`billing-exhausted`	”insufficient balance”, “402”, “payment required”

Custom transient patterns

cf-monitor.yaml accepts a transient_patterns: array for your own categories:

transient_patterns:
  - name: "custom-gateway-timeout"
    match: "504 gateway timeout"
  - name: "db-pool-exhausted"
    match: "connection pool"

Custom patterns are checked after the built-in patterns. Each requires a name (used in dedup keys) and match (regex, case-insensitive). Matching errors get the same 1-issue-per-day dedup as built-ins.

GitHub issues

When a new error is captured and GitHub is configured, cf-monitor creates an issue with:

Title: [P1] worker-name: exception
Body: markdown table with worker, outcome, priority, account, fingerprint, error message, and timestamp
Labels: cf:error:exception, cf:priority:p1, optionally cf:transient

Issue labels

Label	Meaning
`cf:error:{outcome}`	Error type (exception, exceededCpu, etc.)
`cf:priority:{p0-p4}`	Priority level
`cf:transient`	Known transient error (rate limit, timeout, etc.)
`cf:digest`	Part of daily warning digest
`cf:muted`	Manually muted — cf-monitor won’t re-create if closed

GitHub webhook sync

If you configure a GitHub webhook pointing to cf-monitor’s /webhooks/github endpoint, issue lifecycle events are synced bidirectionally:

GitHub action	cf-monitor effect
Issue closed	Fingerprint removed from KV — allows re-creation if the error recurs
Issue reopened	Fingerprint restored to KV — suppresses duplicate creation
Label `cf:muted` added	Fingerprint stored as muted — error won’t create new issues

This means you can manage error tracking through GitHub’s normal issue workflow.

Warning digest

P4 warnings (console.warn()) are not urgent enough for individual issues. Instead, cf-monitor batches them into KV throughout the day and creates a single daily digest GitHub issue at midnight UTC.

The digest groups warnings by worker and includes up to 20 entries per worker. The KV digest key (warn:digest:{date}) auto-expires after 48 hours.

Testing error collection

Use the dry-run endpoint to test GitHub issue formatting without creating real issues:

curl -X POST https://cf-monitor.YOUR_SUBDOMAIN.workers.dev/admin/test/github-dry-run \
  -H "Authorization: Bearer YOUR_ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"scriptName":"my-worker","outcome":"exception","errorMessage":"Connection timeout"}'

This returns the exact title, body, labels, fingerprint, and priority that would be used for a real issue.

Error Collection

Prerequisites — minimum viable setup#

How it works#

What gets captured#

Error outcomes (from failed invocations)#

Soft errors (from successful invocations)#

Fingerprinting#

Message normalisation#

Deduplication layers#

Transient error patterns#

Custom transient patterns#

GitHub issues#

Issue labels#

GitHub webhook sync#

Warning digest#

Testing error collection#

Related Articles