changelog
v0.35.2 (2026-04-20)
changes
- feat: new `/health` command — live system + triggers + cost snapshot (v1). Consolidates RAM / swap (/proc/meminfo), the Untether process self-diagnostic (PID, RSS, FDs, children — reuses `proc_diag.collect_proc_diag`), trigger counts (cron/webhook IDs via `TriggerManager`), today’s API cost (`cost_tracker.get_daily_cost`), and uptime (reuses `/ping`’s `_STARTED_AT` so there’s only one monotonic counter) into a compact Telegram HTML message. Each section degrades gracefully — unavailable data sources (non-Linux, no trigger_manager, no cost tracker) show a fallback or are omitted rather than erroring out. New file `src/untether/telegram/commands/health.py` (~180 LOC), 12 unit tests in `tests/test_health_command.py`, entry point registered in `pyproject.toml`. v1 scope only — v2 extras noted in the issue (`/health --subtree` tree walk, `/health --costs` per-project breakdown, workerd group detection, colour-coded warning markers) are deferred to follow-ups #348
- fix: wedge detector (#322) no longer fires during legitimate background work. Claude Code v2.1.72+ primitives (`Monitor`, `Bash run_in_background=true`, `Agent run_in_background=true`, `ScheduleWakeup`, `RemoteTrigger`) emit `result` and then park the subprocess waiting for the primitive to complete — `_detect_stuck_after_tool_result` in `src/untether/runner_bridge.py` previously couldn’t distinguish that from a real hang (same `last_event_type=assistant`, same frozen ring buffer, same CPU-active state). Now uses the tracking infrastructure from #347: duck-types against `stream.engine_state.has_live_background_work()` and returns `False` (suppressed) when any primitive’s deadline is still in the future. Engines that don’t expose an engine_state (Codex, OpenCode, Pi, Gemini, AMP) see no behaviour change — the check no-ops. New `JsonlStreamState.engine_state` field (base class) carries the reference; `ClaudeRunner.run_impl` populates it after creating both states. New `progress_edits.stuck_after_tool_result.suppressed` structlog INFO entry fires when the gate kicks in, so staging greps can tell “we skipped detection because Monitor was armed” apart from “detection didn’t trigger”. Four new tests in `tests/test_exec_bridge.py::TestStuckAfterToolResultDetector` cover the monitor-armed / monitor-expired / engine_state-absent / bg_bash-active cases #346
- feat: per-session tracking of Claude Code’s long-running background primitives — `Monitor`, `Bash run_in_background=true`, `Agent run_in_background=true`, `ScheduleWakeup`, `RemoteTrigger` (v1 infrastructure). `ClaudeStreamState` in `src/untether/runners/claude.py` gains five new collections (`live_monitors: dict[str, float]`, `live_bg_bashes: set[str]`, `live_bg_agents: set[str]`, `live_wakeups: dict[str, float]`, `live_remote_triggers: set[str]`) keyed by tool_use_id. New `_register_background_handle()` called from the `StreamToolUseBlock` branch of `translate_claude_event` parses the tool name + `input` payload (extracting `timeout_ms` / `delay_ms` for deadline tracking); new `_clear_background_handle()` called from the `tool_result` branch removes the entry on explicit completion. New public helpers `has_live_background_work()` (gates #346’s wedge detector) and `background_task_summary()` (future footer rendering) complete the API surface. This PR is purely telemetry — no footer rendering, no `/background` command, no control-channel hooks — those are v2 and will be filed as follow-ups once meta-threading through `ProgressTracker` is confirmed safe for the other 5 engines. 11 new unit tests cover tool_use parsing for each of the 5 primitives, tool_result clearing, the `has_live_background_work` deadline-aware gate, and `background_task_summary` pluralisation #347
- feat: pre-spawn RAM guard refuses or warns when spawning a new engine subprocess on a near-OOM host. When a parallel heavy run (e.g. vitest-pool-workers with 100+ workerd children) has already consumed most available RAM, the guard prevents doomed Node startup failures that would otherwise leak memory to other chats via OOM-kill side effects. New `mem_available_kb()` helper in `src/untether/utils/proc_diag.py` reads `/proc/meminfo` without caching; new `WatchdogSettings.prespawn_ram_warn_mb` (default 2000) and `prespawn_ram_block_mb` (default 500) in `src/untether/settings.py` plus a `model_validator` that rejects configurations where `warn <= block` (would make the warn tier unreachable); either tier set to `0` disables that tier and `0 / 0` disables the guard entirely. New `JsonlSubprocessRunner._check_prespawn_ram_guard()` in `src/untether/runner.py` runs BEFORE `manage_subprocess` so a blocked spawn costs nothing — on BLOCK, yields `CompletedEvent(ok=False, error="🛑 Insufficient RAM …")` and returns early without forking; on WARN, logs a `subprocess.prespawn.ram_warning` structured entry. Eight new unit tests cover the meminfo parser, the validator ordering rule, and the runner guard’s ALLOW/WARN/BLOCK/DISABLED branches. Works downstream of `OOMScoreAdjust=-100` (#275’s Layer 1) by preventing the OOM scenario from arising in the first place #350
- feat: surface Claude `rate_limit_event` as a visible progress note instead of silent inactivity. When Anthropic throttles the API, Claude Code emits a `rate_limit_event` JSONL message; the runner previously returned an empty list for this event kind so the user saw no feedback on Telegram — the session appeared to hang or, if they hit `/cancel`, disappear without a cost footer. `translate_claude_event` in `src/untether/runners/claude.py` now emits a `note`-kind `ActionEvent` pair (started + completed) rendered as `⏳ Rate limited — retrying in Xs`, with `retry_after_ms`, `tokens_remaining`, and `requests_remaining` exposed via the action `detail` for downstream consumers. `ClaudeStreamState` gains `rate_limit_total_s` + `rate_limit_count` fields accruing across the session for future cost-footer annotation and `/stats` surfacing (deferred to a v2 follow-up). A new `claude.rate_limit_event` structlog INFO line logs `retry_after_s`, `count`, and `cumulative_s` so staging greps can triage rate-limit-driven user reports. The existing `test_rate_limit_event_returns_empty` (locked in the old silent behaviour) is re-scoped to `test_rate_limit_event_decodes_correctly` (schema tag only); three new tests cover visible-render, multi-throttle accumulation, and missing-retry-hint fallback #349
- feat: restart-required vs hot-reloadable settings are now structurally surfaced. `TelegramTransportSettings.RESTART_REQUIRED_FIELDS` (new `ClassVar[frozenset[str]]` in `src/untether/settings.py`) is the single source of truth for which transport fields need a process restart (`bot_token`, `chat_id`, `session_mode`, `topics`, `message_overflow`); `telegram/loop.py:handle_reload()` now consumes that ClassVar instead of the previously-inlined `RESTART_ONLY_KEYS` set. When a restart-required key changes during hot-reload, the bot now posts a 🔄 notice (“Setting `X` changed — restart required to take effect; run: `systemctl --user restart untether`”) in addition to the existing `config.reload.transport_config_changed` structlog warning, so the operator doesn’t silently run on stale values. `docs/reference/config.md` gains a comprehensive “Hot-reload vs restart-required” section and per-field 🔄 markers in the transport / topics tables #318
  - follow-up: `_notify_restart_required` broadcasts to every `runtime.project_chat_ids()` plus any `allowed_user_ids` admin DM instead of a single `cfg.chat_id` send — in project-routed deployments `cfg.chat_id` is the placeholder sentinel and every send failed with `chat not found`, so the user-visible warning never arrived. Per-chat failures are logged via `config.reload.restart_notify.failed` and skipped; `config.reload.restart_notify.sent` emits `targets` + `sent_count` for observability. Falls back to `cfg.chat_id` only when no routed targets exist.
- feat: `[[triggers.crons]]` now accepts an optional `permission_mode` field (`default` | `plan` | `auto` | `acceptEdits` | `bypassPermissions`) that overrides the chat / engine default for that cron’s run only. Crons firing into plan-mode chats can now declare themselves autonomous via `permission_mode = "auto"` without flipping the whole chat to auto. Precedence: cron `permission_mode` > per-chat `/planmode` > engine config default. Claude-only for this release; Codex + Gemini completion is tracked in #331, and the broader all-engines + webhooks extension in #332 (v0.35.5). New `VALID_PERMISSION_MODES_BY_ENGINE` dict in `runners/run_options.py` lets the `CronConfig` validator reject typos for engines with known value sets while staying forward-compatible for engines whose permission wiring is pending. A new `trigger.cron.permission_mode_override` structlog INFO entry fires when the override actually changes the resolved value, for staging observability. #330
- callback-answer instrumentation for inline-keyboard presses — every `answerCallbackQuery` now emits a `callback.answered` INFO event with `latency_ms` (HTTP round-trip), `total_ms` (since dispatcher entry), `early=true|false`, and `has_toast`. Lets staging greps distinguish “we were fast, Telegram was slow” from “we were slow” when `BotResponseTimeoutError` is reported client-side. Investigation of the existing `answer_early` path confirmed it already fires before any `backend.handle()` work; added a regression test (`test_early_answer_fires_before_slow_handle`) locking the ordering invariant in so future refactors can’t reintroduce the timeout window. Telegram-transport reference docs gained a callback-answering section with the structured-log schema and triage guidance #247
- feat: `xhigh` effort level added for Claude Code (Opus 4.7, Claude Code CLI v2.1.114+). `_ENGINE_REASONING_LEVELS["claude"]` in `src/untether/telegram/engine_overrides.py` gains `xhigh` between `high` and `max`. Button scaffolding in `/config` > Reasoning, the `"xhi"` action-key, and descriptive help text already existed from #272’s Codex work, so this is a single-tuple edit plus docs/test refresh. `test_reasoning_shows_claude_levels` updated to assert `config:rs:xhi` is now present for Claude; `docs/reference/runners/claude/runner.md` `--effort` list now reads `low/medium/high/xhigh/max` #351
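The pre-spawn RAM guard entry above (#350) describes a three-way ALLOW / WARN / BLOCK decision driven by `MemAvailable` and two thresholds. A minimal sketch of that decision follows. It is not the Untether implementation: the function and enum names are hypothetical, the strict `<` comparisons are an assumption, and so is the choice to enforce the `warn > block` ordering only when both tiers are enabled and to allow the spawn when `MemAvailable` cannot be read.

```python
from enum import Enum
from typing import Optional


class RamVerdict(Enum):
    ALLOW = "allow"
    WARN = "warn"
    BLOCK = "block"


def parse_mem_available_kb(meminfo_text: str) -> Optional[int]:
    """Return the MemAvailable value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            fields = line.split()
            if len(fields) >= 2 and fields[1].isdigit():
                return int(fields[1])
    return None


def validate_thresholds(warn_mb: int, block_mb: int) -> None:
    """Reject configs whose warn tier could never fire.

    Assumption: the ordering rule only applies when both tiers are enabled,
    since a value of 0 disables a tier.
    """
    if warn_mb > 0 and block_mb > 0 and warn_mb <= block_mb:
        raise ValueError("warn threshold must be greater than block threshold")


def check_ram(
    available_kb: Optional[int], warn_mb: int = 2000, block_mb: int = 500
) -> RamVerdict:
    """Classify a spawn attempt from the currently available RAM."""
    if available_kb is None:
        # Assumption: unreadable meminfo (e.g. non-Linux) does not block spawns.
        return RamVerdict.ALLOW
    available_mb = available_kb // 1024
    if block_mb > 0 and available_mb < block_mb:
        return RamVerdict.BLOCK
    if warn_mb > 0 and available_mb < warn_mb:
        return RamVerdict.WARN
    return RamVerdict.ALLOW
```

With the default thresholds, a host reporting about 1000 MB available would fall in the WARN tier and one reporting about 100 MB in the BLOCK tier; setting both thresholds to `0` always yields ALLOW.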
fixes
- security: Claude and Pi engine subprocesses no longer inherit the parent’s full environment — only allowlisted variables (basic OS essentials, AI/cloud provider keys, Claude/MCP namespaces, Node/Python/UV/NPM runtime vars) pass through via the new `utils/env_policy.filtered_env()` helper. Random third-party tokens that happen to live in the parent env (AWS, Stripe, DigitalOcean, DATABASE_URL, personal app tokens, etc.) are no longer available to engine subprocesses or their MCP servers — reduces the blast radius of any tool-call or MCP that exfiltrates process env. PR #323’s four `setdefault` reinforcements for the stuck-after-tool_result watchdog are preserved on top of the filtered env. Other engines (Codex, Gemini, OpenCode, AMP) keep the default inherit-everything behaviour for this release; extending to them is tracked as part of #332 (v0.35.5). Adding a new engine or MCP that relies on an unfamiliar variable is documented at the top of `utils/env_policy.py` #198
- security: CI matrix values (`matrix.command`, `matrix.sync_args`) now pass through `env:` instead of direct `${{ }}` interpolation in `run:` blocks, eliminating a theoretical shell-injection vector should matrix values ever become dynamic (e.g. from PR labels) #195
- security: `bot_token` is now `pydantic.SecretStr` in `TelegramTransportSettings` — masks the value in `repr()`, `str()`, tracebacks, and any accidental structlog serialisation. Raw value is unwrapped via `.get_secret_value()` at the transport boundary (`require_telegram`, `backend.lock_token` / `build_and_run`, `cli/doctor`, `cli/onboarding_cmd`). A field_validator preserves the pre-change NonEmptyStr contract (whitespace-only tokens still rejected, since SecretStr bypasses `str_strip_whitespace`) #196
- security: `_HANDLED_REQUESTS` in `runners/claude.py` switched from a `set` cleared wholesale at 100 entries to an LRU `OrderedDict` (max 200, oldest-first eviction) — closes the small window where a duplicate Telegram callback delivered just after a `.clear()` would be misclassified as “request not found” rather than “duplicate” #197
- security: Codex auth subprocess output is now `html.escape()`’d before being wrapped in `<pre>` in the HTML-mode Telegram reply — prevents a crafted error message from injecting Telegram entities (`<b>`, `<a>`, etc) into the rendered response #199
- security: voice transcription error paths (`telegram/voice.py`) and command-dispatch error paths (`telegram/commands/dispatch.py`) now send sanitised text via shared `utils/error_display.user_safe_error()` — strips URLs and absolute paths, caps length, and falls back when sanitised text is empty. Full exception detail still goes to structlog #200 #201
- security: removed global bandit skips for B603/B607 in `pyproject.toml`; the three remaining subprocess sites (`telegram/backend.py:_detect_cli_version`, `telegram/commands/usage.py` macOS Keychain lookup, `utils/git.py:_run_git`) are annotated inline with `# nosec` + per-site justification — CI now flags any NEW subprocess call site by default #202
- security: `_EPHEMERAL_MSGS` and `_OUTLINE_REGISTRY` in `runner_bridge.py` gain companion timestamp maps and a `sweep_stale_registries()` helper that prunes entries older than 1 hour. Sweep piggy-backs on `ProgressEdits._stall_monitor`’s existing 60-second tick — handles runs that crash or exit abnormally without firing the normal cleanup path #203
- security: `telegram/client_api.py:download_file` validates `file_path` (from Telegram `getFile`) against `://`, `..`, and leading `/` before URL construction — a tampered or spoofed getFile response that returned an attacker-controlled URL as `file_path` could otherwise redirect the subsequent HTTP GET away from `api.telegram.org` #204
- engine subprocess cleanup now walks the process tree and signals descendants in separate process groups — previously `os.killpg(proc.pid, SIGTERM)` only reached the parent’s direct pgroup, so grandchildren spawned with fresh sessions (Node’s `child_process.spawn()` pattern, used by `workerd` via `@cloudflare/vitest-pool-workers`) survived a SIGTERM’d Claude Code session. On lba-1 this orphaned 316 `workerd` processes consuming 37 GB of RAM after 6 cascading Claude Code signal deaths. `_signal_process` now snapshots descendants via `proc_diag.find_descendants()` before `killpg` (so `/proc/<pid>/task/*/children` is still readable), runs the existing pgroup kill, then `os.kill(pid, sig)` on each captured PID best-effort (swallowing `ProcessLookupError` / `PermissionError`). SIGKILL escalation walks the tree again. Graceful fallback to legacy pgroup-only behaviour on non-Linux hosts or `/proc` read errors. Related upstream: anthropics/claude-code#43944, cloudflare/workers-sdk#8837 #275
  - `proc_diag._find_descendants` renamed to public `find_descendants` (private alias kept for back-compat with existing test imports)
- webhook server now degrades gracefully when it can’t bind its port — previously a port conflict (e.g. another process on the default 9876) crashed the entire bot (polling, commands, crons included) via an uncaught `OSError` propagating through the `anyio` task group, triggering a systemd restart loop. `run_webhook_server` now catches `OSError` from `TCPSite.start()`, logs a structured `triggers.server.bind_failed` event with `host` / `port` / `hint` / `fix` fields, and returns normally so the rest of the bot stays up #320
- cost footer accuracy and engine cost parity — 60-second TTL cache on the Claude subscription-usage fetch (`utils/usage_cache.py`) with stale-while-error fallback smooths transient 429s and rate-limit windows; a one-shot `claude_usage.schema_mismatch` warning logs missing expected fields so upstream API drift is noticed instead of silently dropping the footer; `_format_run_cost` now renders zero-turn completions (`if turns is not None:` instead of `if turns:`); Gemini runner extracts `stats.total_cost_usd` into usage when present; AMP `AmpResult` schema gains a `total_cost_usd` field and the runner surfaces it through the usage dict when AMP emits one; added an OpenCode regression test locking in that token counts still render when cost is zero (free-tier runs) #316
  - persisting the daily cost accumulator across restarts was part of the issue’s “nice-to-have” scope and is deferred to a follow-up to keep this change focused on accuracy + parity
- `run_once = true` crons now persist their fired state to `run_once_fired.json` (sibling of `untether.toml`) — no longer re-fire on config hot-reload or process restart. Previously the TOML entry re-entered the active list on every reload because `remove_cron()` was in-memory only; editing any unrelated config setting would cause every already-fired one-shot to run again. `TriggerManager` now takes an optional `config_path` argument, loads the fired set on init, persists on `remove_cron()`, and auto-cleans fired-state entries whose cron id no longer appears in the TOML so ids can be safely reused. Related: #269 (hot-reload), #294 (master pause toggle) #317
- Pi footer now shows the model name when the user relies on the default config model (no `/model set` override and no `pi.model` in `untether.toml`). The Pi CLI’s `message_end` event carries `"model": "..."` alongside provider/usage; the runner now extracts this and emits a supplementary `StartedEvent` once per session so `ProgressTracker.note_event` merges it into the tracker meta. Priority preserved: `run_options.model` > `self.model` > JSONL fallback. Completes the work begun in #235 #225
  - follow-up: `JsonlSubprocessRunner.handle_started_event` was silently dropping the supplementary `StartedEvent` as a same-session duplicate, so the extracted model never reached `ProgressTracker.note_event`. The filter now passes duplicates through when the event carries `meta`; true duplicates (no meta) are still dropped. Unit tests in `tests/test_runner_utils.py` previously passed because they called `translate_pi_event` directly, bypassing the base-runner filter — added a regression test covering the duplicate-with-meta path.
- detect and recover from Claude Code hanging after an MCP `tool_result` via stream-json / sdk-cli — root cause is upstream claude-code#39700 / #41086 combined with the undici idle-body timeout in `mcp-remote` (geelen/mcp-remote#226, #107) talking to Cloudflare’s remote MCP servers. The symptom “MCP tool may be hung: cloudflare-observability” was misleading — the MCP had already returned its result; the engine was silent after ingesting it #322
  - new engine-agnostic `_classify_jsonl_event()` in `runner.py` recognises tool_result-equivalent events across all six engines (Claude, Codex, OpenCode, Pi, Gemini, AMP); `JsonlStreamState` gains a `last_tool_result_at` latch cleared only on an assistant-turn event
  - new `ProgressEdits._detect_stuck_after_tool_result()` fires when the latch has been set for ≥ `stuck_after_tool_result_timeout` (default 300 s, matches undici’s 5-minute idle-body timeout) with `cpu_active=True`, frozen ring buffer ≥ 3, and no pending approval — ExitPlanMode-, Bash-, and subagent-safe
  - tiered recovery in `ProgressEdits._handle_stuck_after_tool_result()`: Tier 1 logs `progress_edits.stuck_after_tool_result` with diag; Tier 2 SIGTERMs MCP-adapter children whose `/proc/<pid>/cmdline` contains `mcp-remote` or `@modelcontextprotocol` (forces the SSE reader to error out and unblocks the parent engine); Tier 3 cancels via `cancel_event` after `stuck_after_tool_result_recovery_delay` (default 60 s) with a specific Telegram notice
  - `runners/claude.py:env()` now sets `CLAUDE_ENABLE_STREAM_WATCHDOG=1`, `CLAUDE_STREAM_IDLE_TIMEOUT_MS=60000`, `MCP_TOOL_TIMEOUT=120000`, and `MAX_MCP_OUTPUT_TOKENS=12000` via `setdefault` — reduces incidence while the detector is the safety net; user overrides via shell env or `~/.claude/settings.json` still win
  - four new `[watchdog]` config fields: `detect_stuck_after_tool_result` (default `false` for this release, will default `true` once validated), `stuck_after_tool_result_timeout`, `stuck_after_tool_result_recovery_enabled`, `stuck_after_tool_result_recovery_delay`
  - `utils/proc_diag.py:read_cmdline()` helper for identifying adapter children; 17 new tests across engine-matrix classifier, detector gates, and tier-1/2/3 state machine
- fix: `CLAUDE_STREAM_IDLE_TIMEOUT_MS` default raised from 60000ms to 300000ms (5 min). PR #323’s original 60s reinforcement of #322 proved too aggressive for `opus · max` reasoning — legitimate chain-of-thought expansion produces 60–120s SSE-idle windows between output deltas, tripping the upstream Claude CLI stream watchdog and aborting runs with “API Error: Stream idle timeout - partial response received” (observed on staging mid-reasoning with `peak_idle_seconds=91.4`). 300000ms matches the undici idle-body timeout that motivated #322 and Untether’s own `stuck_after_tool_result_timeout` default, so the upstream CLI watchdog and Untether’s detector now fire on compatible timescales. User-provided `CLAUDE_STREAM_IDLE_TIMEOUT_MS` still wins via `setdefault` semantics. Two new tests in `tests/test_claude_runner.py` lock in the new default and the user-override path #342
- security: Claude exec is now wrapped with `env -i KEY=VAL …` so the resolved environment at exec time is exactly the allowlist from `utils/env_policy.filtered_env()` — even when an upstream rc-file source, PAM `/etc/environment` injection, or wrapper script would otherwise re-introduce host vars after the parent’s `subprocess.spawn(env=…)` is honoured. v0.35.2rc3 integration testing on `@untether_dev_bot` proved the in-process filter holds (`/proc/<untether-pid>/environ` clean) but a real `BWS_ACCESS_TOKEN` still reached Claude’s Bash-tool subprocess, undermining the headline #198 promise. New `wrap_with_env_i()` helper in `utils/subprocess.py`; Claude `run_impl` swaps the resolved `cmd` for the wrap and passes `env=None` to the subprocess so we don’t double-set. Pi runner left unchanged — Pi was already clean per the test report. Companion runtime audit (also new this rc): a one-shot `/proc/<claude_pid>/environ` sample on first `system.init` emits a `claude.env_audit.leaked_var` structlog WARNING when any non-allowlisted name is observed; gated by new `[security] env_audit = true` (default true). Reuses `utils/env_policy.is_allowed` (promoted from private `_is_allowed`, with a back-compat alias) so the allowlist remains a single source of truth. New `utils/env_audit.py` (~80 LOC); 9 unit tests in `tests/test_env_audit.py` plus 6 in `tests/test_claude_runner.py` covering the wrap helper, the audit gate, dedup-per-session, and the disabled-via-settings path #361 #198
- fix: `/at <duration> <prompt>` now respects the chat’s project mapping and engine — previously the delayed run fired on the global default engine with no cwd, ignoring `default_engine = "pi"` (or similar) on the project bound to the chat. `AtCommand.handle()` now snapshots `RunContext` (via `runtime.default_context_for_chat(chat_id)`) and the resolved engine (via `runtime.resolve_engine(...)`) at schedule time and threads both through `_PendingAt` to the fire-time `_RUN_JOB` call, mirroring `TriggerDispatcher.dispatch_cron`’s freeze-at-dispatch behaviour. cwd is resolved correctly downstream because `_run_engine` derives it from the forwarded context. Re-routing the chat between `/at` and fire keeps the original mapping; cancel via `/cancel` and re-issue to pick up changes. New `_FakeRuntime` test fixture in `tests/test_at_command.py` plus three new tests covering project-bound capture, unmapped-chat global-default capture, and fire-time forwarding #362
- feat: MCP catalog observability (P0#2 of #365). Claude Code’s `system.init` event ships each configured MCP server as `{"name": "...", "status": "connected" | "pending" | "error" | "failed"}`; Untether now logs a structured `catalog_staleness.detected` WARNING once per (session, server, status) tuple whenever any server reports a non-`connected` status at init time. Gated by new `WatchdogSettings.detect_catalog_staleness` (default true, observability only — no kill/recovery action). New `_capture_mcp_catalog()` helper in `src/untether/runners/claude.py` snapshots the raw list onto `ClaudeStreamState.initial_mcp_servers` for future comparison work and dedups via `ClaudeStreamState.catalog_staleness_logged: set[tuple[str, str, str]]`. Companion experimental knob `WatchdogSettings.notify_catalog_refresh` (default false, opt-in) queues an `mcp_status` control_request on stdin after each `tool_result` — the parent→CLI primitive documented in Anthropic’s `claude-agent-sdk-python` (`get_mcp_status()` / `reconnect_mcp_server()` / `toggle_mcp_server()`). Drain happens in `ClaudeRunner._drain_catalog_refresh()` alongside existing `_drain_auto_approve` / `_drain_auto_deny`, with `catalog.refresh_sent` INFO on success and `catalog.refresh_failed` WARN/ERROR on write errors. The upstream MCP `notifications/tools/list_changed` message hinted at in the issue is server→client only per the MCP spec and therefore cannot be injected from outside; `mcp_status` is the closest documented parent-side primitive. Request IDs use the `ut_catalog_refresh_<session_id>_<seq>` namespace so they can’t collide with Claude Code’s own `req_*` IDs. Ten new tests in `tests/test_claude_runner.py` cover: all-connected no-op, non-connected warning emission, per-session dedup, disabled-setting suppression, queue-on-tool_result (enabled + disabled paths), no-resume defensive no-op, drain serialisation, empty-queue no-op, ClosedResourceError recovery, and new_state propagation from `WatchdogSettings`. No behaviour change for non-Claude engines #365
- fix: the plan-bypass set populated by an approved `ExitPlanMode` (#283) is now also populated by a plain “Approve” on `Edit` / `Write` / `Bash` in plan mode. Resumed sessions where Claude skipped `ExitPlanMode` and went straight into Edits previously re-prompted the user once per tool call — observed on `@hetz_lba1_bot` v0.35.2rc1 as a 9-prompt repro for a single multi-file fix turn (one `Edit` per click, ~7 min wait between approvals, workflow effectively broken under `--permission-mode plan`). `_DIFF_PREVIEW_TOOLS` is now module-scoped in `src/untether/runners/claude.py`; `write_control_response` adds the session to `_PLAN_EXIT_APPROVED` whenever the approved tool is `ExitPlanMode` or in `_DIFF_PREVIEW_TOOLS`, so the first approval in a turn unlocks the rest of that session’s diff_preview tools. Six new parametrized tests in `tests/test_claude_control.py` cover Edit/Write/Bash/ExitPlanMode population, the deny-doesn’t-populate negative, and the non-diff-tool no-op. Verified end-to-end on `@untether_dev_bot`. Follow-up #370 will migrate this to a parent-initiated `set_permission_mode` control request once the upstream primitive is wired #369
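The `download_file` hardening listed among the security fixes above (#204) rejects a `file_path` containing `://` or `..`, or starting with `/`, before the download URL is built. The sketch below illustrates that check only. Both function names are hypothetical, the empty-string rejection is an added assumption, and the URL template is the standard Telegram Bot API file-download form rather than anything taken from the Untether source.

```python
def validate_telegram_file_path(file_path: str) -> str:
    """Return file_path unchanged if it looks like a plain relative path."""
    if not file_path:
        # Assumption: an empty path is treated as invalid as well.
        raise ValueError("empty file_path")
    if "://" in file_path or ".." in file_path or file_path.startswith("/"):
        raise ValueError(f"suspicious file_path: {file_path!r}")
    return file_path


def build_download_url(token: str, file_path: str) -> str:
    """Build the file-download URL only after the path passed validation."""
    safe_path = validate_telegram_file_path(file_path)
    return f"https://api.telegram.org/file/bot{token}/{safe_path}"
```

Validating before string interpolation keeps the host fixed at `api.telegram.org` even if a spoofed `getFile` response supplies a full URL or a traversal sequence as the path.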
docs
- document `[triggers.server]` port-conflict troubleshooting in `docs/reference/triggers/triggers.md` with `ss -tlnp` diagnosis step and the `port = <N>` remediation #320
v0.35.1 (2026-04-15)
fixes
- diff preview approval gate no longer blocks edits after a plan is approved — the `_discuss_approved` flag now short-circuits diff preview as well as `ExitPlanMode`, so once the user approves a plan outline the next `Edit` / `Write` runs without a second approval prompt #283
- fix `scripts/healthcheck.sh` exiting prematurely under `set -e` — `pass()` / `fail()` used `((var++))`, which evaluates to the pre-increment value (0 on the first call, hence a non-zero exit status), tripping `set -e` on the first call so only the first check ever ran and the script always exited 1. Also, the error-log count piped journalctl through `grep -c .`, which counted `-- No entries --` meta lines as matches, producing false-positive log-error counts on clean systems. Now uses explicit `var=$((var+1))` assignment and filters meta lines with `grep -vc '^-- '` #302
- fix multipart webhooks returning HTTP 500 — `_process_webhook` pre-read the request body for size/auth/rate-limit checks, leaving the stream empty when `_parse_multipart` called `request.multipart()`. Now the multipart reader is constructed from the cached raw body, so multipart uploads work end-to-end; also short-circuits the post-parse raw-body write so the MIME envelope isn’t duplicated at `file_path` alongside the extracted file at `file_destination` #280
- fix webhook rate limiter never returning 429 — `_process_webhook` awaited the downstream dispatch (Telegram outbox send, `http_forward` network call, etc.) before returning 202, which capped request throughput at the dispatch rate (~1/sec for private Telegram chats) and meant the `TokenBucketLimiter` never saw a real burst. Dispatch is now fire-and-forget with exception logging, so the rate limiter drains the bucket correctly and a burst of 80 requests against `rate_limit = 60` now yields 60 × 202 + 20 × 429 #281
- security: validate callback query sender in group chats — reject button presses from unauthorised users; prevents malicious group members from approving/denying other users’ tool requests #192
  - also validate sender on cancel button callback — the cancel handler was routed directly, bypassing the dispatch validation
- security: escape release tag name in notify-website CI workflow — use `jq` for proper JSON encoding instead of direct interpolation, preventing JSON injection from crafted tag names #193
- security: sanitise flag-like prompts in Gemini and AMP runners — prompts starting with `-` are space-prefixed to prevent CLI flag injection; moved `sanitize_prompt()` to base runner class for all engines #194
- security: redact bot token from structured log URLs — `_redact_event_dict` now strips bot tokens embedded in Telegram API endpoint strings, preventing credential leakage to log files and aggregation systems #190
- security: cap JSONL line buffer at 10 MB — unbounded `readline()` on engine stdout could consume all available memory if an engine emitted a single very long line (e.g. base64 image in a tool result); now truncates and logs a warning #191
- reduce stall warning false positives during Agent subagent work — tree CPU tracking across process descendants, child-aware 15 min threshold when child processes or elevated TCP detected, early diagnostic collection for CPU baseline, total stall warning counter that persists through recovery, improved “Waiting for child processes” notification messages #264
- `/ping` uptime now resets on service restart — previously the module-level start time was cached across `/restart` commands; now `reset_uptime()` is called on each service start #234
- add 38 missing structlog calls across 13 files — comprehensive logging audit covering auth verification, rate limiting, SSRF validation, codex runner lifecycle, topic state mutations, CLI error paths, and config validation in all engine runners #299
- systemd: stop Untether being the preferred OOM victim — systemd user services inherit `OOMScoreAdjust=200` and `OOMPolicy=stop` defaults, which made Untether’s engine subprocesses preferred earlyoom/kernel OOM killer targets ahead of CLI `claude` (`oom_score_adj=0`) and orphaned grandchildren actually consuming the RAM. `contrib/untether.service` now sets `OOMScoreAdjust=-100` (documents intent; the kernel clamps to the parent baseline for unprivileged users, typically 100) and `OOMPolicy=continue` (a single OOM-killed child no longer tears down the whole unit cgroup, which previously broke every live chat at once). Docs in `docs/reference/dev-instance.md` updated. Existing installs need to copy the unit file and `systemctl --user daemon-reload`; staging picks up the change on the next `scripts/staging.sh install` cycle #275
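The rate-limiter fix above (#281) quotes a burst of 80 requests against `rate_limit = 60` producing 60 accepted and 20 rejected responses. The sketch below reproduces that arithmetic with a generic token bucket. It is not Untether's `TokenBucketLimiter`: the class name is hypothetical, and it assumes the limit is expressed per minute, that bucket capacity equals the limit, and that the clock is injectable for testing.

```python
import time
from typing import Callable


class SketchTokenBucket:
    """Generic token bucket; capacity equals the per-minute rate (assumption)."""

    def __init__(
        self, rate_per_minute: int, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.capacity = float(rate_per_minute)
        self.refill_per_second = rate_per_minute / 60.0
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time first."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate_burst(requests: int, rate_per_minute: int) -> list:
    """Return the HTTP status each request in an instantaneous burst would get."""
    bucket = SketchTokenBucket(rate_per_minute, clock=lambda: 0.0)  # frozen clock
    return [202 if bucket.allow() else 429 for _ in range(requests)]
```

Calling `simulate_burst(80, 60)` yields sixty `202` entries followed by twenty `429` entries, because a frozen clock gives the bucket no chance to refill. The burst only reaches the limiter this way when the handler returns without awaiting the slow downstream dispatch, which is the behaviour the fix introduces.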
changes
- timezone support for cron triggers — cron schedules can now be evaluated in a specific timezone instead of the server’s system time (usually UTC) #270
  - per-cron `timezone` field with IANA timezone names (e.g. `"Australia/Melbourne"`)
  - global `default_timezone` in `[triggers]` — per-cron `timezone` overrides it
  - DST-aware via Python’s `zoneinfo` module (zero new dependencies)
  - invalid timezone names rejected at config parse time with clear error messages
- SSRF protection for trigger outbound requests — shared utility at `triggers/ssrf.py` blocks private/reserved IP ranges, validates URL schemes, and checks DNS resolution to prevent server-side request forgery in upcoming webhook forwarding and cron data-fetch features #276
  - blocks loopback, RFC 1918, link-local, CGN, multicast, reserved, IPv6 equivalents, and IPv4-mapped IPv6 bypass
  - DNS resolution validation catches DNS rebinding attacks (hostname → private IP)
  - configurable allowlist for admins who need to hit local services
  - timeout and response-size clamping utilities
- non-agent webhook actions — webhooks can now perform lightweight actions without spawning an agent run #277
  - action = "file_write" — write POST body to disk with atomic writes, path traversal protection, deny-glob enforcement, and on-conflict handling
  - action = "http_forward" — forward payload to another URL with SSRF protection, exponential backoff on 5xx, and header template rendering
  - action = "notify_only" — send a templated Telegram message with no agent run
  - notify_on_success / notify_on_failure flags for Telegram visibility on all action types
  - default action = "agent_run" preserves full backward compatibility
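For the file_write action, an atomic write combined with a path traversal check might look like the sketch below. It assumes a base directory and a caller-supplied relative name; safe_atomic_write is a hypothetical helper, and deny-glob and on-conflict handling are left out.

```python
import os
import tempfile
from pathlib import Path


def safe_atomic_write(base_dir, relative_name, data: bytes) -> Path:
    """Write data under base_dir atomically, refusing paths that escape it."""
    base = Path(base_dir).resolve()
    target = (base / relative_name).resolve()
    # Path traversal protection: the resolved target must stay inside base.
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {relative_name!r}")
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so readers never observe a partially written file.
    fd, tmp = tempfile.mkstemp(dir=target.parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return target
```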
- multipart form data support for webhooks — webhooks can now accept multipart/form-data POSTs with file uploads #278
  - file parts saved with sanitised filenames, atomic writes, deny-glob and path traversal protection
  - configurable file_destination with template variables, max_file_size_bytes (default 50 MB)
  - form fields available as template variables alongside file metadata
- data-fetch cron triggers — cron triggers can now pull data from external sources before rendering the prompt #279
  - fetch.type = "http_get" / "http_post" — fetch URL with SSRF protection, configurable timeout and headers
  - fetch.type = "file_read" — read local file with path traversal protection and deny-globs
  - fetch.parse_as — parse response as json, text, or lines
  - fetched data injected into prompt_template via store_as variable (default fetch_result)
  - on_failure = "abort" (default) sends failure notification; "run_with_error" injects error into prompt
  - all fetched data prefixed with untrusted-data marker
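The three parse_as modes could be implemented roughly as follows. This is an illustrative sketch: parse_fetched is a hypothetical name, and the "text" default and the splitlines behaviour for "lines" are assumptions.

```python
import json


def parse_fetched(raw: str, parse_as: str = "text"):
    """Convert a fetched response body according to fetch.parse_as."""
    if parse_as == "json":
        return json.loads(raw)
    if parse_as == "lines":
        # Assumption: one list entry per line, without trailing newlines.
        return raw.splitlines()
    if parse_as == "text":
        return raw
    raise ValueError(f"unsupported parse_as value: {parse_as!r}")
```

The parsed value would then be exposed to prompt_template under the store_as name (fetch_result by default).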
- hot-reload for trigger configuration — editing untether.toml [triggers] applies changes immediately without restarting Untether or killing active runs #269 (#285)
  - new TriggerManager class holds cron and webhook config; scheduler reads manager.crons each tick; webhook server resolves routes per-request via manager.webhook_for_path()
  - supports add/remove/modify of crons and webhooks, auth/secret changes, action type, multipart/file settings, cron fetch, and timezones
  - last_fired dict preserved across swaps to prevent double-firing within the same minute
  - unauthenticated webhooks logged at WARNING on reload (previously only at startup)
  - 13 new tests in test_trigger_manager.py; 2038 existing tests still pass
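Why preserving last_fired across a config swap matters can be shown with a toy model. SketchTriggerManager below is hypothetical and much simpler than the real TriggerManager; in particular, dropping entries for removed crons is an assumption, not documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class SketchTriggerManager:
    crons: dict = field(default_factory=dict)       # cron id -> config
    last_fired: dict = field(default_factory=dict)  # cron id -> minute key

    def should_fire(self, cron_id, minute_key):
        """Fire at most once per cron per minute."""
        if self.last_fired.get(cron_id) == minute_key:
            return False
        self.last_fired[cron_id] = minute_key
        return True

    def swap(self, new_crons):
        """Replace the cron config in place while keeping firing history."""
        self.crons = dict(new_crons)
        # Assumption: history is kept only for crons that still exist.
        self.last_fired = {k: v for k, v in self.last_fired.items() if k in self.crons}
```

If the swap rebuilt last_fired from scratch, a reload landing in the same minute as a fire would let that cron fire a second time.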
- hot-reload for Telegram bridge settings — voice_transcription, file transfer, allowed_user_ids, show_resume_line, and message-timing settings now reload without a restart #286
  - TelegramBridgeConfig unfrozen (keeps slots=True) and gains an update_from(settings) method
  - handle_reload() now applies changes in-place and refreshes cached loop-state copies; restart-only keys (bot_token, chat_id, session_mode, topics, message_overflow) still warn with restart_required=true
  - route_update() reads cfg.allowed_user_ids live so allowlist changes take effect on the next message
- /at command for one-shot delayed runs — schedule a prompt to run between 60s and 24h in the future with /at 30m Check the build; accepts Ns/Nm/Nh suffixes #288
  - pending delays tracked in-memory (lost on restart — acceptable for one-shot use)
  - /cancel drops pending /at timers before they fire
  - per-chat cap of 20 pending delays; graceful drain cancels pending scopes on shutdown
  - new module telegram/at_scheduler.py; command registered as at entry point
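Parsing the /at argument with the documented bounds (60s to 24h, Ns/Nm/Nh suffixes) could be sketched like this. parse_at and the error wording are hypothetical and not taken from telegram/at_scheduler.py.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
MIN_DELAY_S = 60          # documented lower bound
MAX_DELAY_S = 24 * 3600   # documented upper bound


def parse_at(args: str):
    """Split '/at' arguments into (delay_seconds, prompt)."""
    match = re.fullmatch(r"(\d+)([smh])\s+(.+)", args.strip(), flags=re.DOTALL)
    if not match:
        raise ValueError("usage: /at <N>s|<N>m|<N>h <prompt>")
    delay = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if not MIN_DELAY_S <= delay <= MAX_DELAY_S:
        raise ValueError("delay must be between 60s and 24h")
    return delay, match.group(3)
```

For example, parse_at("30m Check the build") yields a delay of 1800 seconds and the prompt "Check the build".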
- run_once cron flag — [[triggers.crons]] entries can set run_once = true to fire once then auto-disable; the cron stays in the TOML and re-activates on the next config reload or restart #288
- trigger visibility improvements (Tier 1) — surface configured triggers in the Telegram UI #271
  - /ping in a chat with active triggers appends ⏰ triggers: 1 cron (daily-review, 9:00 AM daily (Melbourne))
  - trigger-initiated runs show provenance in the meta footer: 🏷 opus 4.6 · plan · ⏰ cron:daily-review
  - new describe_cron(schedule, timezone) utility renders common cron patterns in plain English; falls back to the raw expression for complex schedules
  - RunContext gains trigger_source field; ProgressTracker.note_event merges engine meta over the dispatcher-seeded trigger so it survives
  - TriggerManager exposes crons_for_chat(), webhooks_for_chat(), cron_ids(), webhook_ids() helpers
- faster, cleaner restarts (Tier 1) — restart gap reduced from ~15-30s to ~5s with no lost messages #287
  - persist last Telegram update_id to last_update_id.json and resume polling from the saved offset on startup; Telegram retains undelivered updates for 24h, so the polling gap no longer drops or re-processes messages
  - Type=notify systemd integration via stdlib sd_notify (socket.AF_UNIX, no dependency) — READY=1 is sent after the first getUpdates succeeds, STOPPING=1 at the start of drain
  - RestartSec=2 in contrib/untether.service (was 10) — faster restart after drain completes
  - contrib/untether.service also adds NotifyAccess=main; existing installs must copy the unit file and run systemctl --user daemon-reload
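A dependency-free sd_notify of the kind mentioned above can be written with a Unix datagram socket. This sketch follows the systemd notification protocol (NOTIFY_SOCKET environment variable, "@" prefix for abstract sockets); the function shape is an assumption and may differ from Untether's helper.

```python
import os
import socket


def sd_notify(state: str) -> bool:
    """Send a state string such as 'READY=1' to systemd; no-op outside systemd."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by systemd with Type=notify
    if path.startswith("@"):
        # Abstract namespace socket: a leading NUL byte replaces the '@'.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

Under this scheme the service would call sd_notify("READY=1") once the first getUpdates succeeds and sd_notify("STOPPING=1") when drain begins.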
docs
- add update and uninstall guides + README transparency section #305
  - new docs/how-to/update.md and docs/how-to/uninstall.md covering pipx, pip, and source installs, plus config/data/systemd cleanup
  - README: “What Untether accesses” section (network, filesystem, process, credentials), update/uninstall one-liners in Quick Start, and cross-links throughout install/how-to pages
- comprehensive v0.35.1 documentation audit — 8 gap fills across 121 files #306
  - group-chat.md: document callback sender validation in groups (#192)
  - security.md: cross-reference button validation, fix misleading SSRF allowlist claim, add bot token auto-redaction tip (#190)
  - plan-mode.md: document auto-approval after plan approval (#283)
  - interactive-approval.md: admonition linking to plan bypass behaviour
  - commands-and-directives.md: /ping description now mentions uptime reset and trigger summary (#234)
  - runners/amp/runner.md: add sanitize_prompt() note matching Pi/Gemini runners (#194)
  - troubleshooting.md: document 10 MB engine output line cap (#191)
  - glossary.md: add delayed run, webhook action, and hot-reload entries
v0.35.0 (2026-03-31)
fixes
- render plan outline as formatted text instead of raw markdown — outline messages now use render_markdown() + split_markdown_body() so headings, bold, code, and lists display properly in Telegram #139
- add approve/deny buttons to the last outline message — users no longer need to scroll back up past long outlines to find the buttons #140
- delete outline messages on approve/deny — outline and notification messages are cleaned up immediately via module-level _OUTLINE_REGISTRY, and stale approval keyboard on the progress message is suppressed #141
- scope AskUserQuestion pending requests by channel_id — _PENDING_ASK_REQUESTS and _ASK_QUESTION_FLOWS were global dicts with no chat scoping; a pending ask in one chat would steal the next message from any other chat, causing cross-chat contamination and lost messages #144
  - added channel_id contextvar (get_run_channel_id / set_run_channel_id) to utils/paths.py
  - get_pending_ask_request() and get_ask_question_flow() now accept channel_id and filter by it
  - session cleanup now also clears stale pending asks and flows
- standalone override commands (/planmode, /model, /reasoning) now preserve all EngineOverrides fields instead of resetting unrelated overrides #124
- register input for system-level auto-approved control requests (Initialize, HookCallback, McpMessage, RewindFiles, Interrupt) so updatedInput is included in the response — prevents ZodError in Claude Code #123
- reduce Telegram API default timeout from 120s to 30s — a single ReadTimeout on editMessageText could make the bot appear unresponsive for up to 2 minutes; getUpdates long-poll now uses a dedicated timeout of timeout_s + 20 so network failures are detected faster #145
- OpenCode error runs now show the error message instead of an empty body — CompletedEvent.answer falls back to state.last_tool_error when no prior Text events were emitted; covers both StepFinish and stream_end_events paths #146, #150
- Pi /continue now captures the session ID from SessionHeader — allow_id_promotion was False for continue runs, preventing the resume token from being populated #147
- post-outline approval no longer fails with “message to be replied not found” — the “Approve Plan” button on outline messages uses the real ExitPlanMode request_id, so the regular approve path now sets skip_reply=True when outline messages were just deleted; also suppresses the redundant push notification after outline cleanup #148
- sanitise text_link entities with invalid URLs before sending to Telegram — localhost, loopback, file paths, and bare hostnames are converted to code entities instead, preventing silent 400 errors that drop the entire final message #157
- fix duplicate approval buttons after “Pause & Outline Plan” — both the progress message and outline message showed approve/deny buttons simultaneously; now only the outline message has approval buttons (with Cancel), progress keeps cancel-only; outline state resets properly for future ExitPlanMode requests #163
- hold ExitPlanMode request open after outline so post-outline Approve/Deny buttons persist — instead of auto-denying (which caused Claude to exit ~7s later), the control request is never responded to, keeping Claude alive while the user reads the outline #114, #117
  - buttons use real request_id from pending_control_requests for direct callback routing
  - 5-minute safety timeout cleans up stale held requests
- suppress stall auto-cancel when CPU is active — extended thinking phases produce no JSONL events but the process is alive and busy; is_cpu_active() check prevents false-positive kills #114
- fix stall notification suppression when main process sleeping — CPU-active suppression now checks process_state; when main process is sleeping (state=S) but children are CPU-active (hung Bash tool), notifications fire instead of being suppressed; stall message now shows tool name (“Bash tool may be stuck”) instead of generic “session may be stuck” #168
- suppress redundant cost footer on error runs — diagnostic context line already contains cost data, footer no longer duplicates it #120
- clarify /config default labels and remove redundant “Works with” lines #119
- Codex: always pass --ask-for-approval in headless mode — default to never (auto-approve all) so Codex never blocks on terminal input; safe permission mode still uses untrusted #184
- OpenCode: surface unsupported JSONL event types as visible Telegram warnings instead of silently dropping them — prevents silent 5-minute hangs when OpenCode emits new event types (e.g. question, permission) #183
- stall warnings now succinct and accurate for long-running tools — truncate “Last:” to 80 chars, recognise command: prefix (Bash tools), reassuring “still running” message when CPU active, drop PID diagnostics from Telegram messages, only say “may be stuck” when genuinely stuck #188
  - frozen ring buffer escalation now uses tool-aware “still running” message when a known tool is actively running (main sleeping, CPU active on children), instead of alarming “No progress” message
- OpenCode model name missing from footer when using default model — build_runner() now reads ~/.config/opencode/opencode.json to detect the configured default model so the 🏷 footer always shows the model (e.g. openai/gpt-5.2) even without an untether.toml override #221
- OpenCode model override hint — /config and engine model sub-page now show provider/model (e.g. openai/gpt-4o) instead of the unhelpful “from provider config”, guiding users to use the required provider-prefixed format #220
- Codex footer missing model name — Codex runner always includes model in StartedEvent.meta so the footer shows the model even when no override is set #217
- /planmode command worked in non-Claude engine chats — now gated to Claude-only with a helpful message; Codex/Gemini users are directed to /config → Approval policy #216
- /usage showed Claude subscription data in non-Claude engine chats — now gated to subscription-supported engines with an engine-specific error message #215
- /export showed duplicate “Session Started” headers for resumed sessions — deduplicated so only the first StartedEvent renders #218
- Gemini CLI prompt injection — prompts starting with - were parsed as flags when passed via -p <value>; now uses --prompt=<value> to bind the value directly #219
- /new command now cancels running processes before clearing sessions — previously only cleared resume tokens, leaving old Claude/Codex/OpenCode processes running (~400 MB each), worsening memory pressure and triggering earlyoom kills #222
- auto-continue no longer triggers on signal deaths (rc=143/SIGTERM, rc=137/SIGKILL) — earlyoom kills have last_event_type=user which matched the upstream bug detection, causing a death spiral where 4 killed sessions were immediately respawned into the same memory pressure #222
- /new command triggers engine run instead of clearing sessions when topics.enabled=false — /new was only handled in _dispatch_builtin_command when topics were enabled; moved /new out of the topics.enabled gate to handle all modes (topic, chat session, stateless), mirroring how /ctx already works; also removed unreachable early routing code #236
- Gemini engine stuck at “starting · 0s” — Gemini CLI outputs a non-JSON warning (MCP issues detected...) on stdout before the first JSONL event, corrupting the line; decode_jsonl() now strips non-JSON prefixes by finding the first { and retrying parse #231
- /config Ask mode toggle inverted — _toggle_row default was False but display default was “on”, causing the button to show “Ask: off” when the effective state was on; pressing it appeared to do nothing #232
- diff preview approval buttons not rendered after outline flow — _outline_sent flag in ProgressEdits stripped ALL subsequent approval buttons, not just outline-related ones; now only strips buttons for DiscussApproval actions #233
- prevent duplicate control response for already-handled requests #229 (#230)
- fix render_markdown entity overflow when text ends with a fenced code block — entity offsets now clamped to the UTF-16 text length after trailing newline stripping, preventing Telegram 400 errors #59
- /config now reflects project-level default_engine — previously showed Claude-specific buttons (Plan mode, Ask mode, etc.) for chats routed to Codex/Pi via project config #60
- non-Claude runners (Codex, Pi) now populate model name in StartedEvent.meta — footer previously showed permission mode only (e.g. 🏷 plan) without the model #62
- fix liveness watchdog false positive auto-cancel on long-running sessions — actively working sessions with CPU activity and TCP connections were being killed during extended thinking/processing phases #115
- fix reply-to resume when emoji prefix is present — the ↩️ prefix on resume footer lines broke all 6 engine regexes; extract_resume() now strips emoji prefixes before matching #134
- /config sub-pages now show resolved on/off values instead of “default” — body text now matches the toggle button state using _resolve_default(), removing the confusing mismatch #152
- expired control requests now auto-denied after 5-minute timeout — previously the timeout cleanup removed local tracking but did not send a deny response, leaving the Claude subprocess blocked indefinitely on stdin #32
- /export no longer returns sessions from wrong chat — session recording was not scoped by channel_id, so /export in one chat could return another engine’s session data #33
- fix KillMode=control-group bypassing drain and causing 150s restart delay — contrib/untether.service now uses KillMode=mixed which sends SIGTERM to the main process first (drain works), then SIGKILL to remaining cgroup processes (orphaned MCP servers, containers cleaned up instantly) #166
  - process: orphaned children survive across restarts, accumulating memory (#88)
  - control-group: kills all processes simultaneously, bypassing drain (#166)
  - mixed: best of both — graceful drain then forced cleanup
- AMP CLI -x flag regression — double-dash separator in build_args() caused AMP to interpret -x as a subcommand name instead of a flag, breaking execute mode for all prompts #245
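The render_markdown entity overflow fix (#59) depends on measuring text in UTF-16 code units, which is how Telegram counts entity offsets. A rough sketch of the clamping idea, with entities modelled as plain dicts; clamp_entities is hypothetical, and dropping entities that start past the end is an assumption.

```python
def utf16_len(text: str) -> int:
    """Length of text in UTF-16 code units (astral characters count as two)."""
    return len(text.encode("utf-16-le")) // 2


def clamp_entities(text: str, entities):
    """Trim entity ranges so none extends past the end of text."""
    limit = utf16_len(text)
    clamped = []
    for ent in entities:
        offset = ent["offset"]
        if offset >= limit:
            continue  # entity starts beyond the text, nothing left to mark
        length = min(ent["length"], limit - offset)
        clamped.append({**ent, "offset": offset, "length": length})
    return clamped
```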
docs
- update integration test chat IDs from stale ut-dev: to current ut-dev-hf: chats #238
- investigation: orphaned workerd processes from Bash tool children are upstream Claude Code bug — Untether’s process group cleanup is correct; Claude Code spawns Bash tool shells in their own session group which Untether cannot reach; no TTY/SIGHUP cascade in headless mode #257
changes
- logging audit: fill gaps in structlog coverage — elevate settings loader failures from DEBUG to WARNING (footer, watchdog, auto-continue, preamble), add access control drop logging, add executor handle.engine_resolved info log, elevate outline cleanup failures to WARNING, add credential redaction for OpenAI/GitHub API keys, add file transfer success logging, bind session_id in structlog context vars, add media group/cost tracker/cancel debug logging #254
- CI: expand ruff lint rules from 7 to 18 — add ASYNC, LOG, I (isort), PT, RET, RUF (full), FURB, PIE, FLY, FA, ISC rule sets; auto-fix 42 import sorts, clean 73 stale noqa directives, fix unused vars and useless conditionals; per-file ignores for test-specific patterns #255
- Gemini: default to --approval-mode yolo (full access) when no override is set — headless mode has no interactive approval path, so the CLI’s read-only default disabled write tools entirely, causing multi-minute stalls as Gemini cascaded through sub-agents #244, #248
- expand error hints coverage — add model not found, context length exceeded, authentication, content safety, CLI not installed, SSL/TLS, invalid request, disk/permission, AMP-specific auth, Gemini result status, and account suspension error categories #246
- /continue command — cross-environment resume; pick up the most recent CLI session from Telegram using each engine’s native continue flag (--continue, resume --last, --resume latest); supported for Claude, Codex, OpenCode, Pi, Gemini (not AMP) #135
  - ResumeToken extended with is_continue: bool = False
  - all 6 runners’ build_args() updated to handle continue tokens
  - /continue handled as reserved command in Telegram loop
  - new how-to guide: docs/how-to/cross-environment-resume.md
- /config UX overhaul — 2-column toggle pattern replaces all 3-button rows with single [✓ Feature: on] toggle + [Clear] for better mobile tap targets; merged Engine + Model into single page; max 2 buttons per row on home page; plan mode 2+1 split layout #132
- resume line toggle — per-chat show_resume_line override via /config settings; configurable via EngineOverrides #128
- cost budget settings — per-chat budget_enabled and budget_auto_cancel overrides on Cost & Usage page in /config #129
- model metadata improvements — shorten model display names in footer: claude-opus-4-6[1m] → opus 4.6 (1M), auto-gemini-3 → gemini-3; all engines populate model info from StartedEvent.meta #132
- resume line formatting — visual separation with blank line and ↩️ prefix in final message footer #127
- agent-initiated file delivery — agents write files to .untether-outbox/ during a run; Untether sends them as Telegram documents on completion with 📎 filename (size) captions; flat scan, deny-glob security, size limits, auto-cleanup #143
  - new module telegram/outbox_delivery.py with scan_outbox(), cleanup_outbox(), deliver_outbox_files()
  - ExecBridgeConfig gains send_file callback + outbox_config (transport-agnostic)
  - preamble updated with outbox instructions for all 6 engines
  - config: outbox_enabled, outbox_dir, outbox_max_files, outbox_cleanup in [transports.telegram.files]
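The outbox scan rules named above (flat scan, deny globs, size limits) together with the cases listed in the outbox tests (empty files, symlinks, subdirectories) suggest logic along these lines. scan_outbox_sketch, its parameters, and its default limits are assumptions for illustration, not the real scan_outbox().

```python
from fnmatch import fnmatch
from pathlib import Path


def scan_outbox_sketch(outbox, deny_globs=(), max_files=10, max_bytes=50 * 1024 * 1024):
    """Return deliverable files directly inside the outbox, sorted by name."""
    root = Path(outbox)
    if not root.is_dir():
        return []
    picked = []
    for path in sorted(root.iterdir()):  # flat scan: no recursion into subdirs
        if path.is_symlink() or not path.is_file():
            continue
        size = path.stat().st_size
        if size == 0 or size > max_bytes:
            continue
        if any(fnmatch(path.name, pattern) for pattern in deny_globs):
            continue
        picked.append(path)
        if len(picked) >= max_files:
            break
    return picked
```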
- orphan progress message cleanup on restart — active progress messages are persisted to active_progress.json; on startup, orphan messages from a prior instance are edited to show “⚠️ interrupted by restart” with no keyboard #149
  - new module telegram/progress_persistence.py with register_progress(), unregister_progress(), load_active_progress(), clear_all_progress()
  - runner_bridge.py registers on progress send, unregisters on ephemeral cleanup
  - telegram/loop.py cleans up orphans before sending startup message
- expand pre-run permission policies for Codex CLI and Gemini CLI in /config #131
  - Codex: new “Approval policy” page — full auto (default) or safe (--ask-for-approval untrusted)
  - Gemini: expanded approval mode from 2 to 3 tiers — read-only, edit files (--approval-mode auto_edit), full access
  - both engines show “Agent controls” section on /config home page with engine-specific labels
- suppress stall Telegram notifications when CPU-active; heartbeat re-render keeps elapsed time counter ticking during extended thinking phases #121
- temporary debug logging for hold-open callback routing — will be removed after dogfooding confirms #118 is resolved
- auto-continue mitigation for Claude Code bug — when Claude Code exits after receiving tool results without processing them (bugs #34142, #30333), Untether detects via last_event_type=user and auto-resumes the session #167
  - AutoContinueSettings with enabled (default true) and max_retries (default 1) in [auto_continue] config section
  - detection based on protocol invariant: normal sessions always end with last_event_type=result
  - sends “⚠️ Auto-continuing — Claude stopped before processing tool results” notification before resuming
- emoji button labels and edit-in-place for outline approval — ExitPlanMode buttons now show ✅/❌/📋 emoji prefixes; post-outline “Approve Plan”/“Deny” edits the “Asked Claude Code to outline the plan” message in-place instead of creating a second message #186
- redesign startup message layout — version in parentheses, split engine info into “default engine” and “installed engines” lines, italic subheadings, renamed “projects” to “directories” (matching dir: footer label), added bug report link #187
- show token usage counts for non-Claude engines — completion footer now displays 💰 26.0k in / 71 out for Codex, OpenCode, Pi, Gemini, and Amp when token data is available #36
- include CLI versions in startup diagnostics — startup message now shows detected engine CLI versions for easier debugging of outdated or mismatched tools #38
tests
- 8 new outline UX tests: markdown rendering with entities, approval keyboard on last chunk, multi-chunk keyboard placement, ref tracking, deletion on approval transition, deletion on keyboard change, safety-net cleanup, no double-deletion #139, #140, #141
- 22 new outbox delivery tests: scan (empty, single, sorted, max_files, deny globs, size limit, empty file, symlink, subdir), cleanup (delete, keep unsent, already gone), delivery (send, cleanup, no-cleanup, empty, send failure), integration (after completion, disabled, error run) #143
- 4 new cross-chat ask isolation tests: pending ask scoped by channel, correct channel returned, flow scoped by channel, translate registers with channel_id #144
- 99 new /continue tests: 46 auto-router assertions (continue token handling, engine routing) + 53 build-args assertions (continue flags for all 6 engines) #135
- 195 /config tests covering home page, all sub-pages, toggle actions, callback routing, button layout, engine-aware visibility #132
- 7 new OpenCode error message tests: Error event with no prior text, process_error_events, stream_end_events, last_tool_error fallback on StepFinish, last_text takes priority over tool error, tool error status captures last_tool_error, stream_end_events fallback #146, #150
- 3 new Pi /continue tests: allow_id_promotion flag, session ID promotion from SessionHeader, normal resume no promotion #147
- 3 new timeout tests: default 30s timeout, getUpdates per-request timeout, sendMessage uses default #145
- 3 new discuss-approval skip_reply tests: approve and deny results set skip_reply=True, dispatch callback skip_reply sends without reply_to #148
- 8 new progress persistence tests: register/load roundtrip, unregister, missing file, corrupt file, non-dict, multiple entries, clear all, clear nonexistent #149
- 2 new dual-button tests: outline strips approval from progress, outline state resets on approval disappear #163
- hold-open outline flow: new tests for hold-open path, real request_id buttons, pending cleanup, approval routing #114
- stall suppression: tests for CPU-active auto-cancel, notification suppression when cpu_active=True, notification fires when cpu_active=False #114, #121
- cost footer: tests for suppression on error runs, display on success runs #120
- 10 new auto-continue tests: detection function (bug scenario, non-claude engine, cancelled session, normal result, no resume, max retries) + settings validation (defaults, bounds) #167
- 2 new stall sleeping-process tests: notification not suppressed when main process sleeping (state=S), stall message includes tool name #168
- 8 new _read_opencode_default_model tests: valid config, missing file, invalid JSON, empty model, no model key, build_runner fallback, untether config priority, no OC config #221
- engine command gate tests: /planmode Claude-only, /usage subscription-engine-only #215, #216
- export dedup test: duplicate started events deduplicated in markdown export #218
- Gemini --prompt= build_args test #219
- Gemini integration test stall diagnosed — root cause was missing --approval-mode yolo in test chat config; Gemini CLI defaults to read-only mode with write tools disabled; set full access via /config for ut-dev-hf: gemini test chat; U1 now passes in 56s (was 8–18 min stall) #244
- 10 new /new cancellation tests: _cancel_chat_tasks helper (None, empty, matching, other chats, already cancelled, multiple), chat /new with running task, cancel-only no sessions, no tasks no sessions, topic /new with running task #222
- 12 new auto-continue signal death tests: _is_signal_death (SIGTERM, SIGKILL, negative, normal, None), _should_auto_continue (rc=143, rc=137, rc=-9, rc=-15 blocked; rc=0, rc=None, rc=1 allowed), proc_returncode default on JsonlStreamState #222
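The signal-death cases listed in the auto-continue tests above can be sketched as follows. The function names mirror the ones mentioned in the tests, but the bodies and signatures are an illustration under assumptions, not the actual implementation.

```python
def is_signal_death(rc):
    """True when a return code indicates the process was killed by a signal."""
    if rc is None:
        return False
    if rc < 0:
        return True  # subprocess convention: -N means killed by signal N
    # Shell convention: 128 + signal number (SIGTERM=15 -> 143, SIGKILL=9 -> 137).
    return rc in (143, 137)


def should_auto_continue(last_event_type, rc):
    """Resume only for the 'stopped after tool results' pattern, never after a kill."""
    return last_event_type == "user" and not is_signal_death(rc)
```

Gating on the return code breaks the loop where earlyoom-killed sessions were respawned straight back into the same memory pressure.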
docs
- document OpenCode lack of auto-compaction as a known limitation — long sessions accumulate unbounded context with no automatic trimming; added to runner docs and integration testing playbook #150
v0.34.4 (2026-03-09)
fixes
- preamble hook awareness: add constraint to default preamble instructing Claude that if hooks fire at session end, the final response must still contain the user’s requested content — hook concerns are secondary and should be noted after main content, never instead of it #107
  - addresses content displacement when Claude Code plugin Stop hooks (e.g. PitchDocs context-guard) consume the final Telegram message with meta-commentary instead of user-requested content
- UNTETHER_SESSION env var: Claude runner now sets UNTETHER_SESSION=1 in subprocess environment, enabling Claude Code hooks to detect Untether sessions and adjust behaviour (e.g. PitchDocs context-guard skips blocking Stop hooks in Telegram) #107
docs
- audit: PitchDocs context-guard interference analysis — root cause (false positive from git status --porcelain on untracked hook infrastructure), cross-project comparison (BIP/Scout/Brand Copilot/littlebearapps.com), recommendations for both Untether and PitchDocs #107
v0.34.3 (2026-03-08)
fixes
- tool-aware stall threshold: 10-minute threshold (_STALL_THRESHOLD_TOOL = 600s) when a tool action is started but not completed, preventing false stall warnings during long-running Bash commands, Agent tasks, and TaskOutput waits #105
  - three-tier system: normal (5 min), running tool (10 min), pending approval (30 min)
  - _has_running_tool() checks most recent action state
  - stall threshold selection logged at info level with reason
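The three-tier selection above reduces to a small decision function. In this sketch the constant names and the precedence (pending approval checked before running tool) are assumptions; only the three durations come from the changelog.

```python
STALL_NORMAL_S = 5 * 60      # no tool running, no approval pending
STALL_TOOL_S = 10 * 60       # a tool action has started but not completed
STALL_APPROVAL_S = 30 * 60   # waiting on the user to approve or deny


def select_stall_threshold(*, has_running_tool: bool, pending_approval: bool):
    """Return (threshold_seconds, reason) so the choice can be logged."""
    if pending_approval:
        return STALL_APPROVAL_S, "pending_approval"
    if has_running_tool:
        return STALL_TOOL_S, "running_tool"
    return STALL_NORMAL_S, "normal"
```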
- progress message edit failure: log warning and fall back to sending a new message when the initial “queued” → “starting” edit fails, preventing stuck “queued” messages #103
- approval keyboard edit failure: use wait=True for keyboard transitions (approval buttons appearing), log keyboard attach at info level and edit failures at warning level for diagnostics #104
  - transport.edit.failed warning in TelegramTransport.edit() when wait=True edit returns None
  - progress_edits.keyboard_attach info log on keyboard transitions
  - progress_edits.keyboard_edit_failed warning when keyboard edit fails
  - transport errors upgraded from debug to warning level
- /usage 429 rate limit: downgrade from error to warning level, preventing untether-issue-watcher noise for transient rate limits #89
changes
- session cleanup structured reporting: _cleanup_session_registries() now logs cleaned registry names at info level for post-mortem analysis #93
  - session registration (claude_runner.registered, session_stdin.registered) upgraded to info level
- JSONL decode failure logged at warning level with truncated line content (first 200 chars)
- runner spawn now logs CLI args in runner.start event
- no-events session warning: session.summary.no_events logged when a non-cancelled session completes with zero events
tests
- new test coverage for tool-aware stall threshold, keyboard edit failure recovery, edit-fail fallback send, session cleanup tracking, stderr sanitisation #85, build args validation, loop coverage
v0.34.2 (2026-03-08)
fixes
- stall monitor loops forever after laptop sleep — no auto-cancel, /cancel requires reply #99
  - stall auto-cancel: dead process detection (immediate), no-PID zombie cap (3 warnings), absolute cap (10 warnings)
  - early PID threading: last_pid set at subprocess spawn, polled by run_runner_with_cancel before StartedEvent
  - standalone /cancel fallback: cancels single active run without requiring reply; prompts when multiple runs active
  - queued_for_chat() method on ThreadScheduler for standalone cancel of queued jobs
  - approval-aware stall threshold: 30 min when waiting for user approval (inline keyboard detected), 5 min otherwise
v0.34.1 (2026-03-07)
fixes
- session stall diagnostics: add /proc process diagnostics (CPU, RSS, TCP, FDs, children), progressive stall warnings, liveness watchdog, event timeline tracking, and session completion summary #97
  - new utils/proc_diag.py module: collect_proc_diag(), format_diag(), is_cpu_active()
  - JsonlStreamState tracks last_stdout_at, event_count, last_event_type, recent_events ring buffer, stderr_capture
  - PID auto-injected into StartedEvent.meta via base class (all engines)
  - progressive _stall_monitor: repeating warnings every 3 min with fresh /proc snapshots and Telegram notifications
  - liveness watchdog: detects alive-but-silent subprocesses after 10 min with diagnostics; optional auto-kill (off by default, triple safety gate)
  - session.summary structured log on every session completion
  - [watchdog] config section: liveness_timeout, stall_auto_kill, stall_repeat_seconds
- stream threading broken: _ResumeLineProxy hides current_stream from ProgressEdits, causing event_count=0 and last_event_type=None for all engines #98
  - add current_stream property to _ResumeLineProxy and _PreludeRunner
  - set self.current_stream = stream in Claude’s overridden run_impl
  - use stream.stderr_capture instead of separate stderr_lines in Claude’s run_impl
v0.34.0 (2026-03-07)
fixes
- ExitPlanMode stuck after cancel + resume: stale outline_guard not cleaned up #93
  - extract _cleanup_session_registries() helper, call from run_impl finally block
- stall monitor fails to detect stalls when no events arrive after session start; no Telegram notification #95
  - initialise _last_event_at from clock() instead of 0.0 so threshold works from session start
  - send ⏳ No progress for N min Telegram notification on stall detection (previously journal-only)
changes
- show token-only cost footer for Gemini and AMP — _format_run_cost() no longer requires total_cost_usd; renders 💰 26.0k in / 71 out when only token data is available #94
  - Gemini _build_usage(): extract cached → cache_read_tokens and duration_ms from StreamStats
  - AMP _accumulate_usage(): accumulate cache_creation_input_tokens and cache_read_input_tokens
- add Gemini CLI approval mode toggle in /config — “read-only” (default, write tools blocked) or “full access” (--approval-mode=yolo); tied into existing plan mode infrastructure via shared permission_mode field #90
  - home page shows “Approval mode” label and button when engine is Gemini
  - sub-page with Read-only/Full access toggle
  - PERMISSION_MODE_SUPPORTED_ENGINES constant for engine-aware gating
v0.33.5 (2026-03-07)
fixes
- downgrade control_response.failed ClosedResourceError from error to warning — race condition when Telegram callback arrives after session stdin closes; write_control_response() now returns bool and send_claude_control_response() propagates it #61
  - also downgrade auto_approve_failed and auto_deny_failed for consistency
- add subprocess watchdog — detects orphaned child processes (e.g. MCP servers) holding stdout pipes open after parent exits; kills process group after grace period #91
- add stall monitor — warns when no progress events arrive for 5 minutes; clears on recovery #92
- handle ClosedResourceError in iter_bytes_lines() on abrupt pipe close
v0.33.4 (2026-03-06)
fixes
- add render debouncing to batch rapid progress events — configurable `min_render_interval` (default 2.0s) prevents flooding Telegram edits #88
  - first render is never debounced; subsequent renders sleep for remaining interval
  - `group_chat_rps` now configurable in `[progress]` (default 20/60, matching Telegram limit)
- make approval notification sends non-blocking — `transport.send()` for push notifications runs in a background task instead of stalling the render loop #88
docs
- document `KillMode=process` → `KillMode=control-group` fix for systemd service files — orphaned MCP servers accumulate across restarts, consuming 10+ GB #88
v0.33.3 (2026-03-06)
fixes
- block ExitPlanMode after cooldown expires when no outline has been written — adds outline guard check before time-based cooldown #87
  - `_OUTLINE_PENDING` + `max_text_len_since_cooldown < 200` guard fires regardless of cooldown expiry
  - strengthened deny/escalation messages with consequence warnings and concrete framing
v0.33.2 (2026-03-06)
fixes
- warn at startup when `allowed_user_ids` is empty — any chat member can run commands without filtering #84
- sanitise subprocess stderr before exposing to Telegram — redact absolute file paths and URLs #85
- truncate prompts to 100 chars in INFO logs to reduce sensitive data exposure #86
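A rough sketch of what the #85 and #86 hardening might involve, assuming simple regex-based redaction. The patterns, the `[path]` / `[url]` placeholders, and the function names are illustrative assumptions, not the project's actual implementation.

```python
import re

# Assumed patterns: http(s) URLs and POSIX-style absolute paths.
_URL_RE = re.compile(r"https?://\S+")
_ABS_PATH_RE = re.compile(r"(?<![\w./])/(?:[\w.\-]+/)*[\w.\-]+")


def sanitise_stderr(text: str) -> str:
    """Redact URLs first, then absolute file paths."""
    text = _URL_RE.sub("[url]", text)
    return _ABS_PATH_RE.sub("[path]", text)


def truncate_prompt(prompt: str, limit: int = 100) -> str:
    """Keep at most `limit` characters of a prompt for log output."""
    if len(prompt) <= limit:
        return prompt
    return prompt[:limit] + "…"
```

URLs are replaced before paths so that the path pattern never sees the path component of a URL.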
v0.33.1 (2026-03-06)
fixes
- fall back to plain commonmark renderer when `linkify-it-py` is missing instead of crash-looping on startup #83
v0.33.0 (2026-03-06)
changes
- add effort control for Claude Code — `--effort` flag with low/medium/high levels via `/reasoning` and `/config` #80
- show model version numbers in footer — e.g. `opus 4.6` instead of `opus` #80
- show effort level in meta line between model and permission mode (e.g. `opus 4.6 · medium · plan`) #80
- rename all user-facing “Claude” to “Claude Code” for product clarity #81
  - error messages, button labels, config descriptions, notification text
  - engine IDs (`"claude"`) and model/subscription references unchanged
fixes
- signal error hints (SIGTERM/SIGKILL/SIGABRT) no longer hardcode `/claude` — now engine-agnostic #81
- config reasoning page showed bare “Claude” instead of “Claude Code” due to `.capitalize()` #81
- `/usage` HTTP errors now show descriptive messages (e.g. “Rate limited by Anthropic — too many requests”) instead of bare status codes #81
- `/usage` now handles ConnectError and TimeoutException with specific recovery guidance #81
- add error hints for “finished without a result event” and “finished but no session_id” — covers all 6 engines #81
docs
- update 27 documentation files with Claude Code naming
- update troubleshooting guide with new error hint categories (process/session errors)
- update inline settings guide — reasoning now shows Claude Code and Codex as supported
- update model-reasoning guide with Claude Code effort levels
tests
- add 8 new error hint tests (signal engine-agnostic, cross-engine process/session errors)
- update model version tests for `_short_model_name()` (e.g. `opus 4.6`)
- add effort/meta line tests for `format_meta_line()`
- update config command tests for Claude Code reasoning support
v0.32.1 (2026-03-06)
fixes
- missing `linkify-it-py` dependency crashes service on startup after 0.32.0 upgrade #79
  - `markdown-it-py` linkify feature requires optional `linkify-it-py` package
  - changed dependency to `markdown-it-py[linkify]` to include the extra
docs
- cross-platform process management instructions — platform tabs for restart/logs, contextualise systemd as Linux-specific
v0.32.0 (2026-03-06)
changes
- add Gemini CLI runner with `--approval-mode` passthrough for plan mode support #991
- add Amp CLI runner with mode selection and `--stream-json-input` support #988, #989
- add `/threads` command for Amp thread management #993
- track Amp subagent `parent_tool_use_id` in action detail #992
- redesign `/config` home page with grouped sections (Agent controls, Display, Routing), inline hints, and help links
- add version information footer to `/config` home page
- compact startup message — only show enabled features (topics, triggers), merge engine and default on one line
fixes
- Gemini CLI `-p` flag compatibility (changed from boolean to string argument) #75
- Amp CLI `-x` flag requires prompt as direct argument #76
- Amp CLI uses `--mode` not `--model` for model override #77
- Amp `/threads` table parsing — `threads list/search` don’t support `--json` #78
- standardise unrecognised-event debug logging across all engine runners
- add structured logging for cost budget alerts and exceeded events
- improve atomic JSON state write error handling and logging
- add timeout and generic exception handlers to voice transcription
- add structured logging for plugin load errors
- improve config cleanup error logging with error type details
docs
- update README engine compatibility table with Gemini CLI and Amp columns
- add `[gemini]` and `[amp]` configuration sections to config reference
- various doc formatting and link updates
tests
- add comprehensive tests for redesigned `/config` command (+199 lines)
- simplify startup message generation tests
- add cross-engine test coverage for Gemini and Amp runners
v0.31.0 (2026-03-05)
changes
- merge API cost and subscription usage into unified “Cost & usage” config page #67
- make `/auth` codex-only, move auth status to `/stats auth` #68
- add docs link to `/config` home page #69
fixes
- widen device code regex for real codex output format #40
- improve `/auth` info message wording #70
- put Cost & usage and Trigger on same row in `/config` #71
- 5 optimisations from 4-engine test sweep #72
docs
- add triggers/webhooks/cron architecture and how-to documentation
- expand trigger mode and group chat documentation
v0.30.0 (2026-03-04)
changes
- add `/stats` command — persistent per-engine session statistics (runs, actions, duration) with today/week/all periods #41
  - `SessionStatsStore` with JSON persistence in config dir
  - auto-prune data older than 90 days
  - recording hook in `runner_bridge.py` on run completion
- add `/auth` command — headless engine re-authentication via Telegram #40
  - runs `codex login --device-auth` and sends verification URL + device code
  - `/auth status` checks CLI availability
  - concurrent guard and 16-minute timeout
- add API cost and subscription usage toggles to `/config` menu
  - per-chat persistent settings for `show_api_cost` and `show_subscription_usage`
fixes
- diff preview on approval buttons was dead code — Edit/Write/Bash were always auto-approved before reaching the diff preview path #52
  - when `diff_preview` is enabled, previewable tools now route through interactive approval
  - default behaviour (diff_preview off) unchanged
tests
- 16 new diff preview gate tests (parametrised across tools and settings)
- 18 new session stats storage tests (record, aggregate, persist, prune, corrupt file)
- 13 new stats command tests (formatting, duration, handle with args)
- 13 new auth command tests (ANSI stripping, device code parsing, concurrent guard, status)
v0.29.0 (2026-03-03)
changes
- add diff preview toggle to `/config` menu — per-chat persistent setting to enable/disable diff previews in tool approval messages #58
  - Claude-only; default is on (matches existing behaviour)
  - stored in `EngineOverrides`, gated via `EngineRunOptions` ContextVar
  - home page layout: new “Diff preview” button alongside Verbose
fixes
- remove redundant local import of `get_run_options` in `claude.py` that shadowed the module-level import
tests
- 25 new tests: diff preview config page (18), gating logic (4), engine override merge (2), toast labels (3)
- updated home button test to assert `config:dp` presence for Claude
v0.28.1 (2026-03-03)
changes
- add 20 new API/LLM error hints for graceful failure during provider outages #54
  - subscription limits: Claude “out of extra usage” / “hit your limit” — tells user session is saved, wait for reset
  - billing errors: OpenAI `insufficient_quota`, `billing_hard_limit_reached`; Google `resource_exhausted`
  - API overload: Anthropic `overloaded_error` (529), generic “server is overloaded”
  - server errors: 500 `internal_server_error`, 502 `bad gateway`, 503 `service unavailable`, 504 `gateway timeout`
  - rate limits: `too many requests` (extends existing `rate limit` pattern)
  - network: `connecttimeout`, DNS failure, network unreachable
  - auth: `openai_api_key`, `google_api_key` (extends existing `anthropic_api_key`)
fixes
- deduplicate error messages when answer and error share the same first line (e.g. Claude subscription limits showed “You’re out of extra usage” twice) #55
- remove Approve/Deny buttons from AskUserQuestion option keyboards — only option buttons and “Other (type reply)” shown #56
- push notification for AskUserQuestion now says “Question from Claude” instead of “Action required — approval needed” #57
tests
- 19 new tests for API error hint patterns: subscription limits, billing, overload, server errors, network, ordering
- 2 new tests for error/answer deduplication in runner_bridge #55
- negative assertions for Approve/Deny absence in option button test #56
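The deduplication described in #55 might be approximated as below. This is a sketch under assumptions: `merge_answer_and_error` is a hypothetical helper, and keeping the error text when the first lines match is a guess at the behaviour rather than the documented rule.

```python
def _first_line(text: str) -> str:
    lines = text.strip().splitlines()
    return lines[0].strip() if lines else ""


def merge_answer_and_error(answer: str, error: str) -> str:
    """Combine answer and error text without repeating a shared first line."""
    if answer and error and _first_line(answer) == _first_line(error):
        # Same headline in both: show it once (assumed: keep the error text).
        return error
    return "\n\n".join(part for part in (answer, error) if part)
```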
v0.28.0 (2026-03-02)
changes
- interactive ask mode — AskUserQuestion renders option buttons in Telegram, sequential multi-question flows (1 of N), “Other (type reply)” fallback, and structured `updatedInput` responses #51
  - `/config` toggle: “Ask mode” sub-page (Claude-only) to enable/disable interactive questions
  - dynamic preamble encourages or discourages AskUserQuestion based on toggle state
  - auto-deny when toggle is OFF — Claude proceeds with defaults instead of asking
- Gemini CLI and Amp engine runners added (coming soon — not yet released for production use)
fixes
- synthetic Approve Plan button now returns an error when session has already ended, instead of silently succeeding #50
  - session-alive check in `da:` button handler (`claude_control.py`)
  - stale `_REQUEST_TO_SESSION` entries cleaned up during session end
- ReadTimeout in usage footer no longer kills final message delivery — chat appeared frozen when Anthropic usage API was slow #53
tests
- 27 new tests for ask mode: option button rendering, multi-question flow management, structured answer responses, config toggle, auto-deny when OFF
- 4 new tests for synthetic approve after session ends (#50): dead approve, dead deny, active approve, session cleanup
docs
- updated inline-settings how-to, interactive-control tutorial, README, and CLAUDE.md for ask mode
- added ask mode to `/config` command description and features list
- Gemini CLI and Amp listed as “coming soon” in README engines table
v0.27.1 (2026-03-02)
fixes
- add ReadTimeout error hint for transient network timeouts #15
- resolve all ty type checker warnings (109 → 0)
docs
- fix PyPI logo rendering — use absolute raw GitHub URL so SVG displays on PyPI
- add Upgrading section to README with uv/pipx upgrade + restart commands
- point project URLs to GitHub for PyPI verified details
v0.27.0 (2026-03-01)
fixes
- per-chat outbox pacing — progress edits to different chats no longer serialise through a single global timer; each chat tracks its own rate-limit window independently #48
  - `_next_at[chat_id]` dict replaces scalar `next_at`
  - new `_pick_ready(now)` selects from unblocked chats; `retry_at` stays global (429)
  - 7 group chats now update in parallel (~0s total) vs old 7 × 3s = 21s delay
changes
- `/config` model sub-page — view current model override and clear it; button always visible on home page #47
- `/config` reasoning sub-page — select reasoning level (minimal/low/medium/high/xhigh) via buttons; only visible when engine supports reasoning (Codex) #47
tests
- 7 per-chat pacing tests: independent chats, private vs group intervals, global retry_at, cross-chat priority, same-chat pacing, 7 concurrent chats, chat_id=None independence
- 54 model + reasoning /config tests: sub-page rendering, toggle actions, engine-aware visibility, toast mappings, override persistence, cross-field preservation
v0.26.0 (2026-03-01)
changes
- `/config` inline settings menu — BotFather-style inline keyboard for toggling plan mode, verbose, engine, and trigger; edits message in-place #47
  - confirmation toasts on toggle actions (e.g. “Plan mode: off”)
  - auto-return to home page after setting changes
  - engine-aware plan mode — hidden for non-Claude engines
docs
- comprehensive tutorials and how-to guides — 15 new/expanded guides covering daily use, interactive control, messaging, cost management, security, and operations
- inline settings how-to (`docs/how-to/inline-settings.md`)
tests
- add 62-test suite for `/config` (toast permutations, engine-aware visibility, auto-return, callback dispatch)
v0.25.3 (2026-03-01)
fixes
- increase SIGTERM→SIGKILL grace period from 2s to 10s — gives engines time to flush session transcripts before forced kill #45
- add `error_during_execution` error hint — users see actionable recovery guidance when a session fails to load #45
- auto-clear broken session on failed resume — when a resumed run fails with 0 turns, the saved token is automatically cleared so the next message starts fresh #45
  - new `clear_engine_session()` on `ChatSessionStore` and `TopicStateStore`
  - `on_resume_failed` callback threaded through `handle_message` → `_run_engine` → `wrap_on_resume_failed`
tests
- add `ErrorReturn` step type to `ScriptRunner` mock for simulating engine failures
- add 4 auto-clear unit tests (zero-turn error, success, partial turns, new session)
- add SIGTERM→SIGKILL 10s timeout assertion test
- add 2 `error_during_execution` hint tests (resumed and new session variants)
- integration-tested across Claude, Codex, and OpenCode via untether-dev
v0.25.2 (2026-03-01)
fixes
- add actionable error hints for SIGTERM/SIGKILL/SIGABRT signals — users now see recovery guidance instead of raw exit codes #44
docs
- add `contrib/untether.service` example with `KillMode=process` and `TimeoutStopSec=150` for graceful shutdown #44
- update `docs/reference/dev-instance.md` with systemd configuration section and graceful upgrade path
- update `CLAUDE.md` with graceful upgrade comment
tests
- add 5 signal hint tests (SIGTERM, SIGKILL, SIGABRT, case insensitivity, no false positives)
v0.25.1 (2026-03-01)
changes
- default `message_overflow` changed from `"trim"` to `"split"` — long final responses now split across multiple Telegram messages instead of being truncated #42
v0.25.0 (2026-02-28)
changes
- `/verbose` command and `[progress]` config — per-chat verbose toggle shows tool details (file paths, commands, patterns) in progress messages; global verbosity and max_actions settings #25
- Pi context compaction events — render `AutoCompactionStart`/`AutoCompactionEnd` as progress actions with token counts #26
- `UNTETHER_CONFIG_PATH` env var — override config file location for multi-instance setups #27
- ExceptionGroup unwrapping, transport resilience, and debug logging improvements #30
fixes
- outline not visible in Pause & Outline Plan flow — outline was scrolled off by max_actions truncation and lost in final message #28
- footer double-spacing — sulguk trailing `\n\n` caused blank lines between footer items (context/meta/resume) #29
docs
- add dev instance quickref (`docs/reference/dev-instance.md`) documenting production vs dev separation
- add dev workflow rule (`.claude/rules/dev-workflow.md`) preventing accidental production restarts
- update CLAUDE.md and README with verbose mode, Pi compaction, and config path features
tests
- add test suites for verbose command, verbose progress formatting, config path env var, cooldown bypass, and Pi compaction (44 new tests)
v0.24.0 (2026-02-27)
changes
- agent context preamble — configurable `[preamble]` injects Telegram context into every runner prompt, informing agents they’re on Telegram and requesting structured end-of-task summaries; engine-agnostic (Claude, Codex, OpenCode, Pi) #21
- post-outline Approve/Deny buttons — after “Pause & Outline Plan”, Claude writes the outline then Approve/Deny buttons appear automatically in Telegram; no need to type “approved” #22
fixes
- improved discuss denial message for resumed sessions — explicitly tells Claude to rewrite the outline even if one exists in prior context #23
- discuss cooldown state cleaned up on session end — prevents stale cooldown leaking into resumed runs #23
docs
- update plan-mode how-to with post-outline approval flow
- update control-channel rule with new registries and discuss-approval mechanism
- update CLAUDE.md feature list with preamble and discuss buttons
- update site URL to `https://littlebearapps.com/tools/untether/`
v0.23.5 (2026-02-27)
changes
- enrich error reporting in Telegram messages and structlog across all engines #14
  - Claude errors now show session ID, resumed/new status, turn count, cost, and API duration
  - non-zero exit codes show signal name (e.g. `SIGTERM` for rc=-15) and captured stderr excerpt
  - stream-ended-without-result errors include session context
  - `runner.completed` structlog includes `num_turns`, `total_cost_usd`, `duration_api_ms`
- compact startup message formatting with hard breaks #14
docs
- comprehensive documentation audit and upgrade #13
  - add how-to guides: interactive approval, plan mode, cost budgets, webhooks & cron
  - expand schedule-tasks guide with cron and webhook trigger coverage
  - remove orphaned `docs/user-guide.md` redirect stub
  - fix stale version reference (0.19.0 → 0.23.4) in install tutorial and llms-full.txt
  - regenerate `llms.txt` and `llms-full.txt` with 18 previously missing doc pages
  - add AI IDE context files: `AGENTS.md`, `.cursorrules`, `.github/copilot-instructions.md`
  - update `.codex/AGENTS.md` with correct project commands
  - add `ROADMAP.md` with near/mid/future directional plans
  - update README documentation section with new guide links
  - update `zensical.toml` nav with new how-to guides
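The signal-name mapping mentioned for non-zero exit codes in #14 (e.g. `SIGTERM` for rc=-15) relies on the POSIX convention that a negative subprocess return code `-N` means the child was terminated by signal `N`. A minimal sketch using the standard `signal` module follows; `describe_exit` is a hypothetical helper name and the output wording is illustrative.

```python
import signal


def describe_exit(rc: int) -> str:
    """Turn a subprocess return code into a short human-readable label."""
    if rc < 0:
        try:
            # Negative return codes encode the terminating signal number.
            return f"{signal.Signals(-rc).name} (rc={rc})"
        except ValueError:
            return f"unknown signal (rc={rc})"
    return f"exit code {rc}"
```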
v0.23.4 (2026-02-26)
fixes
- fix `test_doctor_voice_checks` env var leak from pydantic_settings #12
  - `UntetherSettings.model_validate()` auto-loads `UNTETHER__*` env vars, causing `voice_transcription_api_key` to leak into test
  - added `monkeypatch.delenv()` for the pydantic_settings env var before constructing test settings
docs
- add macOS Keychain credential info to install tutorial, troubleshooting guide, and command reference #7
v0.23.3 (2026-02-26)
fixes
- add `rate_limit_event` to Claude stream-json schema (CLI v2.1.45+) #8
  - new `StreamRateLimitMessage` and `RateLimitInfo` msgspec structs
  - event is decoded cleanly and silently skipped (informational only)
  - eliminates noisy `jsonl.msgspec.invalid` warning in logs
v0.23.2 (2026-02-26)
fixes
- fix crash when Claude OAuth credentials file missing (macOS Keychain, API key auth) #7
  - `_maybe_append_usage_footer()` now catches `FileNotFoundError` and `httpx.HTTPStatusError`
  - post-run messages are delivered to Telegram even when usage data is unavailable
- add macOS Keychain support for `/usage` command and subscription usage footer #7
  - on macOS, Claude Code stores OAuth credentials in the Keychain, not on disk
  - `_read_access_token()` now tries the file first, then falls back to macOS Keychain
v0.23.1 (2026-02-26)
changes
- restructure startup message: one field per line, always show all status fields
  - list project names instead of count
  - always show mode, topics, triggers, resume lines, voice, and files status
  - add voice and files enabled/disabled status
- update PyPI description and keywords to reflect current feature set
v0.23.0 (2026-02-26)
changes
- refresh startup message: dog emoji, version number, conditional diagnostics, project count
  - only shows mode/topics/triggers/engines lines when they carry signal
  - removes `resume lines:` field (config detail, not actionable)
- add model + permission mode footer on final messages (`🏷 sonnet · plan`)
  - all 4 engines (Claude, Codex, OpenCode, Pi) populate `StartedEvent.meta` with model info
  - Claude also includes `permissionMode` from `system.init`
  - Codex/OpenCode use runner config since their JSONL streams don’t include model metadata
- route telegram callback queries to command backends #116
  - callback data format: `command_id:args...` routes to registered command plugins
  - extracts `message_thread_id` from callback for proper topic context
  - enables plugins to build interactive UX with inline keyboards
v0.22.2 (2026-02-25)
fixes
- remove defunct Telegram notification scripts that caused CI/release workflows to report failure #9
- skip `uuid.uuid7` test on Python < 3.14 (only available in 3.14+) #10
- fix PyPI metadata: PEP 639 SPDX license, absolute doc links, remove deprecated classifier #11
v0.22.1 (2026-02-10)
fixes
- preserve ordered list numbering when nested list indentation is malformed in telegram render output #202
v0.22.0 (2026-02-10)
changes
- support Codex `phase` values and unknown action kinds in commentary rendering #201
v0.21.5 (2026-02-08)
fixes
- dedupe redelivered telegram updates to prevent duplicate runs in DMs #198
changes
- read package version from metadata instead of a hardcoded `__version__` constant
docs
- rotate telegram invite link
v0.21.4 (2026-01-22)
changes
- add allowed user gate to telegram #179
v0.21.3 (2026-01-21)
fixes
- ignore implicit topic root replies in telegram #175
v0.21.2 (2026-01-20)
fixes
- clear chat sessions on cwd change #172
docs
- add untether-slack plugin to reference #168
v0.21.1 (2026-01-18)
fixes
docs
- align engine terminology in telegram and docs #162
- add untether-discord plugin to plugins reference #164
v0.21.0 (2026-01-16)
changes
- add `untether config` subcommand #153
- make telegram /ctx work everywhere #159
- improve telegram command planning and testability #158
- simplify telegram loop and jsonl runner #155
- refactor telegram schemas and parsing with msgspec #156
tests
- improve coverage and raise threshold to 80% #154
- stabilize mutmut runs and extend telegram coverage #157
docs
- add opengraph meta fallbacks #150
v0.20.0 (2026-01-15)
changes
- add telegram mentions-only trigger mode #142
- add telegram /model and /reasoning overrides #147
- coalesce forwarded telegram messages #146
- export plugin utilities for transport development #137
fixes
- handle forwarded uploads for telegram #149
- preserve directives for voice transcripts #141
- resolve claude.cmd via shutil.which on windows #124
docs
- add untether-scripts plugin to plugins list #140
v0.19.0 (2026-01-15)
changes
- overhaul onboarding with persona-based setup flows #132
- add queued cancel placeholder for Telegram runs #136
- prefix Telegram voice transcriptions for agent awareness #135
docs
- refresh onboarding docs with new widgets and hero flow #138
- fix docs site mobile layout and font consistency #139
- link to untether.dev docs site
v0.18.0 (2026-01-13)
changes
- add per-chat and per-topic default agent via `/agent set` command #109
- add session resume shorthand for pi runner #113
- expose `sender_id` and `raw` fields on `MessageRef` for plugins #112
fixes
- recreate stale topic bindings when topic is deleted and recreated #127
- use stdout session header for pi runner #126
docs
v0.17.1 (2026-01-12)
fixes
- fix telegram /new command crash #106
- track telegram sessions for plugin runs #107
- align telegram prompt upload resume flow #105
v0.17.0 (2026-01-12)
changes
- add chat session mode (`session_mode = "chat"`) for auto-resume per chat without replying, reset with `/new` #102
- add `message_overflow = "split"` to send long responses as multiple messages instead of trimming #101
- add `show_resume_line` option to hide resume lines when auto-resume is available #100
- add `auto_put_mode = "prompt"` to start a run with the caption after uploading a file #97
- expose `thread_id` to plugins via run context #99
- use tomli-w for config serialization #103
- add `voice_transcription_model` setting for local whisper servers #98
docs
- document chat sessions, message overflow, and voice transcription model settings
v0.16.0 (2026-01-12)
fixes
- harden telegram file transfer handling #84
changes
docs
- add tips section to user guide
- rework readme
v0.15.0 (2026-01-11)
changes
- add telegram file transfer support #83
docs
- document telegram file transfers #83
v0.14.1 (2026-01-10)
changes
- add topic scope and thread-aware replies for telegram topics #81
docs
- update telegram topics docs and user guide for topic scoping #81
v0.14.0 (2026-01-10)
changes
- add telegram forum topics support with `/topic` command for binding threads to projects/branches, persistent resume tokens per topic, and `/ctx` for inspecting or updating bindings #80
- add inline cancel button to progress messages #79
- add config hot-reload via watchfiles #78
docs
- add user guide and telegram topics documentation #80
v0.13.0 (2026-01-09)
changes
- add per-project chat routing #76
fixes
docs
- normalize casing in the readme and changelog
v0.12.0 (2026-01-09)
changes
- add optional telegram voice note transcription (routes transcript like typed text) #74
fixes
- fix plugin allowlist matching and windows session paths #72
docs
- document telegram voice transcription settings #74
v0.11.0 (2026-01-08)
changes
- add entrypoint-based plugins for engines/transports plus a `untether plugins` command and public API docs #71
fixes
v0.10.0 (2026-01-08)
changes
- add transport registry with `--transport` overrides and a `untether transports` command #69
- migrate config loading to pydantic-settings and move telegram credentials under `[transports.telegram]` #65
- include project aliases in the telegram slash-command menu with validation and limits #67
fixes
- validate worktree roots instead of treating nested paths as worktrees #63
- harden onboarding with clearer config errors, safe backups, and refreshed command menu wording #70
docs
- add architecture and lifecycle diagrams
- call out the default worktrees directory #64
- document the transport registry and onboarding changes #69
v0.9.0 (2026-01-07)
projects and worktrees
- register repos with `untether init <alias>` and target them via `/project` directives
- route runs to git worktrees with `@branch` — untether resolves or creates worktrees automatically
- replies preserve context via `ctx: project @branch` footers, no need to repeat directives
- set `default_project` to skip the `/project` prefix entirely
- per-project `default_engine` and `worktree_base` configuration
changes
- transport/presenter protocols plus transport-agnostic `exec_bridge`
- move telegram polling + wiring into `untether.telegram` with transport/presenter adapters
- list configured projects in the startup banner
fixes
- render `ctx:` footer lines consistently (backticked + hard breaks) and include them in final messages
breaking
- remove `untether.bridge`; use `untether.runner_bridge` and `untether.telegram` instead
docs
- add a projects/worktrees guide and document `untether init` behavior in the readme
v0.8.0 (2026-01-05)
changes
- queue telegram requests with rate limits and retry-after backoff #54
docs
- improve documentation coverage #52
- align runner guide with factory pattern
- add missing pr links in the changelog
v0.7.0 (2026-01-04)
changes
- migrate logging to structlog with structured pipelines and redaction #46
- add msgspec schemas for jsonl decoding across runners #37
v0.6.0 (2026-01-03)
changes
- interactive onboarding: run `untether` to set up bot token, chat id, and default engine via guided prompts #39
- lockfile to prevent multiple untether instances from racing the same bot token #30
- re-run onboarding anytime with `untether --onboard`
v0.5.3 (2026-01-02)
changes
- default claude allowed tools to `["Bash", "Read", "Edit", "Write"]` when not configured #29
v0.5.2 (2026-01-02)
changes
- show not installed agents in the startup banner (while hiding them from slash commands)
fixes
- treat codex reconnect notices as non-fatal progress updates instead of errors #27
- avoid crashes when codex tool/file-change events omit error fields #27
v0.5.1 (2026-01-02)
changes
- relax telegram ACL to check chat id only, enabling use in group chats and channels #26
- improve onboarding documentation and add tests #25
v0.5.0 (2026-01-02)
changes
- add an opencode runner via the `opencode` cli with json event parsing and resume support #22
- add a pi agent runner via the `pi` cli with jsonl streaming and resume support #24
- document the opencode and pi runners, event mappings, and stream capture tips
fixes
- fix path relativization so progress output does not strip sibling directories #23
- reduce noisy debug logging from markdown_it/httpcore
v0.4.0 (2026-01-02)
changes
- add auto-router runner selection with configurable default engine #15
- make auto-router the default entrypoint; subcommands or `/{engine}` prefixes override for new threads
- add `/cancel` + `/{engine}` command menu sync on startup
- show engine name in progress and final message headers
- omit progress/action log lines from final output for cleaner answers #21
fixes
- improve codex exec error rendering with stderr extraction #18
- preserve markdown formatting and resume footer when trimming long responses #20
v0.3.0 (2026-01-01)
changes
- add a claude code runner via the `claude` cli with stream-json parsing and resume support #9
- auto-discover engine backends and generate cli subcommands from the registry #12
- add `BaseRunner` session locking plus a `JsonlSubprocessRunner` helper for jsonl subprocess engines
- add jsonl stream parsing and subprocess helpers for runners
- lazily allocate per-session locks and streamline backend setup/install metadata
- improve startup message formatting and markdown rendering
- add a debug onboarding helper for setup troubleshooting
breaking
- runner implementations must define explicit resume parsing/formatting (no implicit standard resume pattern)
fixes
- stop leaking a hidden `engine-id` cli option on engine subcommands
docs
- add a runner guide plus claude code docs (runner, events, stream-json cheatsheet)
- clarify the claude runner file layout and add guidance for jsonl-based runners
- document “minimal” runner mode: started+completed only, completed-only actions allowed
v0.2.0 (2025-12-31)
changes
- introduce runner protocol for multi-engine support #7
  - normalized event model (`started`, `action`, `completed`)
  - actions with stable ids, lifecycle phases, and structured details
  - engine-agnostic bridge and renderer
- add `/cancel` command with progress message targeting #4
- migrate async runtime from asyncio to anyio #6
- stream runner events via async iterators (natural backpressure)
- per-thread job queues with serialization for same-thread runs
- render resume as `codex resume <token>` command lines
- various rendering improvements including file edits
breaking
- require python 3.14+
- remove `--profile` flag; configure via `[codex].profile` only
fixes
- serialize new sessions once resume token is known
- preserve resume tokens in error renders #3
- preserve file-change paths in action events #2
- terminate codex process groups on cancel (posix)
- correct resume command matching in bridge
v0.1.0 (2025-12-29)
features
- telegram bot bridge for openai codex cli via `codex exec`
- stateless session resume via `codex resume <token>` lines
- real-time progress updates with ~2s throttling
- full markdown rendering with telegram entities (markdown-it-py + sulguk)
- per-session serialization to prevent race conditions
- interactive onboarding guide for first-time setup
- codex profile configuration
- automatic telegram token redaction in logs
- cli options: `--debug`, `--final-notify`, `--version`