Troubleshooting
Common issues and fixes for Untether. If your agent isn't responding, messages aren't arriving, or something looks off — start here.
Quick diagnostics
Before diving into specific issues, run these two commands:
untether --debug # start with debug logging → writes debug.log
untether doctor # preflight check: token, chat, topics, files, voice, engines
$ untether doctor
✓ bot token valid (@my_untether_bot)
✓ chat 123456789 reachable
✓ engine codex found at /usr/local/bin/codex
✓ engine claude found at /usr/local/bin/claude
✗ engine opencode not found
✓ voice transcription configured
✓ file transfer directory exists
Bot not responding
Symptoms: You send a message but the bot doesn’t reply at all.
- Check that Untether is running:
    - Terminal: Look at the terminal where you ran untether. Is it still running?
    - Linux (systemd): systemctl --user status untether
- Verify your bot token: untether doctor will flag an invalid token
- Check allowed_user_ids: if set, only listed users can interact. An empty list means everyone is allowed.
- In a group chat, check trigger mode: if set to mentions, you must @mention the bot
- Make sure you’re messaging the correct bot (not a different one)
Engine CLI not found
Symptoms: “codex: command not found” or similar error after sending a task.
The engine CLI isn’t on your PATH. Install the engine you need:
# Codex
npm install -g @openai/codex
# Claude Code
npm install -g @anthropic-ai/claude-code
# OpenCode
npm install -g opencode-ai@latest
# Pi
npm install -g @mariozechner/pi-coding-agent
# Gemini CLI
npm install -g @google/gemini-cli
# Amp
npm install -g @sourcegraph/amp
Verify with which codex (or which claude, etc.). If installed via npm -g but not found, check that npm’s global bin directory is in your PATH.
Run untether doctor to see which engines are detected.
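The PATH check described above can also be scripted. The following is a minimal sketch, not Untether's own detection logic: it assumes the executable names used in this guide (codex, claude, opencode, pi, gemini, amp) and relies only on Python's standard shutil.which. The function name find_engines is hypothetical.

```python
import shutil

# Executable names as used elsewhere in this guide (assumed to match your installs).
ENGINE_BINARIES = ["codex", "claude", "opencode", "pi", "gemini", "amp"]


def find_engines(names=ENGINE_BINARIES, which=shutil.which):
    """Map each engine name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_engines().items():
        status = f"found at {path}" if path else "not found on PATH"
        print(f"{name}: {status}")
```

An engine reported as not found here is one that which would also fail to locate, so the fix is the same: install it or add npm's global bin directory to PATH.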
Permission denied or auth errors
Symptoms: Engine starts but fails with authentication or permission errors.
- Codex: Run codex in a terminal and sign in with your ChatGPT account
- Claude Code: Run claude login to authenticate. On macOS, credentials are stored in Keychain; on Linux, in ~/.claude/.credentials.json
- OpenCode: Run opencode and authenticate with your chosen provider
- Pi: Run pi and log in with your provider
- Gemini CLI: Run gemini and authenticate with your Google account
- Amp: Run amp and sign in with your Sourcegraph account
Progress stuck on “starting”
Symptoms: The progress message shows “starting” but never updates.
- The engine might be doing a slow first-time setup (repo indexing, dependency install). Wait 30-60 seconds.
- If it persists, /cancel (reply to the progress message) and try a more specific prompt
- Check debug.log: the engine may have errored silently
- Verify the engine works standalone: run codex "hello" (or equivalent) directly in a terminal
Engine hangs in headless mode
Symptoms: The engine starts but produces no output, eventually triggering stall warnings. Common with Codex and OpenCode when the engine needs user input (approval or question) but has no terminal to display it.
Codex: approval hang
Codex may block waiting for terminal approval in headless mode if no --ask-for-approval flag is passed. Fix: upgrade to Untether v0.35.0+ which always passes --ask-for-approval never (or untrusted in safe permission mode). Older versions may not pass this flag, causing Codex to use its default terminal-based approval flow.
OpenCode: unsupported event warning
If OpenCode emits a JSONL event type that Untether doesn’t recognise (e.g. a question or permission event from a newer OpenCode version), Untether v0.35.0+ shows a visible warning in Telegram: “opencode emitted unsupported event: {type}”. In older versions, these events were silently dropped, leaving the user with no feedback until the stall watchdog fired.
If you see this warning, check for an Untether update that adds support for the new event type. OpenCode’s run command auto-denies questions via permission rules, so this should be rare — it most likely indicates an OpenCode protocol change.
Engine output line cap
Individual engine stdout lines are capped at 10 MB. If an engine emits a single JSONL line exceeding this limit (e.g. a very large base64 image in a tool result), the line is truncated and a warning is logged. This prevents unbounded memory growth from malformed engine output.
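To illustrate the idea of a per-line cap, here is a small sketch. It is not Untether's implementation: the function name cap_line is hypothetical, and interpreting "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
MAX_LINE_BYTES = 10 * 1024 * 1024  # "10 MB" read as MiB here; an assumption


def cap_line(line: bytes, limit: int = MAX_LINE_BYTES) -> tuple[bytes, bool]:
    """Return the line (truncated to `limit` bytes if needed) and a truncation flag."""
    if len(line) <= limit:
        return line, False
    return line[:limit], True
```

A truncated JSONL line will generally no longer parse as JSON, so a caller would typically log the warning and skip the event rather than try to decode it.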
Stall warnings
Symptoms: Telegram shows ”⏳ No progress for X min — session may be stuck” or ”⏳ MCP tool running: server-name (X min)”.
The stall watchdog monitors engine subprocesses for periods of inactivity (no JSONL events on stdout). Thresholds vary by context:
| Context | Threshold | Example |
|---|---|---|
| Normal (thinking/generation) | 5 min | Model is generating a response |
| Local tool running (Bash, Read, etc.) | 10 min | Long test suite or build |
| MCP tool running | 15 min | External API call (Cloudflare, GitHub, web search) |
| Pending user approval | 30 min | Waiting for Approve/Deny click |
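The table can be read as a simple lookup from context to threshold. The sketch below only restates those default values; the context keys and function names are hypothetical labels, not Untether configuration names.

```python
# Default thresholds from the table above, in minutes.
STALL_THRESHOLDS_MIN = {
    "normal": 5,             # thinking / generation
    "local_tool": 10,        # Bash, Read, etc.
    "mcp_tool": 15,          # external MCP call
    "pending_approval": 30,  # waiting for Approve/Deny
}


def stall_threshold_seconds(context: str) -> int:
    """Return the stall threshold for a context, in seconds."""
    return STALL_THRESHOLDS_MIN[context] * 60


def is_stalled(idle_seconds: float, context: str) -> bool:
    """True once the engine has been silent for at least the context's threshold."""
    return idle_seconds >= stall_threshold_seconds(context)
```

For example, seven minutes of silence counts as a stall during normal generation but not while a local tool is running.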
If the warning names an MCP tool (e.g. “MCP tool running: cloudflare-observability”), the process is likely waiting on a slow external API. This is usually not a real stall — wait for it to complete or /cancel if it’s taking too long.
If the warning says “MCP tool may be hung”, the MCP tool has been running with no new events for an extended period (3+ stall checks with a frozen event buffer). This usually means the MCP server is stuck in an internal retry loop. Use /cancel and retry with a more targeted prompt.
If the warning says “CPU active, no new events”, the process is using CPU but hasn’t produced any new JSONL events for 3+ stall checks. This can happen when Claude Code is stuck in a long API call, extended thinking, or an internal retry loop. Use /cancel if the silence persists.
If the warning says “Bash command still running (X min)”, Claude Code is waiting for a long-running tool subprocess (benchmark, build, test suite). This warning fires once when the tool exceeds the threshold (10 min by default). While the child process is actively consuming CPU, repeat warnings are suppressed — you won’t see the same message every 3 minutes. If the child process stops consuming CPU, warnings resume with “tool may be stuck”.
If the warning says “X tool may be stuck (N min, no CPU activity)”, the tool subprocess has stopped consuming CPU, suggesting it may be genuinely stuck (e.g. a hung curl, a network timeout, a deadlock). Use /cancel and resume, asking Claude to skip the hung command.
If the warning says “session may be stuck”, the process may genuinely be stalled. Check:
- Look at the diagnostics in the message — CPU active, TCP connections, RSS
- If CPU is active and TCP connections exist, the process is likely still working
- If CPU is idle and there are no TCP connections, the process may be truly stuck; use /cancel
Tuning: All thresholds are configurable via [watchdog] in untether.toml. Use tool_timeout to increase the initial threshold for local tools (default 10 min), and mcp_tool_timeout for MCP tools (default 15 min). See the config reference.
Claude Code hangs after an MCP tool_result
Symptoms: Claude Code goes silent immediately after an MCP tool returns — the tool_result arrives in the JSONL stream but the assistant never responds. Ring buffer fills with user/tool_result events and stays there. Often hits Cloudflare’s remote MCP servers via mcp-remote.
Root cause is upstream — claude-code#39700 combined with undici’s idle-body timeout in mcp-remote (geelen/mcp-remote#226) — but Untether ships an opt-in detector plus a tiered workaround (#322).
Enable in ~/.untether/untether.toml:
[watchdog]
detect_stuck_after_tool_result = true
On detection (default 5 min after tool_result arrives with no assistant follow-up), Untether logs progress_edits.stuck_after_tool_result, SIGTERMs any mcp-remote / @modelcontextprotocol adapter children to force the SSE reader to error out, and finally cancels the run if the engine stays silent for another 60 seconds. Tune via stuck_after_tool_result_timeout and stuck_after_tool_result_recovery_delay. See the config reference.
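The tiered behaviour can be pictured as a function of how long the engine has been silent since the tool_result. This is a sketch of the timeline only, assuming both settings are expressed in seconds; the function name and the returned labels are hypothetical.

```python
def recovery_step(
    silent_seconds: float,
    timeout: float = 300.0,         # mirrors stuck_after_tool_result_timeout (default 5 min)
    recovery_delay: float = 60.0,   # mirrors stuck_after_tool_result_recovery_delay
) -> str:
    """Return which tier applies after `silent_seconds` with no assistant follow-up."""
    if silent_seconds < timeout:
        return "wait"
    if silent_seconds < timeout + recovery_delay:
        # Tier 1: terminate MCP adapter children so the stalled reader errors out.
        return "terminate_mcp_children"
    # Tier 2: the engine stayed silent through the recovery delay.
    return "cancel_run"
```

With the defaults, the run is left alone for the first five minutes, adapter children are terminated at that point, and the run is cancelled at six minutes if nothing has changed.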
Claude Code exits without finishing (auto-continue)
Symptoms: Claude Code exits after receiving tool results without processing them. You see “⚠️ Auto-continuing” in the chat, or the session ends prematurely with no final answer.
This is an upstream Claude Code bug (#34142, #30333). Untether detects it automatically and resumes the session.
How it works: Normal sessions end with last_event_type=result. When Claude Code exits with last_event_type=user (tool results sent but never processed), Untether sends a “⚠️ Auto-continuing” notification and resumes the session.
If auto-continue keeps firing:
- Check if the upstream bug is fixed in a newer Claude Code version: npm i -g @anthropic-ai/claude-code@latest
- Disable auto-continue if it causes issues: set enabled = false in [auto_continue]
- Increase max retries if a single retry isn’t enough: set max_retries = 2 (max 5)
Auto-continue is suppressed for signal deaths (rc=143/SIGTERM, rc=137/SIGKILL) to prevent death spirals under memory pressure. See the config reference.
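Put together, the decision looks roughly like the sketch below. It is an illustration of the rules in this section, not Untether's code: the function name is hypothetical, and the default of one retry is inferred from the "single retry" wording above.

```python
SIGNAL_EXIT_CODES = {143, 137}  # SIGTERM and SIGKILL, as listed above


def should_auto_continue(
    last_event_type: str,
    returncode: int,
    retries_used: int,
    max_retries: int = 1,  # assumed default
    enabled: bool = True,
) -> bool:
    """Decide whether a finished Claude Code session should be resumed automatically."""
    if not enabled:
        return False
    if returncode in SIGNAL_EXIT_CODES:
        return False  # signal death: resuming could loop under memory pressure
    if retries_used >= max_retries:
        return False
    # "user" means tool results were sent but never processed.
    return last_event_type == "user"
```

A session that ends with last_event_type=result never triggers a resume, regardless of the other settings.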
Messages too long or truncated
Symptoms: The bot’s response is cut off or split across multiple messages.
Telegram messages have a 4096-character limit. Untether handles this automatically:
- Split mode (default): Long responses are split across multiple messages (~3500 chars each)
- Trim mode: Single message, truncated to fit
To change:
=== "untether config"

    ```sh
    untether config set transports.telegram.message_overflow "trim"
    ```

=== "toml"

    ```toml title="~/.untether/untether.toml"
    [transports.telegram]
    message_overflow = "trim" # or "split" (default)
    ```
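To make the difference between the two modes concrete, here is a rough sketch. It is not Untether's splitter: the function names are hypothetical, lengths are counted in Python characters, and breaking on the last newline before the chunk boundary is an assumed heuristic.

```python
def split_message(text: str, chunk_size: int = 3500) -> list[str]:
    """Split text into chunks of at most chunk_size, preferring newline boundaries."""
    chunks = []
    while len(text) > chunk_size:
        cut = text.rfind("\n", 0, chunk_size)
        if cut <= 0:
            cut = chunk_size  # no usable newline: hard cut
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks


def trim_message(text: str, limit: int = 4096) -> str:
    """Keep a single message by cutting it at Telegram's length limit."""
    return text[:limit]
```

Split mode keeps all of the content at the cost of several messages; trim mode keeps the chat tidy but drops everything past the limit.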
Voice transcription not working
Symptoms: Sending a voice note doesn’t start a run, or you get a transcription error.
- Check that voice transcription is enabled:

        [transports.telegram]
        voice_transcription = true

- Make sure you have an OpenAI API key set (voice transcription uses the OpenAI transcription API by default)
- Check the voice note size: the default max is 10 MiB (voice_max_bytes)
- If using a custom transcription server, verify voice_transcription_base_url is reachable
Run untether doctor to validate voice configuration.
File transfer blocked
Symptoms: /file put or /file get fails, or dropped documents aren’t saved.
- Check that file transfer is enabled:

        [transports.telegram.files]
        enabled = true

- Check deny_globs: files matching these patterns are blocked (default: .git/**, .env, *.pem, .ssh/**)
- In group chats, file transfer requires admin or creator status (unless files.allowed_user_ids is set)
- Check that the uploads_dir path exists relative to the project root
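To check by hand whether a path would be caught by a deny glob, a sketch like the following can help. It uses Python's fnmatch semantics, which may differ from the matcher Untether actually uses; the function name is hypothetical and the default list is copied from this section.

```python
from fnmatch import fnmatchcase

DEFAULT_DENY_GLOBS = [".git/**", ".env", "*.pem", ".ssh/**"]


def is_denied(relative_path: str, deny_globs=DEFAULT_DENY_GLOBS) -> bool:
    """True if the project-relative path matches any deny pattern (fnmatch rules)."""
    return any(fnmatchcase(relative_path, pattern) for pattern in deny_globs)
```

Under fnmatch rules, `*` also matches `/`, so `*.pem` catches `certs/server.pem`, while `.env` only matches a file named exactly `.env` at the project root. Treat the result as a hint and confirm against the file_transfer.denied log entry.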
Topics not appearing
Symptoms: /topic doesn’t work, or topics aren’t binding to projects.
- Topics require a forum-enabled supergroup (not a private chat or regular group)
- The bot must be admin with “Manage Topics” permission
- Topics must be enabled in config:

        [transports.telegram.topics]
        enabled = true
        scope = "auto" # or "main", "projects", "all"

- Run untether doctor: it checks topic permissions
Webhook not receiving events
Symptoms: Webhooks are configured but never fire.
- Check that triggers are enabled: set enabled = true under [triggers]
- Verify the server is running: curl http://127.0.0.1:9876/health (adjust host/port)
- Port already in use? As of #320, a port conflict degrades gracefully: the rest of the bot (polling, commands, crons) stays up, but webhook delivery is disabled. Look for triggers.server.bind_failed in the log (journalctl --user -u untether | grep bind_failed); the entry includes the occupied port and a fix suggestion. Free the port or set [triggers.server] port = <N> in untether.toml.
- Check auth: if using HMAC, the sending service must sign requests with the same secret
- Check event_filter: if set, only matching event types are processed
- Check firewall rules if the webhook server is behind NAT
- Look at debug.log for incoming request logs
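When debugging HMAC auth, it helps to compute the signature independently and compare it with what the sender produces. The sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded digest; the algorithm, encoding, and header that Untether expects are not stated here, so check the configuration reference before relying on it. Both function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    return hmac.compare_digest(sign_body(secret, body), signature_hex)
```

A mismatch usually means the two sides use different secrets, or the sender signs a re-serialised payload rather than the exact bytes it transmits.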
Config change didn’t take effect
Symptoms: You edited untether.toml but the change doesn’t seem to apply.
- Check watch_config: Hot-reload requires watch_config = true in the top-level config. Without it, changes only apply on restart.
- Hot-reloadable settings apply immediately: voice_transcription, [files], allowed_user_ids, show_resume_line, trigger crons/webhooks/auth/timezones.
- Restart-only settings require /restart or systemctl restart: bot_token, chat_id, session_mode, topics.enabled, message_overflow, triggers.server.host/port. Editing one of these in a running bot triggers a Telegram 🔄 warning to every project chat plus any allowed_user_ids admin DM (#318) so you won’t silently keep running on the stale value.
- Check the log for config.reload.applied (success), config.reload.transport_config_changed restart_required=True (restart needed), or config.reload.restart_notify.sent (Telegram warning broadcast).
/at delay not firing
Symptoms: You scheduled /at 30m Check the build but the prompt never runs.
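- Pending /at delays are held in memory, so they are lost on restart. If Untether restarted after you scheduled, the delay was cancelled.
- Use /cancel to see how many pending delays exist. If it says “nothing running”, there are no pending delays.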
/cancelto see how many pending delays exist. If it says “nothing running”, there are no pending delays. - Minimum duration: 60 seconds. Maximum: 24 hours. Values outside this range are rejected.
- Per-chat cap: 20 pending delays. The 21st is rejected with an error message.
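The limits above can be sketched as a small validator. This is illustrative only: the accepted syntax (an integer followed by s, m, or h) is an assumption based on the /at 30m example, and the function names are hypothetical.

```python
import re

MIN_DELAY_S = 60                 # minimum duration: 60 seconds
MAX_DELAY_S = 24 * 60 * 60       # maximum duration: 24 hours
MAX_PENDING_PER_CHAT = 20        # per-chat cap
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}  # assumed unit suffixes


def parse_delay(spec: str) -> int:
    """Convert a spec such as '30m' to seconds, enforcing the documented range."""
    match = re.fullmatch(r"(\d+)([smh])", spec.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {spec!r}")
    seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if not MIN_DELAY_S <= seconds <= MAX_DELAY_S:
        raise ValueError(f"duration out of range: {seconds}s")
    return seconds


def can_schedule(pending_count: int) -> bool:
    """True while the chat is below the pending-delay cap."""
    return pending_count < MAX_PENDING_PER_CHAT
```

Under these rules /at 30m is accepted as 1800 seconds, while 30s and 25h are rejected.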
Session not resuming
Symptoms: Sending a follow-up message starts a new session instead of continuing.
- Chat mode (session_mode = "chat"): Just send another message and it auto-resumes. Use /new to start fresh.
- Stateless mode (session_mode = "stateless"): You must reply to a message that contains a resume token. Plain messages start new sessions.
- If resume fails silently, the previous session may have been corrupted. Untether auto-clears broken resume tokens (0-turn sessions).
Claude Code plugin interference
Symptoms: Agent completes successfully but the response is about “hooks”, “context docs”, or “false positive” instead of the content you actually asked for. The run shows done with a short answer that doesn’t match your request.
This happens when Claude Code plugins with Stop hooks consume the final response. In a terminal, the user can scroll up to see earlier output. In Telegram, only the final message is visible — so if a Stop hook causes Claude to address hook concerns in its last turn, the actual content is replaced.
Affected plugins: Any Claude Code plugin that uses "decision": "block" in a Stop hook. The most common example is PitchDocs context-guard, which nudges Claude to update AI context docs when structural files change.
Fix:
- Update the plugin: PitchDocs v1.20+ checks for $UNTETHER_SESSION and automatically skips blocking Stop hooks in Telegram sessions. Run /pitchdocs:context-guard install in your project to update the hooks.
- Verify UNTETHER_SESSION is set: Untether v0.34.4+ sets UNTETHER_SESSION=1 in the Claude runner subprocess environment. If you’re on an older version, upgrade: pipx upgrade untether
- For custom plugins, add this to your Stop hook script:

        [ -n "${UNTETHER_SESSION:-}" ] && echo '{}' && exit 0
This is not a security concern — UNTETHER_SESSION is a simple signal variable that tells plugins the session is running via Telegram. See the interference audit for a detailed case study.
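If your Stop hook is written in Python rather than shell, the same guard might look like the sketch below. The helper name untether_bypass is hypothetical; the behaviour mirrors the shell one-liner: when UNTETHER_SESSION is non-empty, print an empty JSON object and exit successfully.

```python
import json
import os
import sys


def untether_bypass(environ=os.environ):
    """Return the JSON to print when running under Untether, otherwise None."""
    if environ.get("UNTETHER_SESSION"):
        return json.dumps({})
    return None


if __name__ == "__main__":
    bypass = untether_bypass()
    if bypass is not None:
        print(bypass)
        sys.exit(0)
    # The hook's normal (possibly blocking) logic would continue here.
```

Place the check at the very top of the hook so nothing earlier in the script can emit a blocking decision.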
Cost budget blocking runs
Symptoms: “Budget exceeded” message, or runs are cancelled mid-stream.
- Check your budget settings:

        [cost_budget]
        enabled = true
        max_cost_per_run = 2.00 # USD per run
        max_cost_per_day = 20.00 # USD per day
        auto_cancel = true # cancels runs exceeding per-run limit

- Daily budgets reset at midnight UTC
- To temporarily bypass: set enabled = false or increase the limits
- Check current spend with /usage
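How these settings interact can be sketched as a single decision function. This is an illustration under assumptions, not Untether's logic: the function name, the returned labels, the exact comparison operators, and the "warn" outcome when auto_cancel is off are all hypothetical.

```python
def budget_decision(
    run_cost: float,
    day_cost: float,
    max_cost_per_run: float = 2.00,
    max_cost_per_day: float = 20.00,
    enabled: bool = True,
    auto_cancel: bool = True,
) -> str:
    """Return 'allow', 'block', 'cancel', or 'warn' for the current spend (USD)."""
    if not enabled:
        return "allow"
    if day_cost >= max_cost_per_day:
        return "block"  # daily budget used up until the midnight UTC reset
    if run_cost > max_cost_per_run:
        return "cancel" if auto_cancel else "warn"  # "warn" is an assumed fallback
    return "allow"
```

If runs are being cancelled mid-stream, the per-run limit is the likely cause; a "Budget exceeded" message before a run starts points at the daily limit instead.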
Group chat: bot ignoring messages
Symptoms: Bot works in private chat but ignores messages in a group.
- Check trigger mode: groups default to mentions in many setups. Send /trigger to check, or /trigger all to respond to everything.
- Check bot privacy mode in BotFather: send /setprivacy to @BotFather and select your bot. Set to “Disable” so the bot can see all messages (not just commands and @mentions).
- Check allowed_user_ids: if set, group members not in the list are ignored.
- If using topics, make sure the bot has “Manage Topics” permission.
macOS and Linux credential differences
| Platform | Claude Code credentials | Path |
|---|---|---|
| Linux | Plain-text JSON file | ~/.claude/.credentials.json |
| macOS | macOS Keychain | Entry: Claude Code-credentials |
Untether checks both locations automatically. If you’ve recently changed platforms or reinstalled, run claude login to refresh credentials.
Using debug mode
Start Untether with --debug for full diagnostic logging:
untether --debug
This writes to debug.log in the current directory. The log includes:
- Engine JSONL events (every line the subprocess emits)
- Telegram API requests and responses
- Rendered message content
- Error tracebacks
Include debug.log when reporting issues on GitHub.
Using untether doctor
Run untether doctor for a comprehensive preflight check:
untether doctor
It validates:
- Telegram bot token (connects and verifies)
- Chat ID (reachable)
- Topics configuration (permissions, forum group status)
- File transfer settings (deny globs, permissions)
- Voice transcription configuration (API reachability)
- Engine CLI availability (on PATH)
$ untether doctor
✓ bot token valid (@my_untether_bot)
✓ chat 123456789 reachable
✓ engine codex found at /usr/local/bin/codex
✓ engine claude found at /usr/local/bin/claude
✓ engine opencode found at /usr/local/bin/opencode
✓ voice transcription configured
✓ file transfer directory exists
all checks passed
Checking logs
=== "Terminal (all platforms)"

    Untether logs to the terminal by default. For detailed logs:

    ```sh
    untether --debug # writes debug.log in current directory
    ```

=== "Linux (systemd)"

    ```sh
    journalctl --user -u untether -f # live logs
    journalctl --user -u untether -n 100 # last 100 lines
    journalctl --user -u untether -b # since last boot
    ```
Look for handle.worker_failed, handle.runner_failed, or config.read.toml_error entries.
Key log events
| Event | Level | Meaning |
|---|---|---|
| handle.worker_failed | ERROR | Engine run crashed |
| handle.runner_failed | ERROR | Runner subprocess failed |
| config.read.toml_error | ERROR | Config file couldn’t be parsed |
| footer_settings.load_failed | WARNING | Footer config fell back to defaults |
| watchdog_settings.load_failed | WARNING | Watchdog config fell back to defaults |
| auto_continue_settings.load_failed | WARNING | Auto-continue config fell back to defaults |
| preamble_settings.load_failed | WARNING | Preamble config fell back to defaults |
| outline_cleanup.delete_failed | WARNING | Stale plan outline message couldn’t be deleted |
| handle.engine_resolved | INFO | Engine and CWD successfully resolved for a run |
| file_transfer.saved | INFO | File uploaded and written to disk |
| file_transfer.denied | WARNING | File transfer blocked (permissions, deny glob) |
| message.dropped | DEBUG | Message from unrecognised chat silently dropped |
| cost_budget.exceeded | ERROR | Run or daily cost exceeded budget |
All logs include session_id once a session starts, enabling per-session filtering with grep or jq.
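As an alternative to grep or jq, per-session filtering can be done in a few lines of Python. This sketch assumes each log line is a JSON object with a session_id field; if your logs use a different layout (for example key=value pairs), adapt the parsing. The function name and the event field in the sample are hypothetical.

```python
import json


def filter_session(lines, session_id):
    """Return parsed log records whose session_id matches; skip non-JSON lines."""
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if isinstance(record, dict) and record.get("session_id") == session_id:
            matches.append(record)
    return matches
```

Typical use is to open debug.log, pass the file object as lines, and print the matching records in order.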
Telegram bot tokens, OpenAI API keys (sk-...), and GitHub tokens (ghp_, ghs_, github_pat_) are automatically redacted in all log output.
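If you paste logs from other sources alongside debug.log, a similar scrub can be applied by hand. The sketch below is not Untether's redactor: the regular expressions are approximations of the token shapes (the Telegram pattern in particular is an assumption), and the [REDACTED] placeholder and function name are hypothetical.

```python
import re

_SECRET_PATTERNS = [
    re.compile(r"\d{6,}:[A-Za-z0-9_-]{30,}"),                     # Telegram bot token shape (assumed)
    re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),                        # OpenAI-style key
    re.compile(r"\b(?:ghp_|ghs_|github_pat_)[A-Za-z0-9_]{8,}"),   # GitHub tokens
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything that looks like a known secret with a placeholder."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Pattern-based scrubbing can miss secrets in unfamiliar formats, so still skim a log before sharing it publicly.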
Error hints
When an engine fails, Untether scans the error message and shows an actionable recovery hint above the raw error. The raw error is wrapped in a code block for visual separation. Hints are case-insensitive and pattern-matched — the first match wins. Your session is automatically saved in most cases, so you can resume after resolving the issue.
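The matching rule (case-insensitive, first match wins) can be sketched as an ordered list of patterns. The three rules and their hint texts below are invented for illustration and are not Untether's actual patterns; the function name is also hypothetical.

```python
import re

# Ordered rules: earlier entries take priority. Patterns and hints are illustrative only.
HINT_RULES = [
    (re.compile(r"rate limit|too many requests", re.IGNORECASE), "Wait a moment, then retry."),
    (re.compile(r"command not found", re.IGNORECASE), "Install the engine CLI and check PATH."),
    (re.compile(r"api key", re.IGNORECASE), "Check that your API key is set and valid."),
]


def find_hint(error_message: str):
    """Return the hint for the first matching rule, or None if nothing matches."""
    for pattern, hint in HINT_RULES:
        if pattern.search(error_message):
            return hint
    return None
```

Because the first match wins, rule order matters: a message that mentions both a rate limit and an API key gets the rate-limit hint here.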
Untether recognises 67 error patterns across 14 categories:
| Category | Examples | Engines |
|---|---|---|
| Authentication | API key missing/invalid, token refresh, login required | All |
| Subscription & billing | Usage limits, quota exceeded, billing hard limit | Claude, Codex, OpenCode, Gemini |
| API overload & server | 500/502/503/504, overloaded | All |
| Rate limits | Rate limited, too many requests | All |
| Model errors | Model not found, invalid model | All |
| Context length | Context too long, max tokens exceeded | Claude, Codex, OpenCode |
| Content safety | Content filter, safety block, prompt blocked | Claude, Gemini |
| Invalid request | Malformed API request | Claude, Codex |
| Network & SSL | DNS, timeout, connection refused, certificate errors | All |
| CLI & filesystem | Command not found, disk full, permission denied | All |
| Signals | SIGTERM, SIGKILL, SIGABRT | All |
| Process & session | No result event, no session ID, execution errors | All |
| Engine-specific | AMP credits/login, Gemini result status | AMP, Gemini |
| Account & proxy | Account suspended, proxy auth, request timeout | All |
For the full list of patterns and hints, see the Error Reference.
Related
- Operations and monitoring: /ping, /restart, hot-reload
- Configuration reference: all config options
- Commands & directives: full command reference