diff --git a/CHANGES b/CHANGES index b2d54f4d..1eb4cad5 100644 --- a/CHANGES +++ b/CHANGES @@ -6,6 +6,12 @@ _Notes on upcoming releases will be added here_ +### What's new + +**Incremental pane observation with {tooliconl}`capture-since`** + +{tooliconl}`capture-since` gives agents a cursor-based way to observe a pane without re-reading the same terminal output on every turn. The first call returns the current visible screen and an opaque cursor; later calls return only rows written or rewritten after that cursor while tmux still retains the needed history. If scrollback was cleared or trimmed, the result sets `lines_missed=True`, returns a conservative current visible capture, and issues a fresh cursor. Malformed cursors, cross-pane replay, pane death, and pane respawn fail clearly instead of silently switching processes. (#60) + ## libtmux-mcp 0.1.0a9 (2026-05-24) libtmux-mcp 0.1.0a9 tightens pane polling correctness for agents waiting on terminal output. Search and wait tools now handle wrapped content, history-limit risk reporting, and pane lifecycle changes with clearer results instead of silent false positives. 
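Note on the changelog entry above: the cursor flow it describes can be sketched from the client side as follows. This is a minimal sketch, not the project's client code. `call_tool` is a hypothetical transport callable (any function that sends one MCP tool call and returns the structured result as a dict), and `poll_pane` is an illustrative name. Only the `pane_id` and `cursor` arguments and the `cursor` and `lines` result fields are taken from the diff.

```python
from typing import Any, Callable

# Hypothetical transport: sends one MCP tool call and returns the structured
# result as a dict. This alias is an assumption, not part of libtmux-mcp.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def poll_pane(call_tool: ToolCaller, pane_id: str, turns: int) -> list[str]:
    """Collect pane output over several turns, threading the cursor through."""
    # First call: there is no cursor yet, so the pane is named by id.
    result = call_tool("capture_since", {"pane_id": pane_id})
    seen = list(result["lines"])
    for _ in range(turns):
        # Follow-up calls: pass back the most recent cursor. Every response
        # carries a fresh cursor, so always use the latest one.
        result = call_tool("capture_since", {"cursor": result["cursor"]})
        seen.extend(result["lines"])
    return seen
```

A real agent would likely wait between turns and inspect `lines_missed` before treating the collected lines as contiguous; both are left out here to keep the cursor threading visible.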
diff --git a/README.md b/README.md index b6093784..077477ab 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,7 @@ Give your AI agent hands inside the terminal — create sessions, run commands, | **Server** | `list_sessions`, `create_session`, `kill_server`, `get_server_info` | | **Session** | `list_windows`, `get_session_info`, `create_window`, `rename_session`, `select_window`, `kill_session` | | **Window** | `list_panes`, `get_window_info`, `split_window`, `rename_window`, `select_layout`, `resize_window`, `move_window`, `kill_window` | -| **Pane** | `send_keys`, `paste_text`, `capture_pane`, `snapshot_pane`, `search_panes`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | +| **Pane** | `send_keys`, `paste_text`, `capture_pane`, `capture_since`, `snapshot_pane`, `search_panes`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | | **Options** | `show_option`, `set_option` | | **Environment** | `show_environment`, `set_environment` | @@ -99,6 +99,11 @@ returns content, cursor, copy-mode state, and scroll offset as one typed value. The alternative is several `tmux` invocations stitched together with regex. +**Observing.** [`capture_since`](https://libtmux-mcp.git-pull.com/tools/pane/capture-since/) +returns a cursor with the current pane content, then returns only +newly written or rewritten rows on follow-up calls. The alternative is +re-sending the same scrollback to the model on every check. 
+ **Guarding.** The server detects the agent's own pane across sockets and declines self-destructive operations — [`kill_session`](https://libtmux-mcp.git-pull.com/tools/session/kill-session/) on itself fails loudly instead of silently terminating the host diff --git a/docs/demo.md b/docs/demo.md index a107f0fb..3fd42394 100644 --- a/docs/demo.md +++ b/docs/demo.md @@ -18,31 +18,31 @@ Standalone badges via `{badge}`: ### `{tool}` — code-linked with badge -{tool}`capture-pane` · {tool}`send-keys` · {tool}`search-panes` · {tool}`wait-for-text` · {tool}`kill-pane` · {tool}`create-session` · {tool}`split-window` +{tool}`capture-pane` · {tool}`capture-since` · {tool}`send-keys` · {tool}`search-panes` · {tool}`wait-for-text` · {tool}`kill-pane` · {tool}`create-session` · {tool}`split-window` ### `{toolref}` — code-linked, no badge -{toolref}`capture-pane` · {toolref}`send-keys` · {toolref}`search-panes` · {toolref}`wait-for-text` · {toolref}`kill-pane` · {toolref}`create-session` · {toolref}`split-window` +{toolref}`capture-pane` · {toolref}`capture-since` · {toolref}`send-keys` · {toolref}`search-panes` · {toolref}`wait-for-text` · {toolref}`kill-pane` · {toolref}`create-session` · {toolref}`split-window` ### `{tooliconl}` — icon left, outside code -{tooliconl}`capture-pane` · {tooliconl}`send-keys` · {tooliconl}`search-panes` · {tooliconl}`wait-for-text` · {tooliconl}`kill-pane` · {tooliconl}`create-session` · {tooliconl}`split-window` +{tooliconl}`capture-pane` · {tooliconl}`capture-since` · {tooliconl}`send-keys` · {tooliconl}`search-panes` · {tooliconl}`wait-for-text` · {tooliconl}`kill-pane` · {tooliconl}`create-session` · {tooliconl}`split-window` ### `{tooliconr}` — icon right, outside code -{tooliconr}`capture-pane` · {tooliconr}`send-keys` · {tooliconr}`search-panes` · {tooliconr}`wait-for-text` · {tooliconr}`kill-pane` · {tooliconr}`create-session` · {tooliconr}`split-window` +{tooliconr}`capture-pane` · {tooliconr}`capture-since` · {tooliconr}`send-keys` · 
{tooliconr}`search-panes` · {tooliconr}`wait-for-text` · {tooliconr}`kill-pane` · {tooliconr}`create-session` · {tooliconr}`split-window` ### `{tooliconil}` — icon inline-left, inside code -{tooliconil}`capture-pane` · {tooliconil}`send-keys` · {tooliconil}`search-panes` · {tooliconil}`wait-for-text` · {tooliconil}`kill-pane` · {tooliconil}`create-session` · {tooliconil}`split-window` +{tooliconil}`capture-pane` · {tooliconil}`capture-since` · {tooliconil}`send-keys` · {tooliconil}`search-panes` · {tooliconil}`wait-for-text` · {tooliconil}`kill-pane` · {tooliconil}`create-session` · {tooliconil}`split-window` ### `{tooliconir}` — icon inline-right, inside code -{tooliconir}`capture-pane` · {tooliconir}`send-keys` · {tooliconir}`search-panes` · {tooliconir}`wait-for-text` · {tooliconir}`kill-pane` · {tooliconir}`create-session` · {tooliconir}`split-window` +{tooliconir}`capture-pane` · {tooliconir}`capture-since` · {tooliconir}`send-keys` · {tooliconir}`search-panes` · {tooliconir}`wait-for-text` · {tooliconir}`kill-pane` · {tooliconir}`create-session` · {tooliconir}`split-window` ### `{ref}` — plain text link -{ref}`capture-pane` · {ref}`send-keys` · {ref}`search-panes` · {ref}`wait-for-text` · {ref}`kill-pane` · {ref}`create-session` · {ref}`split-window` +{ref}`capture-pane` · {ref}`capture-since` · {ref}`send-keys` · {ref}`search-panes` · {ref}`wait-for-text` · {ref}`kill-pane` · {ref}`create-session` · {ref}`split-window` ## Badges in context @@ -66,7 +66,7 @@ These are the actual tool headings as they render on tool pages: ### In prose -Use {tooliconl}`search-panes` to find text across all panes. If you know which pane, use {tooliconl}`capture-pane` instead. After running a command with {tooliconl}`send-keys`, compose `tmux wait-for -S` and call {tooliconl}`wait-for-channel` before capturing. +Use {tooliconl}`search-panes` to find text across all panes. 
If you know which pane, use {tooliconl}`capture-pane` for one read or {tooliconl}`capture-since` for repeated observation. After running a command with {tooliconl}`send-keys`, compose `tmux wait-for -S` and call {tooliconl}`wait-for-channel` before capturing. ### Dense inline (toolref, no badges) diff --git a/docs/index.md b/docs/index.md index da14db8a..3bc5a789 100644 --- a/docs/index.md +++ b/docs/index.md @@ -71,7 +71,7 @@ Config blocks for Claude Desktop, Claude Code, Cursor, and others. Read tmux state without changing anything. -{toolref}`list-sessions` · {toolref}`capture-pane` · {toolref}`snapshot-pane` · {toolref}`get-pane-info` · {toolref}`find-pane-by-position` · {toolref}`search-panes` · {toolref}`wait-for-text` · {toolref}`wait-for-content-change` · {toolref}`display-message` +{toolref}`list-sessions` · {toolref}`capture-pane` · {toolref}`capture-since` · {toolref}`snapshot-pane` · {toolref}`get-pane-info` · {toolref}`find-pane-by-position` · {toolref}`search-panes` · {toolref}`wait-for-text` · {toolref}`wait-for-content-change` · {toolref}`display-message` ### Act (mutating) diff --git a/docs/prompts.md b/docs/prompts.md index 00acb44f..40453e43 100644 --- a/docs/prompts.md +++ b/docs/prompts.md @@ -111,7 +111,9 @@ prefers {tool}`snapshot-pane`, which returns content + cursor position + pane mode + scroll state in one call — saving a follow-up ``get_pane_info`` round-trip. It also explicitly forbids the agent from acting before it has a hypothesis, which prevents -"fix the symptom" anti-patterns. +"fix the symptom" anti-patterns. For repeated observation, it routes +follow-up reads through {tool}`capture-since` cursors instead of full +pane captures. ```{fastmcp-prompt-input} diagnose_failing_pane ``` @@ -124,9 +126,12 @@ Something went wrong in tmux pane %1. Diagnose it: 1. Call `snapshot_pane(pane_id="%1")` to get content, cursor position, pane mode, and scroll state in one call. 2. If the content looks truncated, re-call with `max_lines=None`. -3. 
Identify the last command that ran (look at the prompt line and +3. If you need to watch the pane across more than one turn, call + `capture_since(pane_id="%1")`, keep the returned cursor, + and pass it to later `capture_since(cursor=...)` calls. +4. Identify the last command that ran (look at the prompt line and the line above it) and the last non-empty output line. -4. Propose a root cause hypothesis and a minimal command to verify +5. Propose a root cause hypothesis and a minimal command to verify it (do NOT execute anything yet — produce the plan first). ``` diff --git a/docs/quickstart.md b/docs/quickstart.md index 296bdc45..d555b5a6 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -57,6 +57,10 @@ When you say "run `make test` and show me the output", the agent executes a thre This **send → wait → capture** sequence is the fundamental workflow. For commands the agent authors, the channel pattern is deterministic; for output the agent does not author (third-party log lines, daemon prompts, interactive supervisors), substitute {tool}`wait-for-text` for step 2. +When you need to keep checking the same pane after that first read, switch to +{tool}`capture-since`: the first call returns a cursor, and follow-up calls +return only new pane output. + ## Next steps - {ref}`concepts` — Understand the tmux hierarchy and how tools target panes diff --git a/docs/recipes.md b/docs/recipes.md index ea926a75..987b13a6 100644 --- a/docs/recipes.md +++ b/docs/recipes.md @@ -215,9 +215,9 @@ the ones above), use {toolref}`wait-for-text` instead. {toolref}`wait-for-text` replaces `sleep`. The server might start in 2 seconds or 20 -- the agent adapts. The anti-pattern is polling with repeated -{toolref}`capture-pane` calls or hardcoding a sleep duration. The MCP server -handles the polling internally with configurable `timeout` (default 8s) and -`interval` (default 50ms). +{toolref}`capture-pane` calls or hardcoding a sleep duration. 
When the job is +already running and the agent needs to keep observing it across turns, use +{toolref}`capture-since` so each read returns only new pane output. --- @@ -483,4 +483,3 @@ wait instead of polling, content vs. metadata, prefer IDs, escalate gracefully -- see the {ref}`prompting guide `. For specific pitfalls like `enter: false` and the `send_keys`/`capture_pane` race condition, see {ref}`gotchas `. - diff --git a/docs/tools/index.md b/docs/tools/index.md index a9e153ec..007fec29 100644 --- a/docs/tools/index.md +++ b/docs/tools/index.md @@ -7,7 +7,8 @@ All tools accept an optional `socket_name` parameter for multi-server support. I ## Which tool do I want? **Reading terminal content?** -- Know which pane? → {tool}`capture-pane` +- Re-reading the same pane while it changes? → {tool}`capture-since` +- Need a one-shot read of a known pane? → {tool}`capture-pane` - Need text + cursor + mode in one call? → {tool}`snapshot-pane` - Don't know which pane? → {tool}`search-panes` - Need to wait for specific output? → {tool}`wait-for-text` @@ -99,6 +100,12 @@ List panes in a window. Read visible content of a pane. ::: +:::{grid-item-card} capture_since +:link: capture-since +:link-type: ref +Start a cursor, then read only new pane output. +::: + :::{grid-item-card} get_pane_info :link: get-pane-info :link-type: ref diff --git a/docs/tools/pane/capture-pane.md b/docs/tools/pane/capture-pane.md index 432f9cbe..757bdcf4 100644 --- a/docs/tools/pane/capture-pane.md +++ b/docs/tools/pane/capture-pane.md @@ -7,8 +7,9 @@ after running a command, checking output, or verifying state. **Avoid when** you need to search across multiple panes at once — use -{tooliconl}`search-panes`. If you only need pane metadata (not content), use -{tooliconl}`get-pane-info`. +{tooliconl}`search-panes`. If you are repeatedly watching the same pane, use +{tooliconl}`capture-since` with its cursor so unchanged scrollback is not sent +again. 
If you only need pane metadata (not content), use {tooliconl}`get-pane-info`. **Side effects:** None. Readonly. diff --git a/docs/tools/pane/capture-since.md b/docs/tools/pane/capture-since.md new file mode 100644 index 00000000..27e17ac1 --- /dev/null +++ b/docs/tools/pane/capture-since.md @@ -0,0 +1,92 @@ +# Capture since + +```{fastmcp-tool} pane_tools.capture_since +``` + +**Use when** you need to observe the same pane repeatedly — tailing logs, +watching a long-running command, checking a daemon, or revisiting a terminal +without paying to re-read the same scrollback every turn. The first call returns +the current visible screen plus a cursor; later calls pass that cursor back and +receive only rows written or rewritten after it. + +**Avoid when** you control the command and only need completion — compose +`tmux wait-for -S ` into the command and call +{tooliconl}`wait-for-channel`. If you need a one-shot content + metadata view, +use {tooliconl}`snapshot-pane`; if you do not know which pane contains text, +use {tooliconl}`search-panes`. + +**Side effects:** None. Readonly. + +**Example:** + +Start a cursor with the currently visible screen: + +```json +{ + "tool": "capture_since", + "arguments": { + "pane_id": "%2" + } +} +``` + +Response: + +```json +{ + "pane_id": "%2", + "cursor": "capture-since-v1:...", + "lines": [ + "$ pytest -vv", + "tests/test_api.py::test_health PASSED" + ], + "elapsed_seconds": 0.003, + "lines_missed": false, + "truncated": false, + "truncated_lines": 0, + "truncated_bytes": 0 +} +``` + +Read only content since that cursor: + +```json +{ + "tool": "capture_since", + "arguments": { + "cursor": "capture-since-v1:..." + } +} +``` + +The cursor carries the original pane id, so the follow-up call does not need +`pane_id`. If you pass both, they must match; a cursor for another pane raises a +tool error instead of silently reading the wrong process. 
+ +If nothing new was written after the cursor, `lines` is empty and the response +still includes a fresh cursor for the same pane. If the cursor row scrolled into +retained history, the tool can still return an exact delta; retained scrollback +is not a loss condition. + +`lines_missed` becomes `true` when tmux has cleared or trimmed the history +needed to compute an exact delta. In that case, `lines` is a conservative +current visible capture and the response includes a fresh cursor. + +Pane lifecycle is part of the cursor contract. If the pane dies or is respawned, +the call raises a tool error instead of reading from a different process that +reused the same pane id. + +`truncated`, `truncated_lines`, and `truncated_bytes` are structured metadata. +No truncation marker is injected into `lines`, so clients can display terminal +text without parsing an in-band header. + +The cursor is intentionally opaque. It is based on tmux grid state +(`history_size + cursor_y`) and pane lifecycle fields (`pane_id`, `pane_pid`); +see tmux's grid and capture implementation in +[grid.c](https://github.com/tmux/tmux/blob/134ba6c/grid.c) and +[cmd-capture-pane.c](https://github.com/tmux/tmux/blob/134ba6c/cmd-capture-pane.c), +and libtmux's +[`Pane.capture_pane()`](https://github.com/tmux-python/libtmux/blob/v0.58.0/src/libtmux/pane.py). + +```{fastmcp-tool-input} pane_tools.capture_since +``` diff --git a/docs/tools/pane/index.md b/docs/tools/pane/index.md index bdb1333f..03a06c00 100644 --- a/docs/tools/pane/index.md +++ b/docs/tools/pane/index.md @@ -9,6 +9,10 @@ Pane-scoped tools — read and drive individual terminals, wait for output, copy Read visible or scrollback text from a pane. ::: +:::{grid-item-card} {tooliconl}`capture-since` +Start a cursor, then read only new pane output. +::: + :::{grid-item-card} {tooliconl}`search-panes` Search text across many panes in one call. ::: @@ -100,6 +104,7 @@ Terminate a pane. Destructive. 
:maxdepth: 1 capture-pane +capture-since search-panes snapshot-pane get-pane-info diff --git a/docs/tools/pane/search-panes.md b/docs/tools/pane/search-panes.md index 0560906f..adcb0b90 100644 --- a/docs/tools/pane/search-panes.md +++ b/docs/tools/pane/search-panes.md @@ -8,7 +8,7 @@ which pane has an error, finding a running process, or checking output without knowing which pane to look in. **Avoid when** you already know the target pane — use {tooliconl}`capture-pane` -directly. +for a one-shot read, or {tooliconl}`capture-since` for repeated observation. **Side effects:** None. Readonly. diff --git a/docs/tools/pane/wait-for-text.md b/docs/tools/pane/wait-for-text.md index 957e46c4..d1923a46 100644 --- a/docs/tools/pane/wait-for-text.md +++ b/docs/tools/pane/wait-for-text.md @@ -7,9 +7,9 @@ server to start, a build to complete, or a prompt to return. **Avoid when** the expected text may never appear — always set a reasonable -`timeout`. For known output, {tooliconl}`capture-pane` after a known delay -may suffice, but `wait_for_text` is preferred because it adapts to variable -timing. +`timeout`. For repeated observation or tailing, use +{tooliconl}`capture-since`; for command completion you control, use +{tooliconl}`wait-for-channel`. **Side effects:** None. Readonly. Blocks until text appears or timeout. 
diff --git a/docs/topics/architecture.md b/docs/topics/architecture.md index 6bff8e98..05d536d7 100644 --- a/docs/topics/architecture.md +++ b/docs/topics/architecture.md @@ -18,7 +18,7 @@ src/libtmux_mcp/ server_tools.py # list_sessions, create_session, kill_server, get_server_info session_tools.py # list_windows, create_window, rename_session, kill_session window_tools.py # list_panes, split_window, rename_window, kill_window, select_layout, resize_window - pane_tools.py # send_keys, capture_pane, resize_pane, kill_pane, set_pane_title, get_pane_info, clear_pane, search_panes, wait_for_text + pane_tools.py # send_keys, capture_pane, capture_since, resize_pane, kill_pane, set_pane_title, get_pane_info, clear_pane, search_panes, wait_for_text option_tools.py # show_option, set_option env_tools.py # show_environment, set_environment resources/ diff --git a/docs/topics/concepts.md b/docs/topics/concepts.md index 74e8c277..156f7bc0 100644 --- a/docs/topics/concepts.md +++ b/docs/topics/concepts.md @@ -45,7 +45,7 @@ For pane tools, you can combine parameters to narrow the search: `session_name` Tools fall into three categories: -- **Discovery** — Read-only operations: `list_sessions`, `list_windows`, `list_panes`, `capture_pane`, `get_pane_info`, `find_pane_by_position`, `search_panes`, `wait_for_text`, `show_option`, `show_environment` +- **Discovery** — Read-only operations: `list_sessions`, `list_windows`, `list_panes`, `capture_pane`, `capture_since`, `get_pane_info`, `find_pane_by_position`, `search_panes`, `wait_for_text`, `show_option`, `show_environment` - **Mutation** — Create, modify, or send input: `create_session`, `create_window`, `split_window`, `send_keys`, `rename_*`, `resize_*`, `set_pane_title`, `clear_pane`, `select_layout`, `set_option`, `set_environment` - **Destruction** — Remove tmux objects: `kill_server`, `kill_session`, `kill_window`, `kill_pane` diff --git a/docs/topics/gotchas.md b/docs/topics/gotchas.md index 76a63441..02095c51 100644 --- 
a/docs/topics/gotchas.md +++ b/docs/topics/gotchas.md @@ -8,7 +8,7 @@ Things that will bite you if you don't know about them in advance. For symptom-b {tooliconl}`list-panes` and {tooliconl}`list-windows` search **metadata** — names, IDs, current command. They do not search what is displayed in the terminal. -To find text that is visible in terminals, use {tooliconl}`search-panes`. To read what a specific pane shows, use {tooliconl}`capture-pane`. +To find text that is visible in terminals, use {tooliconl}`search-panes`. To read what a specific pane shows once, use {tooliconl}`capture-pane`; to keep watching that pane, use {tooliconl}`capture-since`. This is the most common source of agent confusion. The server instructions already warn about this, but it bears repeating: if a user asks "which pane mentions error", the answer is `search_panes`, not `list_panes`. @@ -41,6 +41,15 @@ The capture above may return the terminal state **before** pytest runs. Compose For output the agent does not author (third-party logs, daemon prompts, interactive supervisors), substitute {tooliconl}`wait-for-text` for `wait_for_channel`. See {ref}`recipes` for the complete pattern. +## Repeated `capture_pane` calls resend old output + +If you are tailing a pane or checking a long-running process over several +turns, repeated {tooliconl}`capture-pane` calls keep returning the same visible +screen and scrollback. Use {tooliconl}`capture-since` instead: the first call +returns a cursor, and follow-up calls return only output written or rewritten +after that cursor. If tmux has already trimmed or cleared the needed history, +the result marks `lines_missed=true` and gives you a fresh cursor. + ## Window names are not unique across sessions Two sessions can each have a window named "editor". Targeting by `window_name` alone is ambiguous — always include `session_name` or use the globally unique `window_id` (e.g., `@0`, `@1`). 
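Note on the gotcha above: a client that keeps its own transcript may want to treat a `lines_missed` result differently from an exact delta, because in that case `lines` is the current visible screen rather than a continuation. The following is a rough sketch under that assumption. `Transcript`, `GAP_MARKER`, and `apply_result` are hypothetical names for illustration; only the `cursor`, `lines`, and `lines_missed` fields come from the diff.

```python
from dataclasses import dataclass, field

# Illustrative marker text; not emitted by libtmux-mcp itself.
GAP_MARKER = "[output missed: history trimmed or cleared]"


@dataclass
class Transcript:
    """Client-side record of what has been seen in one pane (hypothetical)."""

    lines: list[str] = field(default_factory=list)
    gaps: int = 0


def apply_result(transcript: Transcript, result: dict) -> str:
    """Fold one capture_since result into the transcript; return the next cursor."""
    if result.get("lines_missed", False):
        # Not an exact delta: the lines are a fresh visible capture and may
        # overlap or skip earlier output, so record the discontinuity.
        transcript.gaps += 1
        transcript.lines.append(GAP_MARKER)
    transcript.lines.extend(result["lines"])
    return result["cursor"]
```

Recording the gap explicitly keeps the transcript honest about missing output; whether to de-duplicate overlapping rows after a gap is a separate design choice this sketch does not attempt.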
diff --git a/docs/topics/index.md b/docs/topics/index.md index 8cfad8cb..6e935f91 100644 --- a/docs/topics/index.md +++ b/docs/topics/index.md @@ -68,8 +68,8 @@ hierarchy. :::{grid-item-card} Pagination :link: pagination :link-type: doc -Protocol-level cursors vs tool-level ``offset`` / ``limit`` (as in -``search_panes``). +Protocol cursors, ``search_panes`` paging, and ``capture_since`` +observation cursors. ::: :::: diff --git a/docs/topics/pagination.md b/docs/topics/pagination.md index 32ba6f8b..10ad5e4b 100644 --- a/docs/topics/pagination.md +++ b/docs/topics/pagination.md @@ -9,7 +9,7 @@ libtmux-mcp follows the a page is truncated, and accept ``cursor`` on the next call to resume. -## Two places pagination shows up +## Where cursors and pages show up ### Protocol-level list calls @@ -20,9 +20,9 @@ a sensible page size, encodes the cursor in an opaque base64 blob, and replays state from it. Callers only need to thread through ``nextCursor`` if they consume the raw MCP protocol. -### Tool-level pagination on ``search_panes`` +### Tool-level result paging on ``search_panes`` -One libtmux-mcp tool owns its own pagination surface because a +One libtmux-mcp tool owns its own paging surface because a single tmux server can carry tens of thousands of pane lines: - {tool}`search-panes` returns a @@ -32,20 +32,45 @@ single tmux server can carry tens of thousands of pane lines: - Agents detect ``truncated=True`` and re-call with a higher ``offset`` to page through the match set. -This is application-level pagination (not MCP-cursor pagination) — +This is application-level paging (not MCP-cursor pagination) — the agent decides how many matches it needs and when to stop. +### Tool-level observation cursors on ``capture_since`` + +{tool}`capture-since` also has a ``cursor`` parameter, but it is +not a pagination cursor. The first call captures the current visible +pane and returns an opaque observation checkpoint. 
Follow-up calls +pass that cursor back to receive only rows written or rewritten after +the checkpoint while tmux still retains the needed history. + +Because the cursor points into live tmux grid state, it has different +failure modes from protocol pagination: + +- If the pane output scrolls into retained history, the cursor can + still produce an exact delta. +- If tmux clears or trims the needed history, the response sets + ``lines_missed=True`` and returns a conservative current visible + capture with a fresh cursor. +- If the pane dies or is respawned, the cursor is invalid because it + would otherwise point at a different process's terminal state. + +This is application-level observation, not a stable collection scan. +Use it to reduce repeated pane reads, not to page through search +matches. + ## Why separate paths Protocol-level cursors are for **collections the server owns end-to-end**: the tool / prompt / resource registries. The server knows what it has, so an opaque cursor is cheap. -Tool-level pagination is for **collections derived from live tmux -state**: capturing every pane's contents and running a regex is -expensive, and the result set can change mid-scan (new panes open, -old ones close). Exposing ``offset`` / ``limit`` lets the agent -bound cost explicitly, without pretending the snapshot is stable. +Tool-level paging and observation cursors are for **state derived +from live tmux panes**. Capturing every pane's contents and running a +regex is expensive, and the result set can change mid-scan (new panes +open, old ones close). Repeatedly reading one pane has the opposite +cost shape: the target is known, but unchanged scrollback wastes +model context. libtmux-mcp exposes each contract separately instead +of pretending live terminal state is one stable list. ## Further reading @@ -53,3 +78,6 @@ bound cost explicitly, without pretending the snapshot is stable. 
- {class}`~libtmux_mcp.models.SearchPanesResult` — the structured wrapper for ``search_panes`` - {tool}`search-panes` — the tool itself +- {class}`~libtmux_mcp.models.CaptureSinceResult` — the structured + response for ``capture_since`` +- {tool}`capture-since` — incremental observation for a known pane diff --git a/docs/topics/prompting.md b/docs/topics/prompting.md index c2c13b05..c7450a8f 100644 --- a/docs/topics/prompting.md +++ b/docs/topics/prompting.md @@ -14,16 +14,17 @@ Every MCP client receives these instructions when connecting to the libtmux-mcp libtmux MCP server for programmatic tmux control. tmux hierarchy: Server > Session > Window > Pane. Use pane_id (e.g. '%1') as the preferred targeting method - it is globally unique within a tmux server. -Use send_keys to execute commands and capture_pane to read output. All -tools accept an optional socket_name parameter for multi-server support -(defaults to LIBTMUX_SOCKET env var). +Use send_keys to execute commands, capture_pane for one-shot reads, and +capture_since for repeated observation. All tools accept an optional +socket_name parameter for multi-server support (defaults to +LIBTMUX_SOCKET env var). IMPORTANT — metadata vs content: list_windows, list_panes, and list_sessions only search metadata (names, IDs, current command). To find text that is actually visible in terminals — when users ask what panes 'contain', 'mention', 'show', or 'have' — use search_panes to -search across all pane contents, or list_panes + capture_pane on each -pane for manual inspection. +search across all pane contents, capture_since for repeated reads of a +known pane, or capture_pane for a one-shot manual inspection. 
``` The server also dynamically adds: @@ -66,6 +67,7 @@ These natural-language prompts reliably trigger the right tool sequences: | [Start the dev server and wait until it's ready]{.prompt} | {toolref}`send-keys` → {toolref}`wait-for-text` (for "listening on" — third-party output the agent doesn't author) | | [Spin up the dev server in the bottom-right pane]{.prompt} | {toolref}`find-pane-by-position` (corner=bottom-right) → {toolref}`send-keys` → {toolref}`wait-for-text` (for the server's readiness banner) | | [Check if any pane has errors]{.prompt} | {toolref}`search-panes` with pattern "error" | +| [Keep watching the server pane]{.prompt} | {toolref}`capture-since` with the previous cursor | | [Set up a workspace with editor, server, and tests]{.prompt} | {toolref}`create-session` → {toolref}`split-window` (x2) → {toolref}`set-pane-title` (x3) | | [What's running in my tmux sessions?]{.prompt} | {toolref}`list-sessions` → {toolref}`list-panes` → {toolref}`capture-pane` | | [Kill the old workspace session]{.prompt} | {toolref}`kill-session` (after confirming target) | @@ -95,7 +97,8 @@ This keeps output accessible for later inspection. For command completion, compose `tmux wait-for -S ` into the shell command and call wait_for_channel — deterministic, no polling. Use wait_for_text or wait_for_content_change for observation flows -(third-party logs, daemon prompts). Never capture_pane immediately +(third-party logs, daemon prompts), and use capture_since when you +need to read the same pane repeatedly. Never capture_pane immediately after send_keys — the command may still be running. ``` @@ -139,6 +142,6 @@ When an agent is unsure which tool to use, these rules help: 1. **Discovery first**: Call {toolref}`list-sessions` or {toolref}`list-panes` before acting on specific targets 2. **Prefer IDs**: Once you have a `pane_id`, use it for all subsequent calls — it never changes during the pane's lifetime -3. 
**Wait, don't poll**: For commands the agent authors, prefer {toolref}`wait-for-channel` with `tmux wait-for -S ` composed into the command — deterministic and race-free. Fall back to {toolref}`wait-for-text` or {toolref}`wait-for-content-change` for output the agent doesn't author. Never call {toolref}`capture-pane` in a retry loop. +3. **Wait, don't poll**: For commands the agent authors, prefer {toolref}`wait-for-channel` with `tmux wait-for -S ` composed into the command — deterministic and race-free. Use {toolref}`capture-since` for repeated observation, and fall back to {toolref}`wait-for-text` or {toolref}`wait-for-content-change` for output the agent doesn't author. Never call {toolref}`capture-pane` in a retry loop. 4. **Content vs. metadata**: If looking for text *in* a terminal, use {toolref}`search-panes`. If looking for pane *properties* (name, PID, path), use {toolref}`list-panes` or {toolref}`get-pane-info` 5. **Destructive tools are opt-in**: Never kill sessions, windows, or panes unless the user explicitly asks diff --git a/docs/topics/safety.md b/docs/topics/safety.md index 7138463f..565fb5c7 100644 --- a/docs/topics/safety.md +++ b/docs/topics/safety.md @@ -134,6 +134,7 @@ Each tool carries MCP tool annotations that hint at its behavior: | {ref}`list-windows` | {badge}`readonly` | true | false | true | | {ref}`list-panes` | {badge}`readonly` | true | false | true | | {ref}`capture-pane` | {badge}`readonly` | true | false | true | +| {ref}`capture-since` | {badge}`readonly` | true | false | true | | {ref}`get-pane-info` | {badge}`readonly` | true | false | true | | {ref}`search-panes` | {badge}`readonly` | true | false | true | | {ref}`wait-for-text` | {badge}`readonly` | true | false | true | diff --git a/docs/topics/troubleshooting.md b/docs/topics/troubleshooting.md index 977aa280..93e5dde0 100644 --- a/docs/topics/troubleshooting.md +++ b/docs/topics/troubleshooting.md @@ -75,7 +75,7 @@ Symptom-based guide. Find your problem, follow the steps. 
2. **Special characters**: tmux interprets some key names (e.g. `C-c`, `Enter`). If sending literal text, use `literal=true`. -3. **Timing**: After `send_keys`, prefer composing `tmux wait-for -S ` into the shell command and calling `wait_for_channel` for deterministic completion. Use `wait_for_text` or `wait_for_content_change` only when waiting on output you do not author. Don't `capture_pane` immediately — the command may still be running. +3. **Timing**: After {toolref}`send-keys`, prefer composing `tmux wait-for -S ` into the shell command and calling {toolref}`wait-for-channel` for deterministic completion. Use {toolref}`capture-since` for repeated observation, and use {toolref}`wait-for-text` or {toolref}`wait-for-content-change` only when waiting on output you do not author. Don't call {toolref}`capture-pane` immediately — the command may still be running. ## Silent startup failure diff --git a/src/libtmux_mcp/middleware.py b/src/libtmux_mcp/middleware.py index 81d10900..b3e0dc24 100644 --- a/src/libtmux_mcp/middleware.py +++ b/src/libtmux_mcp/middleware.py @@ -353,6 +353,7 @@ class TailPreservingResponseLimitingMiddleware(ResponseLimitingMiddleware): so callers can detect the cap fired. Used as a global backstop for :func:`libtmux_mcp.tools.pane_tools.io.capture_pane`, + :func:`libtmux_mcp.tools.pane_tools.capture_since.capture_since`, :func:`libtmux_mcp.tools.pane_tools.meta.snapshot_pane`, and :func:`libtmux_mcp.tools.pane_tools.search.search_panes`. 
Per-tool caps at the tool layer fire first under normal operation; this diff --git a/src/libtmux_mcp/models.py b/src/libtmux_mcp/models.py index c46cb00b..2dd3846e 100644 --- a/src/libtmux_mcp/models.py +++ b/src/libtmux_mcp/models.py @@ -243,6 +243,37 @@ class WaitForTextResult(BaseModel): ) +class CaptureSinceResult(BaseModel): + """Incremental pane capture result with an opaque resume cursor.""" + + pane_id: str = Field(description="Pane ID that was captured") + cursor: str = Field(description="Opaque cursor to pass back to ``capture_since``") + lines: list[str] = Field( + default_factory=list, + description="Captured lines, oldest first and tail-preserved if truncated", + ) + elapsed_seconds: float = Field(description="Time spent capturing in seconds") + lines_missed: bool = Field( + default=False, + description=( + "True when prior history was no longer available, so ``lines`` " + "is a conservative current visible capture rather than a complete delta" + ), + ) + truncated: bool = Field( + default=False, + description="True when ``lines`` was truncated by max_lines or max_bytes", + ) + truncated_lines: int = Field( + default=0, + description="Number of lines dropped from the head when truncating", + ) + truncated_bytes: int = Field( + default=0, + description="Approximate UTF-8 bytes dropped from the head when truncating", + ) + + class PaneSnapshot(BaseModel): """Rich screen capture with metadata: content, cursor, mode, and scroll state.""" diff --git a/src/libtmux_mcp/prompts/recipes.py b/src/libtmux_mcp/prompts/recipes.py index 810f3ef3..c0a651a7 100644 --- a/src/libtmux_mcp/prompts/recipes.py +++ b/src/libtmux_mcp/prompts/recipes.py @@ -71,7 +71,10 @@ def diagnose_failing_pane(pane_id: str) -> str: Uses ``snapshot_pane`` (content + cursor + mode + scroll state in one call) instead of ``capture_pane`` + ``get_pane_info`` so - the agent sees everything in a single protocol call. + the agent sees everything in a single protocol call. 
When the + diagnosis needs another read after waiting or observing, the + rendered recipe points agents at ``capture_since`` instead of a + repeated full capture. Parameters ---------- @@ -83,9 +86,12 @@ def diagnose_failing_pane(pane_id: str) -> str: 1. Call `snapshot_pane(pane_id="{pane_id}")` to get content, cursor position, pane mode, and scroll state in one call. 2. If the content looks truncated, re-call with `max_lines=None`. -3. Identify the last command that ran (look at the prompt line and +3. If you need to watch the pane across more than one turn, call + `capture_since(pane_id="{pane_id}")`, keep the returned cursor, + and pass it to later `capture_since(cursor=...)` calls. +4. Identify the last command that ran (look at the prompt line and the line above it) and the last non-empty output line. -4. Propose a root cause hypothesis and a minimal command to verify +5. Propose a root cause hypothesis and a minimal command to verify it (do NOT execute anything yet — produce the plan first). """ diff --git a/src/libtmux_mcp/server.py b/src/libtmux_mcp/server.py index 8b36557d..1ff4afc9 100644 --- a/src/libtmux_mcp/server.py +++ b/src/libtmux_mcp/server.py @@ -55,9 +55,9 @@ # only fall back to a server-level segment when the gap is *server-shaped* # (e.g. an entire tool family is intentionally missing). # -# Output text is byte-identical to the previous monolith; tests assert on -# substrings of ``_BASE_INSTRUCTIONS``, so keeping the join shape stable -# matters. +# Tests assert on substrings of ``_BASE_INSTRUCTIONS``, so the join +# shape (segment count, ``"\n\n"`` separator) must stay stable even as +# individual instruction strings evolve. # --------------------------------------------------------------------------- _INSTR_HIERARCHY = ( @@ -83,20 +83,20 @@ ) _INSTR_METADATA_VS_CONTENT = ( - "metadata vs content: list_windows, list_panes, list_sessions search " - "metadata only. 
Use search_panes or capture_pane to find text inside " - "terminals — what panes 'contain', 'mention', 'show'." + "metadata vs content: list_windows/list_panes/list_sessions search " + "metadata only. Use search_panes/capture_since/capture_pane for terminal " + "text — what panes 'contain', 'mention', 'show'." ) _INSTR_READ_TOOLS = ( - "Prefer snapshot_pane over capture_pane + get_pane_info. " - "display_message evaluates a tmux format string against a target." + "Prefer snapshot_pane over capture_pane + get_pane_info; capture_since " + "for repeated observation/tailing; display_message for tmux formats." ) _INSTR_WAIT_NOT_POLL = ( "WAIT, DON'T POLL: prefer wait_for_channel (compose `tmux wait-for -S`) " - "for command completion. Else wait_for_text / wait_for_content_change " - "for output you don't author." + "for command completion; capture_since for repeated observation. " + "Else wait_for_text/wait_for_content_change for output you don't author." ) #: Gap-explainer: write-hook tools are intentionally absent. See module @@ -162,7 +162,7 @@ def _build_instructions(safety_level: str = TAG_MUTATING) -> str: # separate LIBTMUX_DISCOVERABILITY knob. if safety_level == TAG_READONLY: parts.append( - "\n\nReadonly mode: when uncertain about terminal state, prefer " + "\n\nReadonly mode: if uncertain, prefer " "one read-only probe (snapshot_pane, list_panes, search_panes)." ) @@ -179,8 +179,7 @@ def _build_instructions(safety_level: str = TAG_MUTATING) -> str: if socket_name: context += f" (socket {socket_name})" context += ( - ". Tool results mark the caller's own pane is_caller=true; " - "filter list_panes for is_caller=true to answer " + ". Tool results mark is_caller=true; filter list_panes for it to answer " "'which pane am I in?' (no whoami tool)." ) parts.append(context) @@ -202,6 +201,7 @@ def _build_instructions(safety_level: str = TAG_MUTATING) -> str: #: structured responses from list/get tools stay under the cap naturally. 
_RESPONSE_LIMITED_TOOLS = [ "capture_pane", + "capture_since", "search_panes", "snapshot_pane", "show_buffer", diff --git a/src/libtmux_mcp/tools/pane_tools/__init__.py b/src/libtmux_mcp/tools/pane_tools/__init__.py index 12c84a4e..65269b92 100644 --- a/src/libtmux_mcp/tools/pane_tools/__init__.py +++ b/src/libtmux_mcp/tools/pane_tools/__init__.py @@ -23,6 +23,7 @@ TAG_MUTATING, TAG_READONLY, ) +from libtmux_mcp.tools.pane_tools.capture_since import capture_since from libtmux_mcp.tools.pane_tools.copy_mode import enter_copy_mode, exit_copy_mode from libtmux_mcp.tools.pane_tools.io import ( capture_pane, @@ -55,6 +56,7 @@ __all__ = [ "capture_pane", + "capture_since", "clear_pane", "display_message", "enter_copy_mode", @@ -86,6 +88,9 @@ def register(mcp: FastMCP) -> None: mcp.tool(title="Capture Pane", annotations=ANNOTATIONS_RO, tags={TAG_READONLY})( capture_pane ) + mcp.tool(title="Capture Since", annotations=ANNOTATIONS_RO, tags={TAG_READONLY})( + capture_since + ) mcp.tool( title="Resize Pane", annotations=ANNOTATIONS_MUTATING, tags={TAG_MUTATING} )(resize_pane) diff --git a/src/libtmux_mcp/tools/pane_tools/capture_since.py b/src/libtmux_mcp/tools/pane_tools/capture_since.py new file mode 100644 index 00000000..44ab2f63 --- /dev/null +++ b/src/libtmux_mcp/tools/pane_tools/capture_since.py @@ -0,0 +1,529 @@ +"""Incremental capture tool for tmux pane observation.""" + +from __future__ import annotations + +import asyncio +import base64 +import binascii +import hashlib +import json +import time +import typing as t +from dataclasses import dataclass + +from fastmcp.exceptions import ToolError + +from libtmux_mcp._utils import ( + _get_server, + _resolve_pane, + handle_tool_errors_async, +) +from libtmux_mcp.models import CaptureSinceResult +from libtmux_mcp.tools.pane_tools.io import CAPTURE_DEFAULT_MAX_LINES +from libtmux_mcp.tools.pane_tools.state import ( + _PaneState, + _raise_if_pane_lifecycle_changed, + _read_history_limit, + _read_pane_state, +) + +if 
t.TYPE_CHECKING: + from libtmux.pane import Pane + + +CAPTURE_SINCE_DEFAULT_MAX_LINES = CAPTURE_DEFAULT_MAX_LINES +CAPTURE_SINCE_DEFAULT_MAX_BYTES = 128_000 + +_CURSOR_PREFIX = "capture-since-v1:" +_CURSOR_VERSION = 1 +_STABLE_READ_ATTEMPTS = 3 + + +@dataclass(frozen=True) +class _CaptureCursor: + """Decoded capture_since cursor payload.""" + + pane_id: str + pane_pid: str + history_size: int + pane_height: int + anchor_abs: int + anchor_hash: str | None + below_hashes: tuple[str, ...] + + +@dataclass(frozen=True) +class _PaneRead: + """Synchronous tmux read result used by the async tool wrapper.""" + + state: _PaneState + cursor_rows: list[str] + lines: list[str] + lines_missed: bool + + +@dataclass(frozen=True) +class _LimitedLines: + """Tail-preserved result after line and byte limits are applied.""" + + lines: list[str] + truncated: bool + truncated_lines: int + truncated_bytes: int + + +def _line_hash(line: str) -> str: + """Return a stable content hash for a tmux row.""" + return hashlib.sha256(line.encode("utf-8", "surrogateescape")).hexdigest() + + +def _capture_rows( + pane: Pane, + *, + start: t.Literal["-"] | int | None = None, + end: t.Literal["-"] | int | None = None, +) -> list[str]: + """Return pane rows as a concrete list.""" + rows = pane.capture_pane(start=start, end=end) + if rows is None: + return [] + return list(rows) + + +def _capture_cursor_rows(pane: Pane, state: _PaneState) -> list[str]: + """Capture rows from the cursor through the visible bottom.""" + if state.cursor_y >= state.pane_height: + return [] + return _capture_rows(pane, start=state.cursor_y, end=None) + + +def _same_state(left: _PaneState, right: _PaneState) -> bool: + """Return True when two pane snapshots describe the same grid point.""" + return left == right + + +def _raise_if_dead_without_baseline(pane: Pane, state: _PaneState) -> None: + """Raise a tool error for a dead pane before a cursor exists.""" + if state.pane_dead: + msg = f"pane {pane.pane_id} died during pane 
read" + raise ToolError(msg) + + +def _read_stable_visible( + pane: Pane, + *, + baseline_pid: str | None = None, +) -> _PaneRead: + """Capture the visible pane and cursor rows with a stable state snapshot.""" + for _attempt in range(_STABLE_READ_ATTEMPTS): + before = _read_pane_state(pane) + if baseline_pid is None: + _raise_if_dead_without_baseline(pane, before) + expected_pid = before.pane_pid + else: + expected_pid = baseline_pid + _raise_if_pane_lifecycle_changed(pane, before, expected_pid) + + lines = _capture_rows(pane) + cursor_rows = _capture_cursor_rows(pane, before) + after = _read_pane_state(pane) + _raise_if_pane_lifecycle_changed(pane, after, expected_pid) + if _same_state(before, after): + return _PaneRead( + state=after, + cursor_rows=cursor_rows, + lines=lines, + lines_missed=False, + ) + + state = _read_pane_state(pane) + if baseline_pid is None: + _raise_if_dead_without_baseline(pane, state) + else: + _raise_if_pane_lifecycle_changed(pane, state, baseline_pid) + return _PaneRead( + state=state, + cursor_rows=_capture_cursor_rows(pane, state), + lines=_capture_rows(pane), + lines_missed=True, + ) + + +def _cursor_anchor_lost(cursor: _CaptureCursor, state: _PaneState) -> bool: + """Return True when sampled state proves tmux lost the cursor anchor.""" + bottom_abs = state.history_size + state.pane_height - 1 + if cursor.anchor_abs > bottom_abs: + return True + # A complete history wipe (``clear-history``) always destroys the + # anchor regardless of pane height — the grid is reset to zero. + if state.history_size == 0 and cursor.history_size > 0: + return True + # ``anchor_abs < history_size`` means the anchor has scrolled into + # retained history, where ``capture-pane -S`` can still address it + # with a negative start offset. + # + # The ``pane_height`` guard distinguishes resize-grow (which pulls + # rows from history back into the visible region without freeing + # data) from actual trim (where row data is destroyed). 
+ return state.history_size < cursor.history_size and ( + state.pane_height <= cursor.pane_height + ) + + +def _history_limit_trim_risk( + cursor: _CaptureCursor, + state: _PaneState, + history_limit: int, +) -> bool: + """Return True when tmux may have rebased retained-history rows.""" + if history_limit <= 0: + return True + trim_batch = max(history_limit // 10, 1) + risk_floor = history_limit - trim_batch + return cursor.history_size >= risk_floor or state.history_size >= risk_floor + + +def _find_unique_cursor_match(rows: list[str], cursor: _CaptureCursor) -> int | None: + """Find one retained row sequence matching the cursor fingerprint.""" + if cursor.anchor_hash is None: + return None + + fingerprint = (cursor.anchor_hash, *cursor.below_hashes) + if len(rows) < len(fingerprint): + return None + + match_index: int | None = None + for index in range(len(rows) - len(fingerprint) + 1): + candidate = rows[index : index + len(fingerprint)] + candidate_hashes = tuple(_line_hash(line) for line in candidate) + if candidate_hashes != fingerprint: + continue + if match_index is not None: + return None + match_index = index + return match_index + + +def _drop_previously_seen_rows( + rows: list[str], + cursor: _CaptureCursor, +) -> list[str]: + """Drop the cursor anchor and below-cursor rows already represented.""" + if not rows: + return [] + + output: list[str] = [] + tail = rows + if cursor.anchor_hash is not None and _line_hash(rows[0]) == cursor.anchor_hash: + tail = rows[1:] + else: + output.append(rows[0]) + tail = rows[1:] + + drop = 0 + for expected_hash, line in zip(cursor.below_hashes, tail, strict=False): + if _line_hash(line) != expected_hash: + break + drop += 1 + output.extend(tail[drop:]) + return output + + +def _read_delta(pane: Pane, cursor: _CaptureCursor) -> _PaneRead: + """Capture rows since ``cursor`` or fall back to visible content on loss.""" + history_limit = _read_history_limit(pane) + for _attempt in range(_STABLE_READ_ATTEMPTS): + before = 
_read_pane_state(pane) + _raise_if_pane_lifecycle_changed(pane, before, cursor.pane_pid) + if _cursor_anchor_lost(cursor, before): + missed = _read_stable_visible(pane, baseline_pid=cursor.pane_pid) + return _PaneRead( + state=missed.state, + cursor_rows=missed.cursor_rows, + lines=missed.lines, + lines_missed=True, + ) + + trim_risk = _history_limit_trim_risk(cursor, before, history_limit) + start = cursor.anchor_abs - before.history_size + rows = ( + _capture_rows(pane, start="-", end=None) + if trim_risk + else ( + [] + if start >= before.pane_height + else _capture_rows(pane, start=start, end=None) + ) + ) + cursor_rows = _capture_cursor_rows(pane, before) + after = _read_pane_state(pane) + _raise_if_pane_lifecycle_changed(pane, after, cursor.pane_pid) + if _same_state(before, after): + if trim_risk: + match_index = _find_unique_cursor_match(rows, cursor) + if match_index is None: + missed = _read_stable_visible(pane, baseline_pid=cursor.pane_pid) + return _PaneRead( + state=missed.state, + cursor_rows=missed.cursor_rows, + lines=missed.lines, + lines_missed=True, + ) + rows = rows[match_index:] + return _PaneRead( + state=after, + cursor_rows=cursor_rows, + lines=_drop_previously_seen_rows(rows, cursor), + lines_missed=False, + ) + + missed = _read_stable_visible(pane, baseline_pid=cursor.pane_pid) + return _PaneRead( + state=missed.state, + cursor_rows=missed.cursor_rows, + lines=missed.lines, + lines_missed=True, + ) + + +def _build_cursor(pane_id: str, state: _PaneState, cursor_rows: list[str]) -> str: + """Encode the current cursor anchor as an opaque string.""" + payload: dict[str, t.Any] = { + "version": _CURSOR_VERSION, + "pane_id": pane_id, + "pane_pid": state.pane_pid, + "history_size": state.history_size, + "pane_height": state.pane_height, + "anchor_abs": state.history_size + state.cursor_y, + "anchor_hash": _line_hash(cursor_rows[0]) if cursor_rows else None, + "below_hashes": [_line_hash(line) for line in cursor_rows[1:]], + } + raw = 
json.dumps(payload, separators=(",", ":"), sort_keys=True).encode() + encoded = base64.urlsafe_b64encode(raw).decode().rstrip("=") + return f"{_CURSOR_PREFIX}{encoded}" + + +def _raise_invalid_cursor(reason: str) -> t.NoReturn: + """Raise a consistently worded invalid-cursor error.""" + msg = f"invalid capture_since cursor: {reason}" + raise ToolError(msg) + + +def _cursor_str(payload: t.Mapping[str, t.Any], key: str) -> str: + """Read a required string from a cursor payload.""" + value = payload.get(key) + if not isinstance(value, str) or not value: + reason = f"missing or invalid {key}" + _raise_invalid_cursor(reason) + return value + + +def _cursor_int(payload: t.Mapping[str, t.Any], key: str) -> int: + """Read a required non-negative integer from a cursor payload.""" + value = payload.get(key) + if not isinstance(value, int) or isinstance(value, bool) or value < 0: + reason = f"missing or invalid {key}" + _raise_invalid_cursor(reason) + return value + + +def _decode_cursor(cursor: str) -> _CaptureCursor: + """Decode and validate an opaque ``capture_since`` cursor.""" + if not cursor.startswith(_CURSOR_PREFIX): + reason = "unsupported cursor format" + _raise_invalid_cursor(reason) + encoded = cursor.removeprefix(_CURSOR_PREFIX) + padding = "=" * (-len(encoded) % 4) + try: + raw = base64.urlsafe_b64decode(f"{encoded}{padding}") + payload: t.Any = json.loads(raw) + except (binascii.Error, json.JSONDecodeError, UnicodeDecodeError) as err: + reason = "could not decode payload" + msg = f"invalid capture_since cursor: {reason}" + raise ToolError(msg) from err + + if not isinstance(payload, dict): + reason = "payload is not an object" + _raise_invalid_cursor(reason) + if payload.get("version") != _CURSOR_VERSION: + reason = "unsupported cursor version" + _raise_invalid_cursor(reason) + + anchor_hash_value = payload.get("anchor_hash") + if anchor_hash_value is not None and not isinstance(anchor_hash_value, str): + reason = "missing or invalid anchor_hash" + 
_raise_invalid_cursor(reason) + below_hashes_value = payload.get("below_hashes") + if not isinstance(below_hashes_value, list) or not all( + isinstance(item, str) for item in below_hashes_value + ): + reason = "missing or invalid below_hashes" + _raise_invalid_cursor(reason) + + return _CaptureCursor( + pane_id=_cursor_str(payload, "pane_id"), + pane_pid=_cursor_str(payload, "pane_pid"), + history_size=_cursor_int(payload, "history_size"), + pane_height=_cursor_int(payload, "pane_height"), + anchor_abs=_cursor_int(payload, "anchor_abs"), + anchor_hash=anchor_hash_value, + below_hashes=tuple(below_hashes_value), + ) + + +def _validate_limits(max_lines: int | None, max_bytes: int | None) -> None: + """Validate caller-supplied truncation limits.""" + if max_lines is not None and max_lines <= 0: + msg = f"max_lines must be positive or None (received {max_lines})" + raise ToolError(msg) + if max_bytes is not None and max_bytes <= 0: + msg = f"max_bytes must be positive or None (received {max_bytes})" + raise ToolError(msg) + + +def _encoded_size(lines: list[str]) -> int: + """Return UTF-8 byte size for the returned line payload.""" + return len("\n".join(lines).encode("utf-8", "surrogateescape")) + + +def _limit_lines( + lines: list[str], + *, + max_lines: int | None, + max_bytes: int | None, +) -> _LimitedLines: + """Apply tail-preserving line and byte limits.""" + kept = list(lines) + truncated_lines = 0 + truncated_bytes = 0 + + if max_lines is not None and len(kept) > max_lines: + dropped = kept[:-max_lines] + kept = kept[-max_lines:] + truncated_lines += len(dropped) + truncated_bytes += _encoded_size(dropped) + + if max_bytes is not None: + while kept and _encoded_size(kept) > max_bytes: + if len(kept) == 1: + encoded = kept[0].encode("utf-8", "surrogateescape") + truncated_bytes += max(len(encoded) - max_bytes, 0) + kept = [ + encoded[-max_bytes:].decode("utf-8", "ignore") + if max_bytes > 0 + else "" + ] + break + removed = kept.pop(0) + truncated_lines += 1 + 
truncated_bytes += len(f"{removed}\n".encode("utf-8", "surrogateescape")) + + return _LimitedLines( + lines=kept, + truncated=truncated_lines > 0 or truncated_bytes > 0, + truncated_lines=truncated_lines, + truncated_bytes=truncated_bytes, + ) + + +@handle_tool_errors_async +async def capture_since( + cursor: str | None = None, + pane_id: str | None = None, + session_name: str | None = None, + session_id: str | None = None, + window_id: str | None = None, + max_lines: int | None = CAPTURE_SINCE_DEFAULT_MAX_LINES, + max_bytes: int | None = CAPTURE_SINCE_DEFAULT_MAX_BYTES, + socket_name: str | None = None, +) -> CaptureSinceResult: + """Capture new tmux terminal scrollback since the previous cursor. + + Use for observation-first workflows: tailing a shell, watching a + long-running command, or repeatedly checking a tmux workspace pane + without re-sending the same visible screen every turn. The first + call with ``cursor=None`` returns the current visible pane and an + opaque cursor. Later calls pass that cursor back and receive only + rows written or rewritten after the cursor, as long as tmux still + retains the required scrollback history. + + If tmux history was cleared or trimmed before the cursor anchor, + the tool returns the current visible pane with ``lines_missed=True`` + and a fresh cursor. Malformed cursors, cursors for a different + pane, pane death, and pane respawn fail with ``ToolError`` so + agents do not accidentally observe the wrong process. + + Parameters + ---------- + cursor : str, optional + Opaque cursor returned by a prior ``capture_since`` call. When + omitted, the tool captures the current visible screen and + starts a new cursor. + pane_id : str, optional + Pane ID (e.g. '%1'). Optional when ``cursor`` is supplied; the + cursor carries the original pane id. + session_name : str, optional + Session name for pane resolution. + session_id : str, optional + Session ID (e.g. '$1') for pane resolution. 
+ window_id : str, optional + Window ID for pane resolution. + max_lines : int or None + Maximum number of lines to return. Defaults to + :data:`CAPTURE_SINCE_DEFAULT_MAX_LINES`. Pass ``None`` to + disable line truncation. + max_bytes : int or None + Maximum UTF-8 bytes to return across ``lines``. Defaults to + :data:`CAPTURE_SINCE_DEFAULT_MAX_BYTES`. Pass ``None`` to + disable byte truncation. + socket_name : str, optional + tmux socket name. + + Returns + ------- + CaptureSinceResult + Structured lines, cursor, elapsed time, and truncation/loss + metadata. + """ + _validate_limits(max_lines, max_bytes) + decoded = _decode_cursor(cursor) if cursor is not None else None + if decoded is not None and not any( + value is not None for value in (pane_id, session_name, session_id, window_id) + ): + pane_id = decoded.pane_id + + server = _get_server(socket_name=socket_name) + pane = _resolve_pane( + server, + pane_id=pane_id, + session_name=session_name, + session_id=session_id, + window_id=window_id, + ) + assert pane.pane_id is not None + + if decoded is not None and pane.pane_id != decoded.pane_id: + msg = ( + f"cursor pane {decoded.pane_id} does not match requested pane " + f"{pane.pane_id}" + ) + raise ToolError(msg) + + start_time = time.monotonic() + if decoded is None: + read = await asyncio.to_thread(_read_stable_visible, pane) + else: + read = await asyncio.to_thread(_read_delta, pane, decoded) + + limited = _limit_lines(read.lines, max_lines=max_lines, max_bytes=max_bytes) + elapsed = time.monotonic() - start_time + return CaptureSinceResult( + pane_id=pane.pane_id, + cursor=_build_cursor(pane.pane_id, read.state, read.cursor_rows), + lines=limited.lines, + elapsed_seconds=round(elapsed, 3), + lines_missed=read.lines_missed, + truncated=limited.truncated, + truncated_lines=limited.truncated_lines, + truncated_bytes=limited.truncated_bytes, + ) diff --git a/src/libtmux_mcp/tools/pane_tools/state.py b/src/libtmux_mcp/tools/pane_tools/state.py new file mode 100644 
index 00000000..ea3ced55 --- /dev/null +++ b/src/libtmux_mcp/tools/pane_tools/state.py @@ -0,0 +1,88 @@ +"""Shared tmux pane state helpers for read and wait tools.""" + +from __future__ import annotations + +import typing as t + +from fastmcp.exceptions import ToolError + +if t.TYPE_CHECKING: + from libtmux.pane import Pane + + +class _PaneState(t.NamedTuple): + """Per-read snapshot of tmux pane grid and lifecycle state. + + Read in one ``display-message`` round-trip so callers avoid + growing subprocess cost linearly with every required format field. + ``history_size + cursor_y`` gives the absolute tmux grid row of + the current cursor. + + Wire format parsed by :func:`_read_pane_state`:: + + #{history_size}|#{cursor_y}|#{pane_height}|#{pane_pid}|#{pane_dead} + + Fields are ``|``-separated: the first three are non-negative + integers, ``pane_pid`` is a decimal PID string, and ``pane_dead`` + is the literal ``"0"`` or ``"1"``. + """ + + history_size: int + cursor_y: int + pane_height: int + pane_pid: str + pane_dead: bool + + +def _read_pane_state(pane: Pane) -> _PaneState: + """Return a :class:`_PaneState` snapshot for ``pane``. + + Combines the tmux state reads needed by wait and incremental + capture tools into a single ``display-message`` call. ``pane_pid`` + and ``pane_dead`` surface respawn-pane and pane-death events that + invalidate cursor or baseline anchors. 
+ """ + stdout = pane.display_message( + "#{history_size}|#{cursor_y}|#{pane_height}|#{pane_pid}|#{pane_dead}", + get_text=True, + ) + raw = stdout[0] if stdout else "0|0|0||0" + hs, cy, sy, pid, dead = raw.split("|", 4) + return _PaneState( + history_size=int(hs), + cursor_y=int(cy), + pane_height=int(sy), + pane_pid=pid, + pane_dead=dead == "1", + ) + + +def _raise_if_pane_lifecycle_changed( + pane: Pane, state: _PaneState, baseline_pid: str +) -> None: + """Raise ``ToolError`` when a cursor or wait baseline is invalid.""" + if state.pane_dead: + msg = f"pane {pane.pane_id} died; cursor/baseline anchor is no longer valid" + raise ToolError(msg) + if state.pane_pid != baseline_pid: + msg = ( + f"pane {pane.pane_id} was respawned " + f"(pid {baseline_pid} -> {state.pane_pid}); " + "cursor/baseline anchor is no longer valid" + ) + raise ToolError(msg) + + +def _read_history_limit(pane: Pane) -> int: + """Read the pane's ``history-limit`` once. + + Fixed at pane creation — a retroactive ``set-option history-limit`` + only takes effect in tmux 3.7+ (commit ``e7b1575``); older versions + require a new pane. Safe to cache for the lifetime of a single + wait or capture operation. Kept separate from :func:`_read_pane_state` + so per-tick reads do not pay for a value that never changes between + ticks. 
+ """ + stdout = pane.display_message("#{history_limit}", get_text=True) + raw = stdout[0] if stdout else "0" + return int(raw) diff --git a/src/libtmux_mcp/tools/pane_tools/wait.py b/src/libtmux_mcp/tools/pane_tools/wait.py index 1ac951b6..1ffa7b15 100644 --- a/src/libtmux_mcp/tools/pane_tools/wait.py +++ b/src/libtmux_mcp/tools/pane_tools/wait.py @@ -21,9 +21,11 @@ ContentChangeResult, WaitForTextResult, ) - -if t.TYPE_CHECKING: - from libtmux.pane import Pane +from libtmux_mcp.tools.pane_tools.state import ( + _raise_if_pane_lifecycle_changed, + _read_history_limit, + _read_pane_state, +) logger = logging.getLogger(__name__) @@ -99,76 +101,6 @@ async def _maybe_log( return -class _PaneState(t.NamedTuple): - """Per-tick snapshot of pane state used by wait tools. - - Read in one ``display-message`` round-trip so the loop costs two - subprocesses per tick (state + capture) instead of growing - linearly with each new field. ``|`` is the field separator — - history/cursor/height are integers, ``pane_pid`` is a numeric PID - string, and ``pane_dead`` is the literal ``"0"``/``"1"`` flag. - """ - - history_size: int - cursor_y: int - pane_height: int - pane_pid: str - pane_dead: bool - - -def _read_pane_state(pane: Pane) -> _PaneState: - """Return a :class:`_PaneState` snapshot for ``pane``. - - Combines the per-tick reads ``wait_for_text`` needs into a single - ``display-message`` call. ``history_size + cursor_y`` gives the - absolute grid anchor at entry; ``pane_height`` gates the bottom- - row capture clip; ``pane_pid`` and ``pane_dead`` surface - respawn-pane and pane-death events that invalidate the baseline. 
- """ - stdout = pane.display_message( - "#{history_size}|#{cursor_y}|#{pane_height}|#{pane_pid}|#{pane_dead}", - get_text=True, - ) - raw = stdout[0] if stdout else "0|0|0||0" - hs, cy, sy, pid, dead = raw.split("|", 4) - return _PaneState( - history_size=int(hs), - cursor_y=int(cy), - pane_height=int(sy), - pane_pid=pid, - pane_dead=dead == "1", - ) - - -def _raise_if_pane_lifecycle_changed( - pane: Pane, state: _PaneState, baseline_pid: str -) -> None: - """Raise ``ToolError`` when a wait baseline no longer describes the pane.""" - if state.pane_dead: - msg = f"pane {pane.pane_id} died during wait" - raise ToolError(msg) - if state.pane_pid != baseline_pid: - msg = ( - f"pane {pane.pane_id} was respawned during wait " - f"(pid {baseline_pid} -> {state.pane_pid}); " - "baseline anchor no longer valid" - ) - raise ToolError(msg) - - -def _read_history_limit(pane: Pane) -> int: - """Read the pane's ``history-limit`` once. - - Fixed at pane creation (retroactive change only lands in tmux 3.7+), - so the result is safe to cache for the lifetime of a wait. Kept out - of :func:`_read_pane_state` so the per-tick read doesn't pay for a - value that never changes between polls. 
- """ - stdout = pane.display_message("#{history_limit}", get_text=True) - raw = stdout[0] if stdout else "0" - return int(raw) - - @handle_tool_errors_async async def wait_for_text( pattern: str, diff --git a/tests/test_pane_tools.py b/tests/test_pane_tools.py index 7d28705f..793d6a30 100644 --- a/tests/test_pane_tools.py +++ b/tests/test_pane_tools.py @@ -10,6 +10,7 @@ from libtmux.test.retry import retry_until from libtmux_mcp.models import ( + CaptureSinceResult, ContentChangeResult, PaneContentMatch, PaneSnapshot, @@ -18,6 +19,7 @@ ) from libtmux_mcp.tools.pane_tools import ( capture_pane, + capture_since, clear_pane, display_message, enter_copy_mode, @@ -147,6 +149,469 @@ def test_capture_pane_max_lines_none_disables_truncation( assert "untrunc_line_19" in result +# --------------------------------------------------------------------------- +# capture_since tests +# --------------------------------------------------------------------------- + + +def _signal_after_shell_payload(mcp_server: Server, pane: Pane, payload: str) -> None: + """Run ``payload`` in ``pane`` and wait for shell completion.""" + import asyncio + import uuid + + from libtmux_mcp.tools.wait_for_tools import wait_for_channel + + channel = f"mcp_test_capture_since_{uuid.uuid4().hex[:16]}" + pane.send_keys(f"{payload}; tmux wait-for -S {channel}", enter=True) + asyncio.run( + wait_for_channel( + channel=channel, + timeout=5.0, + socket_name=mcp_server.socket_name, + ) + ) + + +def test_capture_since_first_call_returns_visible_screen_and_cursor( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Initial ``capture_since`` call captures visible content and returns a cursor.""" + import asyncio + + marker = "CAPTURE_SINCE_INITIAL_4xz" + _signal_after_shell_payload(mcp_server, mcp_pane, f"echo {marker}") + + result = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + assert isinstance(result, CaptureSinceResult) + assert result.pane_id == 
mcp_pane.pane_id + assert result.cursor + assert result.lines_missed is False + assert result.truncated is False + assert any(marker in line for line in result.lines) + + +def test_capture_since_followup_returns_only_new_output( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Follow-up calls return content written after the previous cursor.""" + import asyncio + + old_marker = "CAPTURE_SINCE_OLD_71k" + new_marker = "CAPTURE_SINCE_NEW_71k" + _signal_after_shell_payload(mcp_server, mcp_pane, f"echo {old_marker}") + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + _signal_after_shell_payload(mcp_server, mcp_pane, f"echo {new_marker}") + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + third = asyncio.run( + capture_since( + cursor=second.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert any(new_marker in line for line in second.lines) + assert not any(old_marker in line for line in second.lines) + assert third.lines == [] + assert second.pane_id == mcp_pane.pane_id + + +def test_capture_since_follows_anchor_into_retained_history( + mcp_server: Server, mcp_pane: Pane +) -> None: + """A cursor remains exact after its anchor scrolls into history.""" + import asyncio + + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + pane_height = int(mcp_pane.display_message("#{pane_height}", get_text=True)[0]) + markers = [f"CAPTURE_SINCE_SCROLL_{index:02d}" for index in range(pane_height + 8)] + payload = "printf '%s\\n' " + " ".join(markers) + + _signal_after_shell_payload(mcp_server, mcp_pane, payload) + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert second.lines_missed is False + assert any(markers[-1] in line for line in second.lines) + + +def 
test_capture_since_marks_lines_missed_after_history_limit_trim( + mcp_server: Server, mcp_pane: Pane +) -> None: + """History-limit trims return visible content with ``lines_missed=True``. + + Floods past ``history-limit`` then clears history to guarantee the + cursor anchor is destroyed. The flood alone is not deterministic — + tmux 3.6 retains enough of the original prompt that + ``_find_unique_cursor_match`` re-anchors on the surviving hash. + """ + import asyncio + + mcp_pane.session.cmd("set-option", "-g", "history-limit", "20") + fresh_pane = mcp_pane.window.split() + assert fresh_pane.pane_id is not None + + def _hlimit_locked() -> bool: + raw = fresh_pane.display_message("#{history_limit}", get_text=True) + return bool(raw) and int(raw[0]) == 20 + + try: + retry_until(_hlimit_locked, 5, raises=True) + # Build scrollback so the cursor has history_size > 0. + _signal_after_shell_payload( + mcp_server, + fresh_pane, + "for i in $(seq 1 25); do printf 'PREFILL_%03d\\n' \"$i\"; done", + ) + first = asyncio.run( + capture_since( + pane_id=fresh_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + payload = ( + "for i in $(seq 1 120); do printf 'CAPTURE_SINCE_TRIM_%03d\\n' \"$i\"; done" + ) + _signal_after_shell_payload(mcp_server, fresh_pane, payload) + # Guarantee anchor destruction: tmux 3.6 can retain the original + # prompt hash in scrollback even after flooding past history-limit. 
+ fresh_pane.cmd("clear-history") + _signal_after_shell_payload( + mcp_server, fresh_pane, "echo CAPTURE_SINCE_TRIM_DONE" + ) + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert second.lines_missed is True + assert any("CAPTURE_SINCE_TRIM" in line for line in second.lines) + finally: + fresh_pane.kill() + + +def test_capture_since_reports_same_row_rewrite( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Carriage-return rewrites on the cursor row are reported as new content.""" + import asyncio + + script = ( + "printf OLD_REWRITE_CAPTURE_SINCE; " + "IFS= read -r line; " + "printf '\\r%s' \"$line\"; " + "sleep 60" + ) + mcp_pane.respawn(kill=True, shell=f"sh -c '{script}'") + retry_until( + lambda: any( + "OLD_REWRITE_CAPTURE_SINCE" in line for line in mcp_pane.capture_pane() + ), + 5, + raises=True, + ) + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + mcp_pane.send_keys("NEW_REWRITE_CAPTURE_SINCE", enter=True) + retry_until( + lambda: any( + "NEW_REWRITE_CAPTURE_SINCE" in line for line in mcp_pane.capture_pane() + ), + 5, + raises=True, + ) + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert any("NEW_REWRITE_CAPTURE_SINCE" in line for line in second.lines) + + +def test_capture_since_truncates_with_structured_metadata( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Line and byte limits tail-preserve output without in-band markers.""" + import asyncio + + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + payload = "; ".join(f"echo CAPTURE_SINCE_TRUNC_{i}" for i in range(6)) + _signal_after_shell_payload(mcp_server, mcp_pane, payload) + + line_limited = asyncio.run( + capture_since( + cursor=first.cursor, + max_lines=2, + socket_name=mcp_server.socket_name, + ) + ) + byte_limited = 
asyncio.run( + capture_since( + cursor=first.cursor, + max_bytes=32, + socket_name=mcp_server.socket_name, + ) + ) + + assert line_limited.truncated is True + assert line_limited.truncated_lines > 0 + assert len(line_limited.lines) == 2 + assert any("CAPTURE_SINCE_TRUNC_5" in line for line in line_limited.lines) + assert not line_limited.lines[0].startswith("[... truncated") + + assert byte_limited.truncated is True + assert byte_limited.truncated_bytes > 0 + assert len("\n".join(byte_limited.lines).encode()) <= 32 + + +def test_capture_since_rejects_malformed_cursor(mcp_server: Server) -> None: + """Malformed cursors fail clearly instead of falling back to another pane.""" + import asyncio + + with pytest.raises(ToolError, match="invalid capture_since cursor"): + asyncio.run( + capture_since( + cursor="not-a-valid-cursor", + socket_name=mcp_server.socket_name, + ) + ) + + +def test_capture_since_rejects_cursor_for_different_pane( + mcp_server: Server, mcp_session: Session, mcp_pane: Pane +) -> None: + """A cursor cannot be replayed against a different pane.""" + import asyncio + + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + other_pane = mcp_session.active_window.split() + try: + with pytest.raises(ToolError, match="cursor pane"): + asyncio.run( + capture_since( + cursor=first.cursor, + pane_id=other_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + finally: + other_pane.kill() + + +def test_capture_since_marks_lines_missed_after_history_clear( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Lost history returns current visible content with ``lines_missed=True``.""" + import asyncio + + fill = "; ".join(f"echo CAPTURE_SINCE_HISTORY_{i}" for i in range(40)) + _signal_after_shell_payload(mcp_server, mcp_pane, fill) + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + mcp_pane.cmd("clear-history") + 
_signal_after_shell_payload(mcp_server, mcp_pane, "echo CAPTURE_SINCE_AFTER_CLEAR") + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert second.lines_missed is True + assert any("CAPTURE_SINCE_AFTER_CLEAR" in line for line in second.lines) + assert second.cursor + + +def test_capture_since_marks_lines_missed_after_clear_history_with_resize( + mcp_server: Server, mcp_pane: Pane +) -> None: + """clear-history + pane resize still detects anchor loss. + + Regression: ``_cursor_anchor_lost`` used a ``pane_height`` guard + that returned False when the pane grew after ``clear-history``, + masking the complete history wipe. + """ + import asyncio + + fresh_pane = mcp_pane.window.split() + assert fresh_pane.pane_id is not None + + try: + fill = "; ".join(f"echo RESIZE_CLEAR_{i}" for i in range(40)) + _signal_after_shell_payload(mcp_server, fresh_pane, fill) + first = asyncio.run( + capture_since( + pane_id=fresh_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + + fresh_pane.cmd("clear-history") + assert fresh_pane.pane_height is not None + fresh_pane.set_height(int(fresh_pane.pane_height) + 3) + _signal_after_shell_payload(mcp_server, fresh_pane, "echo AFTER_RESIZE_CLEAR") + second = asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + + assert second.lines_missed is True + assert any("AFTER_RESIZE_CLEAR" in line for line in second.lines) + finally: + fresh_pane.kill() + + +def test_capture_since_rejects_respawned_pane_cursor( + mcp_server: Server, mcp_pane: Pane +) -> None: + """Pane respawn invalidates the cursor's process identity.""" + import asyncio + + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + mcp_pane.respawn(kill=True, shell="sleep 60") + + with pytest.raises(ToolError, match="respawned"): + asyncio.run( + capture_since( + cursor=first.cursor, + 
socket_name=mcp_server.socket_name, + ) + ) + + +def test_capture_since_rejects_dead_pane_cursor( + mcp_server: Server, mcp_session: Session, mcp_pane: Pane +) -> None: + """Pane death invalidates the cursor instead of returning stale content.""" + import asyncio + + first = asyncio.run( + capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + ) + window = mcp_session.active_window + window.cmd("set-option", "-w", "remain-on-exit", "on") + try: + mcp_pane.respawn(kill=True, shell="true") + + def _is_dead() -> bool: + out = mcp_pane.cmd("display-message", "-p", "#{pane_dead}").stdout + return bool(out) and out[0].strip() == "1" + + retry_until(_is_dead, 5, raises=True) + with pytest.raises(ToolError, match="died"): + asyncio.run( + capture_since( + cursor=first.cursor, + socket_name=mcp_server.socket_name, + ) + ) + finally: + window.cmd("set-option", "-wu", "remain-on-exit") + + +def test_capture_since_does_not_block_event_loop( + mcp_server: Server, mcp_pane: Pane, monkeypatch: pytest.MonkeyPatch +) -> None: + """``capture_since`` runs blocking tmux captures off the event loop.""" + import asyncio + import time as _time + + from libtmux.pane import Pane as _LibtmuxPane + + def _slow_capture(self: _LibtmuxPane, *_a: object, **_kw: object) -> list[str]: + _time.sleep(0.15) + return [] + + monkeypatch.setattr(_LibtmuxPane, "capture_pane", _slow_capture) + + async def _drive() -> int: + ticks = 0 + stop = asyncio.Event() + + async def _ticker() -> None: + nonlocal ticks + while not stop.is_set(): + ticks += 1 + await asyncio.sleep(0.01) + + async def _capture() -> None: + try: + await capture_since( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + finally: + stop.set() + + await asyncio.gather(_ticker(), _capture()) + return ticks + + ticks = asyncio.run(_drive()) + assert ticks >= 8, ( + f"ticker advanced only {ticks} times — capture_since is blocking the event loop" + ) + + def test_get_pane_info(mcp_server: 
Server, mcp_pane: Pane) -> None: """get_pane_info returns detailed pane info.""" result = get_pane_info( @@ -1530,7 +1995,7 @@ async def run() -> WaitForTextResult: await respawn_after_delay() return await wait_task - with pytest.raises(ToolError, match="respawned during wait"): + with pytest.raises(ToolError, match="respawned"): asyncio.run(run()) @@ -1553,7 +2018,7 @@ def _is_dead() -> bool: retry_until(_is_dead, 3, raises=True) - with pytest.raises(ToolError, match="died during wait"): + with pytest.raises(ToolError, match="died"): asyncio.run( wait_for_text( pattern="anything", @@ -2657,7 +3122,7 @@ async def run() -> ContentChangeResult: await respawn_after_delay() return await wait_task - with pytest.raises(ToolError, match="respawned during wait"): + with pytest.raises(ToolError, match="respawned"): asyncio.run(run()) @@ -2691,7 +3156,7 @@ async def run() -> ContentChangeResult: await exit_after_delay() return await wait_task - with pytest.raises(ToolError, match="died during wait"): + with pytest.raises(ToolError, match="died"): asyncio.run(run()) @@ -3157,6 +3622,7 @@ def test_respawn_pane_advertises_destructive_non_idempotent() -> None: # agents don't have to re-parse strings. Regression guard: # any future change that flattens one of these back to ``str`` # will break this test and force an explicit review. 
+ ("capture_since", "CaptureSinceResult"), ("get_pane_info", "PaneInfo"), ("snapshot_pane", "PaneSnapshot"), ], @@ -3166,12 +3632,19 @@ def test_pane_read_tools_return_pydantic_models( ) -> None: """Read-heavy pane tools return their Pydantic model, not ``str``.""" tools: dict[str, t.Callable[..., t.Any]] = { + "capture_since": capture_since, "get_pane_info": get_pane_info, "snapshot_pane": snapshot_pane, } - result = tools[tool_name]( + maybe_result = tools[tool_name]( pane_id=mcp_pane.pane_id, socket_name=mcp_server.socket_name, ) + if tool_name == "capture_since": + import asyncio + + result = asyncio.run(t.cast(t.Coroutine[t.Any, t.Any, t.Any], maybe_result)) + else: + result = maybe_result assert type(result).__name__ == expected_type assert hasattr(result, "model_dump"), "expected a Pydantic BaseModel instance" diff --git a/tests/test_prompts.py b/tests/test_prompts.py index 253d6a21..389d37c9 100644 --- a/tests/test_prompts.py +++ b/tests/test_prompts.py @@ -141,6 +141,16 @@ def test_interrupt_gracefully_does_not_escalate() -> None: assert "do NOT escalate automatically" in text +def test_diagnose_failing_pane_uses_capture_since_for_repeated_reads() -> None: + """Diagnosis recipe routes repeated observation to ``capture_since``.""" + from libtmux_mcp.prompts.recipes import diagnose_failing_pane + + text = diagnose_failing_pane(pane_id="%1") + assert "snapshot_pane" in text + assert "capture_since" in text + assert "cursor" in text + + def test_build_dev_workspace_does_not_deadlock_on_screen_grabbers() -> None: """``build_dev_workspace`` guides post-launch waits to content-change. 
diff --git a/tests/test_server.py b/tests/test_server.py index 65db7d1d..759aada9 100644 --- a/tests/test_server.py +++ b/tests/test_server.py @@ -148,6 +148,7 @@ def test_base_instructions_surface_flagship_read_tools() -> None: """ assert "display_message" in _BASE_INSTRUCTIONS assert "snapshot_pane" in _BASE_INSTRUCTIONS + assert "capture_since" in _BASE_INSTRUCTIONS def test_base_instructions_prefer_wait_over_poll() -> None: @@ -161,6 +162,7 @@ def test_base_instructions_prefer_wait_over_poll() -> None: for command-completion synchronization. """ assert "wait_for_channel" in _BASE_INSTRUCTIONS + assert "capture_since" in _BASE_INSTRUCTIONS assert "wait_for_text" in _BASE_INSTRUCTIONS assert "wait_for_content_change" in _BASE_INSTRUCTIONS # The channel primitive should be named before the fallbacks so an @@ -487,7 +489,7 @@ def test_readonly_hint_visible_only_on_readonly_tier( # Discovery anchors — BM25 lexicon and alwaysLoad meta hints # --------------------------------------------------------------------------- -#: The six high-traffic discovery anchors. ToolSearch BM25-ranks +#: The high-traffic discovery anchors. ToolSearch BM25-ranks #: against tool ``description`` (FastMCP's griffe parser hands the #: leading paragraph in), so the anchors carry a buried-synonym #: lexicon plus an inline anti-trigger to widen the indexed surface @@ -500,6 +502,7 @@ def test_readonly_hint_visible_only_on_readonly_tier( "snapshot_pane", "search_panes", "capture_pane", + "capture_since", ] ) @@ -519,6 +522,7 @@ def test_readonly_hint_visible_only_on_readonly_tier( [ "send_keys", "capture_pane", + "capture_since", "snapshot_pane", "paste_text", "get_pane_info",