🤖 fix: add trace dumps for System 1 memory writer failures#1969

Open
ThomasK33 wants to merge 14 commits into main from memory-agent-hev5


ThomasK33 commented Jan 27, 2026

Summary

Adds richer debug visibility for the System 1 memory writer:

  • Emits explicit debug-level skip reasons when scheduling is gated (System 1 disabled / child workspace / invalid interval).
  • When a run completes without writing (no memory_write tool call) or times out, writes a full execution trace (prompt/messages, step results, tool executions) to debug_obj/ for offline inspection.

Background

We saw cases where the memory writer appeared to “not run” (no [system1][memory] logs) or would exit with timedOut: true and no memory_write call. Most lifecycle logs were debug-only, and failures lacked enough detail to understand what the model did.

Implementation

  • MemoryWriterPolicy: adds explicit debug logs for early-return gates; threads triggerMessageId into the runner.
  • system1MemoryWriter: captures per-attempt messages, onStepFinish results, and tool execution records; dumps a JSON trace to ~/.mux*/debug_obj/<workspaceId>/system1_memory_writer/ in debug mode when the run fails to update memory.
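
A minimal sketch of the failure-path dump decision described above. The type names, field names, and file-name scheme are hypothetical and chosen for illustration; only the `memory_write` tool name, the `timedOut` flag, and the `debug_obj/<workspaceId>/system1_memory_writer/` directory come from this description.

```typescript
// Hypothetical shapes for illustration; the real trace fields may differ.
interface ToolExecutionRecord {
  toolName: string;
  input: unknown;
  output: unknown;
}

interface MemoryWriterTrace {
  workspaceId: string;
  triggerMessageId: string;
  timedOut: boolean;
  messages: unknown[]; // prompt/messages captured for the attempt
  stepResults: unknown[]; // results collected from onStepFinish
  toolExecutions: ToolExecutionRecord[];
}

// A run counts as having updated memory only if memory_write was executed.
function wroteMemory(trace: MemoryWriterTrace): boolean {
  return trace.toolExecutions.some((t) => t.toolName === "memory_write");
}

// Dump only in debug mode, and only when the run timed out or never wrote.
function shouldDumpTrace(trace: MemoryWriterTrace, debugEnabled: boolean): boolean {
  return debugEnabled && (trace.timedOut || !wroteMemory(trace));
}

// Builds <muxHome>/debug_obj/<workspaceId>/system1_memory_writer/<stamp>.json.
// The timestamp-based file name is an assumption made for this sketch.
function traceDumpPath(muxHome: string, trace: MemoryWriterTrace, now: Date): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return [muxHome, "debug_obj", trace.workspaceId, "system1_memory_writer", `${stamp}.json`].join("/");
}
```

Keeping the decision in a pure function makes the "success path is unchanged" claim easy to unit test: a trace containing a `memory_write` execution and no timeout never produces a dump.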

Validation

  • make static-check
  • unit tests updated for the new triggerMessageId parameter

Risks

Low. Changes are scoped to debug logging and failure-path diagnostics; the writer’s normal success path is unchanged.


📋 Implementation Plan

Debug: System 1 memory writer not running / missing [system1][memory] logs

Context / Why

You set Settings → System 1 → “Write Interval (messages)” to 1, expecting the background System 1 memory writer to run after each assistant turn and update the project memory file. You’re not seeing any [system1][memory] log lines and it looks like the writer never runs.

From the current code:

  • The memory writer is scheduled on assistant stream end.
  • It is gated by System 1 experiment enabled (experiments.system1 === true) and only runs for root workspaces (it skips child/subtask workspaces).
  • Most lifecycle logs are debug-level, so you won’t see them unless you enable debug logging. The only info logs are emitted by the memory_write tool when a memory file is actually written.

Evidence (repo)

  • Scheduling + gates:
    • src/node/services/aiService.ts stores a context at stream start and calls memoryWriterPolicy.onAssistantStreamEnd(ctx) on stream-end.
    • src/node/services/system1/memoryWriterPolicy.ts returns early unless:
      • ctx.system1Enabled === true
      • !ctx.parentWorkspaceId
      • the interval, config.taskSettings.memoryWriterIntervalMessages (defaults to 2), is valid
    • The scheduler persists state to sessions/<workspaceId>/system1-memory-writer-state.json.
  • Experiment flag plumbing:
    • src/browser/utils/messages/sendOptions.ts passes experiments.system1 based on isExperimentEnabled(EXPERIMENT_IDS.SYSTEM_1).
    • src/common/orpc/schemas/stream.ts defines experiments.system1 in the RPC schema.
  • Logging:
    • src/node/services/log.ts supports MUX_LOG_LEVEL / MUX_DEBUG.
    • src/node/services/tools/memory_write.ts logs info on successful writes.
  • Memory file location:
    • src/node/services/tools/memoryCommon.ts writes to <muxHome>/memories/<projectId>.md.
    • src/common/constants/paths.ts defines <muxHome>: ~/.mux, ~/.mux-dev (when NODE_ENV=development), or MUX_ROOT.
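
The path rules in the evidence above can be sketched as follows. The function names are hypothetical, and the precedence (MUX_ROOT first, then NODE_ENV) is an assumption; the actual logic lives in src/common/constants/paths.ts.

```typescript
// Minimal env shape so the sketch does not depend on Node typings.
type Env = Record<string, string | undefined>;

// Assumption: a non-empty MUX_ROOT overrides the NODE_ENV-based default.
function resolveMuxHome(env: Env, homeDir: string): string {
  const override = env.MUX_ROOT;
  if (override !== undefined && override !== "") return override;
  const dirName = env.NODE_ENV === "development" ? ".mux-dev" : ".mux";
  return `${homeDir}/${dirName}`;
}

// <muxHome>/memories/<projectId>.md
function memoryFilePath(muxHome: string, projectId: string): string {
  return `${muxHome}/memories/${projectId}.md`;
}

// <muxHome>/sessions/<workspaceId>/system1-memory-writer-state.json
function schedulerStatePath(muxHome: string, workspaceId: string): string {
  return `${muxHome}/sessions/${workspaceId}/system1-memory-writer-state.json`;
}
```

For example, `schedulerStatePath(resolveMuxHome({}, "/home/u"), "abc")` yields `/home/u/.mux/sessions/abc/system1-memory-writer-state.json`.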

Approach A (recommended): add explicit “no changes” debug logs (~5–20 LoC)

  1. In src/node/services/system1/memoryWriterPolicy.ts, change the existing debug line
    "[system1][memory] Memory writer produced no output" to something explicit like
    "[system1][memory] Memory writer exited without updating memory (no memory_write call)".
    • Keep it debug (so it only appears when MUX_LOG_LEVEL=debug / MUX_DEBUG=1).
    • Preserve the existing fields (timedOut, system1Model).
  2. (Optional) Add debug logs on the early-return gates in onAssistantStreamEnd:
    • System 1 disabled (ctx.system1Enabled !== true)
    • child workspace (ctx.parentWorkspaceId)
    • invalid interval
  3. Verify:
    • Run mux with MUX_LOG_LEVEL=debug (or MUX_DEBUG=1), then send a message.
    • You should now see an explicit log even when the writer doesn’t write.
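
A sketch of what step 2 could look like, with each early-return gate producing a skip reason that can be passed to a debug logger. `StreamEndContext`, `GateResult`, and `checkGates` are hypothetical names, and treating "valid interval" as "integer ≥ 1" is an assumption; the real checks are in memoryWriterPolicy.ts.

```typescript
// Hypothetical subset of the stream-end context used by the gates.
interface StreamEndContext {
  system1Enabled?: boolean;
  parentWorkspaceId?: string;
}

type GateResult = { run: true } | { run: false; reason: string };

// Returns the first gate that blocks scheduling, or { run: true }.
function checkGates(ctx: StreamEndContext, interval: unknown): GateResult {
  if (ctx.system1Enabled !== true) {
    return { run: false, reason: "System 1 disabled" };
  }
  if (ctx.parentWorkspaceId) {
    return { run: false, reason: "child workspace" };
  }
  // Assumption: a valid interval is an integer of at least 1.
  if (typeof interval !== "number" || !Number.isInteger(interval) || interval < 1) {
    return { run: false, reason: "invalid interval" };
  }
  return { run: true };
}
```

Returning a reason instead of a bare boolean keeps the policy testable and lets the caller decide the log level and message prefix.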

Approach B: user-side diagnosis (0 LoC)

1) Confirm the hard gates

  1. Enable System 1: Settings → Experiments → System 1 must be ON.
    • The interval alone is not enough; the backend hard-returns when experiments.system1 !== true.
  2. Use a root workspace:
    • The memory writer does not run for child/subtask workspaces (parentWorkspaceId is set).
  3. Let the assistant finish normally:
    • The writer is scheduled on stream-end. If you interrupt/abort streams, it won’t schedule.

2) Confirm the interval persisted to disk

  1. Find your mux home directory:
    • Default: ~/.mux/
    • Dev: ~/.mux-dev/ (when NODE_ENV=development)
    • Override: $MUX_ROOT
  2. Check <muxHome>/config.json contains:
    • taskSettings.memoryWriterIntervalMessages: 1
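
If you would rather script this check than eyeball the file, a small helper along these lines works on the raw contents of config.json. `readStoredInterval` is a hypothetical name, and the sketch assumes the key sits at `taskSettings.memoryWriterIntervalMessages` exactly as written above.

```typescript
// Returns the persisted interval, or undefined when the key is absent or
// not a number (for example, when the setting was dropped before saving).
function readStoredInterval(configJson: string): number | undefined {
  const config = JSON.parse(configJson) as
    | { taskSettings?: { memoryWriterIntervalMessages?: unknown } }
    | null;
  const value = config?.taskSettings?.memoryWriterIntervalMessages;
  return typeof value === "number" ? value : undefined;
}
```

An `undefined` result with the UI showing 1 would suggest the value never reached disk, in which case the backend would use its default interval instead.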

3) Verify scheduling without logs (state file)

Even if you can’t see stdout/stderr, the scheduler persists a state file per workspace.

  1. Identify your workspaceId:
    • Start by listing sessions sorted by recent activity:
      • ls -lt <muxHome>/sessions | head
    • Then inspect candidate folders’ metadata.json until you find the workspace you’re testing.
  2. Inspect:
    • <muxHome>/sessions/<workspaceId>/system1-memory-writer-state.json

Expected behavior with interval=1:

  • After each assistant completion:
    • lastRunStartedAt updates (timestamp)
    • lastRunMessageId updates
    • turnsSinceLastRun returns to 0
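
The expected behavior can be modeled with a small state transition. This is a simplified sketch under assumptions, not the scheduler's actual code: `WriterState` and `onAssistantTurn` are hypothetical names, the timestamp is assumed to be numeric, and in-flight tracking and crash recovery are left out.

```typescript
// Field names follow the state file; the numeric timestamp is an assumption.
interface WriterState {
  turnsSinceLastRun: number;
  lastRunStartedAt?: number;
  lastRunMessageId?: string;
}

// Counts one completed assistant turn and starts a run once the interval is hit.
function onAssistantTurn(
  state: WriterState,
  interval: number,
  messageId: string,
  now: number,
): WriterState {
  const turns = state.turnsSinceLastRun + 1;
  if (turns < interval) {
    // Not due yet: only the counter changes.
    return { ...state, turnsSinceLastRun: turns };
  }
  // Due: record the run and reset the counter.
  return { turnsSinceLastRun: 0, lastRunStartedAt: now, lastRunMessageId: messageId };
}
```

With an interval of 1 every turn takes the second branch, which is why all three fields should change after each assistant completion.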

Interpretation:

  • No file / never updates → onAssistantStreamEnd isn’t running (most often: System 1 experiment still OFF, or streams are aborting).
  • File updates but lastRunStartedAt never set → interval not being read as 1, or the run is permanently “in flight”.
  • lastRunStartedAt changes but no memory file changes → writer ran but didn’t call memory_write (model/tool support or credentials issue).
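
The interpretation rules above can be written down as a comparison of two state-file snapshots taken before and after an assistant turn. `diagnose` and the label strings are hypothetical, `null` stands for "file missing", and the sketch assumes a numeric `lastRunStartedAt`. A `lastRunStartedAt` that is set but unchanged is treated like one that was never set, which goes slightly beyond what the rules state.

```typescript
interface WriterStateSnapshot {
  lastRunStartedAt?: number;
  lastRunMessageId?: string;
  turnsSinceLastRun?: number;
}

type Diagnosis =
  | "scheduler-not-running" // no file, or file never updates
  | "run-not-starting" // file updates, but no new run is recorded
  | "ran-without-memory-write" // run recorded, memory file untouched
  | "ok";

function diagnose(
  before: WriterStateSnapshot | null,
  after: WriterStateSnapshot | null,
  memoryFileChanged: boolean,
): Diagnosis {
  // Comparing serialized snapshots assumes a stable key order in the file.
  if (after === null || JSON.stringify(before) === JSON.stringify(after)) {
    return "scheduler-not-running";
  }
  if (after.lastRunStartedAt === undefined || after.lastRunStartedAt === before?.lastRunStartedAt) {
    return "run-not-starting";
  }
  return memoryFileChanged ? "ok" : "ran-without-memory-write";
}
```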

4) Verify memory output

  • Memory files live at <muxHome>/memories/*.md.
  • After an assistant turn, check whether anything in that folder updates:
    • ls -lt <muxHome>/memories | head

5) Enable the right logs (debug)

To see the scheduler/runner messages (which include skip reasons), you need debug logging:

  • MUX_LOG_LEVEL=debug or MUX_DEBUG=1

Important: for the desktop app, you must start mux from a shell that has those env vars (GUI launches won’t inherit shell env). Once enabled, look for:

  • [system1][memory] Skipping memory writer ...
  • [system1][memory] Memory writer completed
  • [system1][memory] Memory writer failed
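
A quick way to sanity-check the environment and to pull the relevant lines out of captured output. Both helpers are hypothetical, and the exact-match semantics for the env vars are an assumption; src/node/services/log.ts may accept other values.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: debug logging is on when either variable has exactly this value.
function isDebugLoggingEnabled(env: Env): boolean {
  return env.MUX_LOG_LEVEL === "debug" || env.MUX_DEBUG === "1";
}

// Keeps only the memory-writer lines from captured stdout/stderr text.
function memoryWriterLines(logText: string): string[] {
  return logText.split("\n").filter((line) => line.includes("[system1][memory]"));
}
```

If `isDebugLoggingEnabled` returns false for the environment the app was actually launched with, an empty result from `memoryWriterLines` says nothing about whether the writer ran.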

Debug dumps (only in debug mode) land in:

  • <muxHome>/debug_obj/

6) Ensure the writer’s model supports tool calling

The memory writer uses:

  • agentAiDefaults.system1_memory_writer.modelString (if set)
  • otherwise it falls back to the workspace’s current chat model (ctx.modelString)

If that model/provider can’t do tool calling (or lacks credentials), the writer may never call memory_write.
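
The fallback described above amounts to the following. `AgentAiDefaults` and `resolveWriterModel` are hypothetical names for this sketch, and treating a blank string as "not set" is an assumption.

```typescript
// Hypothetical subset of the agent defaults config.
interface AgentAiDefaults {
  system1_memory_writer?: { modelString?: string };
}

// Prefers the explicitly configured writer model, else the chat model.
function resolveWriterModel(
  defaults: AgentAiDefaults | undefined,
  ctxModelString: string,
): string {
  const configured = defaults?.system1_memory_writer?.modelString;
  return configured !== undefined && configured.trim() !== "" ? configured : ctxModelString;
}
```

The practical consequence is that switching the workspace's chat model to one without tool calling can silently stop memory writes unless an explicit writer model is configured.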

Actions:

  • Set an explicit tool-capable model for system1_memory_writer (via Settings → Agents or by editing config.json).
  • Confirm provider API keys are configured for that model.

Why you might see “nothing” even when it’s running

Most memory-writer logs are debug. The only info line is from the memory_write tool after the model calls it. If the model never calls memory_write (tool calling unsupported / credentials missing / etc.), you’ll see no [system1][memory] output without enabling debug logging.


Generated with mux • Model: openai:gpt-5.2 • Thinking: high • Cost: $54.20

Change-Id: I426dd4cd261e9433e202ecd0e486b66808995ec2
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I88cbfc5705ad9edfaecf3d85e6a7a572377986ac
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I219643590ddc44473d8a5143fa6bb5d846b064ea
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id33f0e0fb271b639165567580b69e2473a69f991
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I62082e281752b1c882b66edf1455f8b34373f8c9
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ia11857921ddc9bd55a69aa799aa84e1d4a7471f8
Signed-off-by: Thomas Kosiewski <tk@coder.com>
- Persist per-workspace memory writer scheduling state in the session dir
- Recover from crash mid-run by treating incomplete runs as due
- Add tests for restart + crash recovery

Signed-off-by: Thomas Kosiewski <tk@coder.com>

---
_Generated with [`mux`](https://github.com/coder/mux) • Model: openai:gpt-5.2 • Thinking: xhigh_

Change-Id: Id0d36129afbbcdc7ad7b9d793f94b1bb114f2347
Reintroduce the Settings → System 1 section for taskSettings-backed tuning (bash output compaction + memory writer interval), without re-adding model/thinking defaults.

Also wire memoryWriterIntervalMessages through the config ORPC schemas so it persists instead of being dropped.

Change-Id: I5bcbd804e9faf4a34539efd8b3edb6061c2dc111
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ie2cc52b8822ce16eb29faa822f9e6a87f9533ac1
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: If2176d7f0812b3bcd4b1d588686dc4ee89da0b70
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I33dcd899ff2f0d06e10da3f50f0fffb8db9d1c05
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I16acb3fc32ef7b68984c139b05ade57d60f92f92
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: I55ff460bc36ead2792e7f88d438b2c351b80d4d3
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Ia51d7360c8f45a1a90e563e44c76dcd5782d78c4
Signed-off-by: Thomas Kosiewski <tk@coder.com>

Labels

enhancement New feature or functionality
