feat(experimental): SessionManager with chainable builder and multi-session example #1167
mattzcarey wants to merge 2 commits into main
    export { AgentContextProvider } from "./providers/agent-context";
    export { SessionManager, type SessionInfo, type SessionManagerOptions } from "./manager";
🔴 Missing changeset for new public API exports in packages/agents
The PR adds SessionManager, SessionInfo, and SessionManagerOptions as new public exports from agents/experimental/memory/session (packages/agents/src/experimental/memory/session/index.ts:69-73), which is a published npm entry point. Per the repository's AGENTS.md rules: "Changes to packages/ that affect the public API or fix bugs need a changeset." No changeset file for this change exists in .changeset/ — the only existing changeset (workflow-local-dev-lifecycle.md) is for an unrelated workflow change.
Intentionally skipped — this is under experimental/ which is explicitly excluded from changeset requirements. The API will change between releases.
/bonk review
Review: SessionManager with chainable builder and multi-session example
Thorough review of the SessionManager implementation, multi-session example, and test suite. Most of the issues Devin flagged in earlier rounds appear to have been addressed (parentSessionId in fork, FTS table creation in _ensureTable, upsert check-then-update, deleteMessages scoped to session, fork _touch, env.d.ts references). The remaining open issue and a few additional observations follow.
Open issue: Shared compactFn closure leaks previousSummary across sessions
Devin's most recent unresolved finding is correct. In experimental/multi-session-agent/src/server.ts:33-39, a single compactFn is created at the class field level. createCompactFunction (packages/agents/src/experimental/memory/utils/compaction-helpers.ts:435) captures a previousSummary closure variable that persists across all calls. When session A is compacted, previousSummary is set to session A's summary. When session B is later compacted, session A's summary is passed to buildSummaryPrompt as prior context, cross-contaminating sessions.
The fix is straightforward — use a per-session compact function. For example:
    private compactFns = new Map<string, (msgs: UIMessage[]) => Promise<UIMessage[]>>();

    private getCompactFn(chatId: string) {
      let fn = this.compactFns.get(chatId);
      if (!fn) {
        fn = createCompactFunction({
          summarize: (prompt) =>
            generateText({ model: this.getAI(), prompt }).then((r) => r.text),
          protectHead: 1,
          minTailMessages: 2,
          tailTokenBudget: 100
        });
        this.compactFns.set(chatId, fn);
      }
      return fn;
    }

Then at line 135: this.getCompactFn(chatId)(history) instead of this.compactFn(history).
This is in the example, not the core SDK, so it won't affect library consumers, but since this example is user-facing learning material it should demonstrate correct patterns.
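To make the leak concrete, here is a minimal, self-contained sketch. `makeCompact` is a hypothetical stand-in for a compaction factory that remembers its previous summary in a closure; it is not the real `createCompactFunction`, and the "summary" is just the joined message text.

```typescript
// Hypothetical stand-in for a compaction factory: like the closure described
// above, it remembers the previous summary between calls.
type Compact = (messages: string[]) => string;

function makeCompact(): Compact {
  let previousSummary = ""; // captured by the closure, survives across calls
  return (messages) => {
    const prompt = previousSummary
      ? `prior: ${previousSummary} | new: ${messages.join(", ")}`
      : `new: ${messages.join(", ")}`;
    previousSummary = messages.join(", "); // pretend the summary is the joined text
    return prompt;
  };
}

// Shared instance: session B's prompt contains session A's summary.
const shared = makeCompact();
shared(["A1", "A2"]);
const leaked = shared(["B1"]);

// Per-session instances, created lazily and keyed by session ID.
const perSession = new Map<string, Compact>();
function getCompact(sessionId: string): Compact {
  let fn = perSession.get(sessionId);
  if (!fn) {
    fn = makeCompact();
    perSession.set(sessionId, fn);
  }
  return fn;
}

getCompact("a")(["A1", "A2"]);
const isolated = getCompact("b")(["B1"]);

console.log(leaked); // prior: A1, A2 | new: B1
console.log(isolated); // new: B1
```

One thing to keep in mind with the map approach: entries should probably be dropped when a session is deleted, otherwise the map grows with every session the object has ever seen.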
Minor observations
- README is stale: `experimental/multi-session-agent/README.md` shows a `ChatAgent` class name and a single-session `Session` API example (with `compaction.tokenThreshold`, `session.append()`, `session.getMessages()`). None of this matches the actual `MultiSessionAgent` + `SessionManager` implementation. Since users will look at this to understand the example, it should be updated or at minimum show the `SessionManager.create()` builder pattern.
- `search()` FTS error handling: `SessionManager.search()` at `manager.ts:347-360` uses `MATCH` directly on user input. FTS5 `MATCH` syntax is strict: characters like `*`, `"`, or unbalanced parens in the query will throw a SQLite error. The `tools()` method wraps the call in `try/catch` (line 384), which is good, but direct callers of `manager.search(query)` would get an unhandled exception. Worth considering input sanitization or a try/catch in `search()` itself.
- `delete()` doesn't call `_ensureReady()`: `SessionManager.delete()` at `manager.ts:185-189` calls `this.getSession(sessionId).clearMessages()`, which triggers `_ensureReady()` indirectly via `getSession()`, but if the session table hasn't been created yet (edge case with builder pattern), the subsequent `DELETE FROM assistant_sessions` could fail. Not a practical concern since you'd normally create before deleting, but inconsistent with other lifecycle methods that call `_ensureReady()` explicitly.
- `UIMessage` `parts` type assertion: In `server.ts:104`, the tool part construction uses `as unknown as UIMessage["parts"][number]` to force-cast. This works but hides type incompatibilities. If the `UIMessage` type changes upstream, this won't produce a compile error.
- `compactAndSplit` summary message role: At `manager.ts:317`, the summary is inserted as an `assistant` role message with a `"[Context from previous session]"` prefix. This is a design choice, but it means the new session starts with an assistant message before any user message, which some LLM APIs may handle differently.
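For the `search()` observation, one possible sanitization is to turn every whitespace-separated term into a quoted FTS5 string, so operators and punctuation in user input are treated as literal text. The sketch below is an assumption about how this could be done, not what `manager.ts` does; `toFtsQuery` is a hypothetical helper name. It relies on the FTS5 rule that a double quote inside a quoted string is escaped by doubling it.

```typescript
// Convert free-form user input into an FTS5 query made only of quoted strings.
// Returns null for blank input so the caller can skip the MATCH entirely.
function toFtsQuery(input: string): string | null {
  const terms = input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    // Inside an FTS5 string, an embedded double quote is written as two quotes.
    .map((term) => `"${term.replace(/"/g, '""')}"`);
  return terms.length > 0 ? terms.join(" ") : null;
}

console.log(toFtsQuery("foo* (bar")); // "foo*" "(bar"
console.log(toFtsQuery("   ")); // null
```

The trade-off is that users lose access to FTS5 operators such as prefix matching; if those are wanted, wrapping the query in a try/catch inside `search()` is the simpler option.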
Test coverage
The 16 DO-backed tests plus the pure unit tests provide solid coverage of the core functionality. The iterative compaction test (testManagerCompactionFiltersSyntheticIds) is particularly valuable — it validates the tricky synthetic ID filtering across multiple compaction rounds.
Overall
The SessionManager is a clean, well-structured registry layer on top of the Session API. The builder pattern mirrors Session.create() nicely, and the lazy init / table creation strategy is sound. The main actionable item is the shared compactFn closure in the example.
          { sessionAffinity: this.sessionAffinity }
        );
      }
The shared compactFn closure captures a previousSummary variable (compaction-helpers.ts:435) that persists across all calls. When session A is compacted, previousSummary gets set to session A's summary. When session B is compacted later, buildSummaryPrompt receives session A's summary as context, cross-contaminating sessions.
Fix: use a per-session compact function via Map<string, ...> keyed by chatId, and lazily create in a getCompactFn(chatId) helper. Then call this.getCompactFn(chatId)(history) at line 135 instead of this.compactFn(history).
```bash
npm install
npm start
```
The README still shows the old ChatAgent class with a single-session Session API (compaction.tokenThreshold, session.append(), session.getMessages()). Since the example now uses MultiSessionAgent + SessionManager, this should be updated to reflect the actual builder pattern and multi-session architecture.
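Review posted on PR #1167. Summary of what was submitted: a PR-level review comment covering the full analysis, and two inline comments:
- the shared compactFn closure in the example's server.ts
- the stale README in experimental/multi-session-agent

Additional minor observations in the PR-level comment:
- search() FTS error handling
- delete() not calling _ensureReady() explicitly
- the UIMessage parts type assertion in server.ts
- the compactAndSplit summary message role

The previously flagged issues (fork parentSessionId, FTS table in _ensureTable, upsert semantics, deleteMessages scoping, fork _touch, env.d.ts naming) were all confirmed as resolved in subsequent commits.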
    delete(sessionId: string): void {
      this.getSession(sessionId).clearMessages();
      this.agent.sql`DELETE FROM assistant_sessions WHERE id = ${sessionId}`;
      this._sessions.delete(sessionId);
    }
🟡 SessionManager.delete() leaks context block data in cf_agents_context_blocks
When SessionManager.delete(sessionId) is called, it clears messages (assistant_messages), compactions (assistant_compactions), FTS entries (assistant_fts), and the session row (assistant_sessions). However, it does not clean up the namespaced context block data stored in cf_agents_context_blocks. Each session creates rows with keys like memory_<sessionId> and _system_prompt_<sessionId> (via AgentContextProvider at packages/agents/src/experimental/memory/session/providers/agent-context.ts:49-53). These rows are orphaned on delete and accumulate over time as sessions are created and destroyed within the same Durable Object.
Prompt for agents
In packages/agents/src/experimental/memory/session/manager.ts, the delete() method at lines 185-189 should also clean up context block data from cf_agents_context_blocks. The context blocks are stored with namespaced keys like `memory_<sessionId>` and `_system_prompt_<sessionId>` (constructed in Session._ensureReady at packages/agents/src/experimental/memory/session/session.ts:110 and session.ts:128). After calling clearMessages(), add SQL to delete all context block rows whose label ends with `_<sessionId>`. For example:
this.agent.sql`DELETE FROM cf_agents_context_blocks WHERE label LIKE ${'%_' + sessionId}`;
Alternatively, have each Session track its context block keys and expose a clearContext() method that the manager can call during deletion.
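One caveat with the suggested `LIKE` pattern: in SQLite, `_` matches any single character and `%` matches any sequence, so `'%_' + sessionId` also matches labels belonging to a different session whose ID merely ends with `sessionId`. A hedged sketch of building an escaped pattern follows; `escapeLike` and `contextLabelPattern` are hypothetical helper names, and the `ESCAPE` usage in the comment is an assumption about how the query would be written, not code from the repository.

```typescript
// Escape the LIKE wildcards (% and _) and the escape character itself.
function escapeLike(value: string): string {
  return value.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Pattern for labels ending in a literal underscore followed by the session ID.
// Assumed usage: ... WHERE label LIKE <pattern> ESCAPE '\'
function contextLabelPattern(sessionId: string): string {
  return `%\\_${escapeLike(sessionId)}`;
}

console.log(contextLabelPattern("abc")); // %\_abc
```

Even with escaping, a suffix match still catches a session whose ID itself ends in `_abc`, which is one reason the alternative of having each Session track and clear its own keys may be the safer route.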
- Add session_id filter to recursive CTE steps in getHistory and getPathLength to prevent cross-session data leakage
- Add depth guard (10000) to recursive CTEs to prevent runaway queries
- Deduplicate SqlProvider interface (import from ./agent in agent-context)
- Update role column in updateMessage alongside content
- Populate createdAt in searchMessages via JOIN to assistant_messages
- Add comment explaining assistant_config table is reserved for #1167
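The first two items in the commit message above (session scoping and a depth guard on the recursive walk) can be illustrated with an in-memory analogue. This is only a sketch of the idea, not the SQL used in the package; `MessageRow` and `pathToRoot` are hypothetical names.

```typescript
interface MessageRow {
  id: string;
  sessionId: string;
  parentId: string | null;
}

// Walk parent pointers from a leaf towards the root, stopping when the chain
// leaves the session (scope filter) or when maxDepth is reached (cycle guard).
function pathToRoot(
  rows: Map<string, MessageRow>,
  sessionId: string,
  leafId: string,
  maxDepth = 10000
): string[] {
  const path: string[] = [];
  let current = rows.get(leafId);
  while (current && current.sessionId === sessionId && path.length < maxDepth) {
    path.push(current.id);
    current = current.parentId === null ? undefined : rows.get(current.parentId);
  }
  return path.reverse();
}

const sampleRows = new Map<string, MessageRow>([
  ["x1", { id: "x1", sessionId: "s2", parentId: null }],
  ["m2", { id: "m2", sessionId: "s1", parentId: "x1" }],
  ["m3", { id: "m3", sessionId: "s1", parentId: "m2" }]
]);

// m2's parent belongs to another session, so the walk stops there.
console.log(pathToRoot(sampleRows, "s1", "m3")); // [ 'm2', 'm3' ]
```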
Core session primitives for the agents package:
- Session class with tree-structured messages, compaction overlays, context blocks, FTS5 search
- Chainable builder: Session.create(agent).withContext(...).withCachedPrompt()
- AgentSessionProvider: SQLite-backed with session_id scoping, content column (Think-compatible)
- AgentContextProvider: key-value block storage
- ContextProvider interface for custom backends (R2, KV, etc.)
- Compaction utilities with head/tail protection
- Iterative compaction: newer overlays supersede older ones at same fromId
- session-memory example with builder API
…ession example

Registry of named sessions with lifecycle, branching, compaction, search, and tools.

**Chainable API**: `SessionManager.create(agent).withContext(...).withCachedPrompt()` auto-wires per-session namespaced providers via `Session.create().forSession(id)`.

**Lifecycle**: create, get, list, delete, rename with metadata tracking.

**Branching**: `fork(sessionId, atMessageId, name)` with parentSessionId lineage.

**Compaction**: needsCompaction, addCompaction, compactAndSplit.

**Search**: cross-session FTS via `manager.tools()` → `session_search`.

**Tool separation**: session.tools() → update_context, manager.tools() → session_search.

**multi-session-agent example**: sidebar with chat list, create/delete, cross-session search. Uses SessionManager builder with Workers AI (Kimi K2.5).
    if (this._cachedPrompt === true) {
      s.withCachedPrompt();
    } else if (this._cachedPrompt) {
      s.withCachedPrompt(this._cachedPrompt);
    }
🟡 SessionManager.withCachedPrompt(provider) shares one prompt store across all sessions
When a concrete ContextProvider is passed to SessionManager.withCachedPrompt(provider), it is forwarded verbatim to every Session created by getSession() (line 147). This means all sessions share the same prompt store: when session A calls freezeSystemPrompt(), it writes its prompt to the shared store; when session B later calls freezeSystemPrompt(), it reads session A's prompt from that same store (packages/agents/src/experimental/memory/session/context.ts:244-246). This silently causes cross-session prompt contamination. In contrast, when called without arguments (withCachedPrompt()), each session correctly gets its own namespaced AgentContextProvider.
Prompt for agents
In packages/agents/src/experimental/memory/session/manager.ts, the getSession() method at lines 144-148 passes a shared ContextProvider to all sessions when _cachedPrompt is a concrete provider. Either:
1. Remove the ability to pass a concrete provider to SessionManager.withCachedPrompt() (only allow the boolean true case), since per-session namespacing is required, OR
2. Wrap the user-provided provider with per-session namespacing, e.g. by creating a new ContextProvider per session that delegates to the user's provider with a session-scoped key.
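Option 2 could look roughly like the sketch below. The real `ContextProvider` interface is not shown in this thread, so `KeyValueStore` is a hypothetical minimal shape and `forSession` a hypothetical wrapper; the key suffix mirrors the `_system_prompt_<sessionId>` naming mentioned earlier.

```typescript
// Hypothetical minimal store shape; the real ContextProvider may differ.
interface KeyValueStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}

// Simple in-memory store standing in for a user-supplied provider.
function createMapStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    get: (key) => data.get(key) ?? null,
    set: (key, value) => {
      data.set(key, value);
    }
  };
}

// Wrap a shared store so every key is suffixed with the session ID.
function forSession(store: KeyValueStore, sessionId: string): KeyValueStore {
  const scoped = (key: string): string => `${key}_${sessionId}`;
  return {
    get: (key) => store.get(scoped(key)),
    set: (key, value) => store.set(scoped(key), value)
  };
}

const sharedStore = createMapStore();
const sessionA = forSession(sharedStore, "a");
const sessionB = forSession(sharedStore, "b");

sessionA.set("_system_prompt", "prompt for A");
console.log(sessionA.get("_system_prompt")); // prompt for A
console.log(sessionB.get("_system_prompt")); // null
```

With a wrapper like this the manager could keep accepting a concrete provider while each session still reads and writes only its own keys.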
…ons coupling

Remove non-core features from Think to establish a clean foundation for building back up. The class goes from 950 lines to 720, and the package drops ~2500 lines of code.

What was removed from Think class:
- Multi-session API: getSessions, createSession, switchSession, deleteSession, renameSession, getCurrentSessionId, getSession, getHistory, getMessageCount, _sessionId field
- Extension coupling: getWorkspace(), _hostReadFile/Write/Delete/List methods, Workspace type import from @cloudflare/shell
- SessionManager dependency and tree-structured storage (branching, compaction, recursive CTE walks)

What replaces sessions:
- Inline single-session SQLite persistence with 6 private methods: _initStorage, _loadMessages, _appendMessage, _upsertMessage, _clearMessages, _deleteMessages
- Simple flat table: assistant_messages (id, role, content, created_at)
- No session_id column, no parent_id, no compaction tables
- Placeholder until Matt's Session API lands (PRs #1166, #1167, #1169)

What was deleted:
- src/transport.ts (AgentChatTransport): spoke stream-event/stream-done protocol that Think's server doesn't emit. Think speaks CF_AGENT protocol; clients should use useAgentChat or WebSocketChatTransport.
- src/session/storage.ts (367 lines): tree-structured SQLite storage
- src/session/index.ts (417 lines): SessionManager with branching, compaction, truncation utilities
- Package exports: ./transport, ./session removed from package.json and build entries

What stays unchanged:
- Core chat loop: getModel, getSystemPrompt, getTools, getMaxSteps, assembleContext, onChatMessage, onChatError
- CF_AGENT wire protocol (same as AIChatAgent)
- TurnQueue for turn serialization
- Abort/cancel/clear, maxPersistedMessages, configure/getConfig
- Sub-agent chat() RPC method, /get-messages HTTP endpoint
- Fibers (flag-controlled), multi-tab broadcast
- Extensions package exports (standalone, not coupled to Think class)
- Workspace tools and execute tool exports

Assistant example:
- Moved AgentChatTransport into examples/assistant/src/transport.ts (it's specific to the orchestrator relay pattern, not general-purpose)
- Updated server.ts to use getMessages() instead of removed APIs

Tests: 89 pass (down from 127). Removed 38 tests covering multi-session management, SessionManager internals, truncation utilities, and e2e session management. Core chat, error handling, abort, sanitization, row-size, streaming, persistence, clear, cancel, and agentic loop tests all pass without modification.

Made-with: Cursor
…re, rewrite assistant example (#1237)

* Extract shared chat primitives into agents/chat, deduplicate ai-chat and think

Move streaming, sanitization, and protocol primitives from @cloudflare/ai-chat and @cloudflare/think into a shared `agents/chat` export, eliminating code duplication and establishing a common foundation for both packages.

- **message-builder.ts**: `applyChunkToParts()` and `StreamChunkData` type, moved from ai-chat. Think's forked copy (with its "DRIFT RISK" warning) is deleted. Both packages now import from `agents/chat`.
- **sanitize.ts**: `sanitizeMessage()` and `enforceRowSizeLimit()`, extracted from Think's standalone implementation. ai-chat wraps these with its extra steps (provider-executed tool truncation, compaction metadata, subclass hook). Think's local `sanitize.ts` is deleted.
- **stream-accumulator.ts**: New `StreamAccumulator` class that wraps `applyChunkToParts` and handles metadata chunk types (`start`, `finish`, `message-metadata`, `error`) that the builder doesn't cover. Signals domain-specific concerns (tool approval early persist, cross-message tool updates) via `ChunkAction` returns so callers handle them contextually.
- **protocol.ts**: `CHAT_MESSAGE_TYPES` constants for the `cf_agent_chat_*` wire protocol. Think no longer defines local string constants.
- **message-reconciler.ts** (new): Pure functions `reconcileMessages()`, `resolveToolMergeId()`, and `assistantContentKey()` extracted from `AIChatAgent`. The three reconciliation strategies (tool output merge, exact ID + content-key ID matching, toolCallId dedup) are now testable independently of the agent class. ~200 lines removed from index.ts.
- **react.tsx**: Replaced `activeStreamRef` + `flushActiveStreamToMessages` (65 lines) with a `StreamAccumulator` ref. The broadcast/resume path in `onAgentMessage` now uses `accumulator.applyChunk()` + functional `setMessages((prev) => accumulator.mergeInto(prev))` updaters instead of manual parts accumulation and metadata merging.
- **index.ts**: `_sanitizeMessageForPersistence` now calls shared `sanitizeMessage()` then applies ai-chat-specific steps. Removed `_stripOpenAIMetadata`, `_mergeIncomingWithServerState`, `_reconcileAssistantIdsWithServerState`, `_hasToolCallPart`, `_resolveMessageForToolMerge`, `_assistantMessageContentKey`. Replaced `_byteLength` / `ROW_MAX_BYTES` statics with shared imports. Net reduction of ~370 lines.
- `_streamResult()` and `chat()` now use `StreamAccumulator` instead of manual `applyChunkToParts` + metadata switch/case blocks.
- Deleted `src/message-builder.ts` (365-line fork) and `src/sanitize.ts` (198 lines). Removed `./message-builder` from package.json exports and build script entries.
- Wire protocol constants import from `CHAT_MESSAGE_TYPES` in agents/chat.
- `examples/assistant/src/client.tsx`: Updated import from deleted `@cloudflare/think/message-builder` to `agents/chat`.
- `design/chat-shared-layer.md`: New design doc covering architecture, module APIs, key decisions, tradeoffs, and deferred work.
- `design/AGENTS.md` and `design/think.md`: Updated to reference the new design doc.
- **Public API**: No changes to `@cloudflare/ai-chat` or `@cloudflare/ai-chat/react` exports. `AIChatAgent`, `useAgentChat`, `MessageType`, and all types/hooks remain identical.
- **`_streamSSEReply`**: Server-side SSE streaming still uses `applyChunkToParts` directly (tightly coupled with `_streamingMessage` shared reference used by `hasPendingInteraction`, `_messagesForClientSync`, and `_findAndUpdateToolPart`).
- **Hibernation/resume paths**: `_persistOrphanedStream`, `ResumableStream`, `_restoreRequestContext`, message loading from SQLite all unchanged.
- **Turn queue / concurrency**: `_runExclusiveChatTurn`, `_chatEpoch`, concurrency policies all unchanged. TurnQueue extraction deferred.
- **Wire protocol**: No changes to message types or payloads.
- ai-chat workers: 36 files, 340 tests passing
- ai-chat React: 1 file, 38 tests passing
- Think: 7 files, 126 tests passing
- All 69 projects typecheck successfully
- npm run check passes (format, lint, typecheck, export checks)

17 files changed, ~120 insertions, ~1578 deletions (net -1458 lines)

Made-with: Cursor

* Add unit tests for StreamAccumulator and MessageReconciler

Comprehensive pure-function unit tests for the two modules extracted in the shared layer refactoring. Both test suites run in Node (no Workers pool overhead) and cover every public method, chunk type, reconciliation strategy, and edge case.

## StreamAccumulator tests (57 tests, 8ms)

New vitest project `chat` in packages/agents with Node environment. Coverage:
- Text lifecycle (start/delta/end, resumption fallback, multiple segments)
- Reasoning lifecycle (start/delta/end, resumption fallback)
- File, source-url, source-document chunks
- step-start / start-step aliasing
- data-* chunks (append, reconcile by type+id, transient skip, no-id append)
- Full tool lifecycle (input-start/delta/available/error, output-available/error/denied)
- tool-approval-request action signaling (with and without matching part)
- Cross-message tool update detection (output-available, output-error, preliminary flag)
- Metadata chunks (start, finish with finishReason, message-metadata, finish-step, error)
- Continuation mode (existing parts, messageId preservation, metadata carry-forward)
- toMessage() snapshots (immutability, metadata inclusion/omission)
- mergeInto() (replace by ID, append, continuation fallback, empty array, exact match preferred over backward walk, input immutability)
- Unrecognized chunk types

## MessageReconciler tests (27 tests)

Added to packages/ai-chat/src/tests/ (existing Workers pool).
Coverage:
- Tool output merge (input-available, approval-requested, approval-responded, passthrough, non-assistant, already output-available)
- ID reconciliation (exact match, content-key match, identical content #1008, tool-bearing skip, empty server, no matches, sanitize callback)
- Composed stages (tool merge + ID reconciliation in single call)
- Mixed tool + text parts treated as tool-bearing
- resolveToolMergeId (matching/non-matching/same ID, non-assistant, empty server, multiple tool parts first-match-wins)
- assistantContentKey (assistant/user/system, sanitize callback, identical/different content)

Made-with: Cursor

* Extract TurnQueue into agents/chat for shared turn serialization

Add TurnQueue, a serial async queue with generation-based invalidation, to packages/agents/src/chat/. Both AIChatAgent and Think now use it to serialize chat turns, replacing duplicated scheduling machinery.

TurnQueue provides:
- Promise-chain FIFO serialization via enqueue()
- Generation counter with reset() for invalidating stale queued work
- Auto-skip: turns enqueued under an older generation are not executed
- Active request tracking (activeRequestId, isActive)
- waitForIdle() that resolves when the queue fully drains
- Per-generation queued count tracking for concurrency policy decisions

AIChatAgent refactoring:
- Remove _chatTurnQueue, _activeChatTurnRequestId, _chatEpoch, and _queuedChatTurnCountsByEpoch fields (replaced by _turnQueue)
- _runExclusiveChatTurn becomes a thin wrapper around _turnQueue.enqueue that handles the onChatResponse drain and merge-map cleanup
- Add onStale callback so the WS submit path can send done:true for turns auto-skipped after clear
- All 17 _chatEpoch references replaced with _turnQueue.generation
- Concurrency policies (drop/latest/merge/debounce) stay in AIChatAgent

Think adoption:
- Replace _clearGeneration with _turnQueue, giving Think proper turn serialization for the first time (concurrent chat()/WS calls could previously interleave on this.messages)
- Wrap chat() and _handleChatRequest bodies in _turnQueue.enqueue()
- Align _handleClear order with AIChatAgent: reset queue before aborting controllers so queued turns can't slip through between abort and reset

Tests: 17 TurnQueue unit tests covering serialization, generation skip, waitForIdle, queuedCount, error propagation, reset during active execution, and explicit generation options. All 405 ai-chat and 126 Think integration tests pass without modification.

Made-with: Cursor

* Extract broadcast stream state machine into agents/chat

Add broadcastTransition, a pure state machine that manages the StreamAccumulator lifecycle for broadcast/resume streams. This is the client-side path where a tab observes a stream owned by another tab or resumed after reconnect, as opposed to the transport-owned path that feeds directly into useChat.

The state machine has two states (idle, observing) and three event types (response, resume-fallback, clear). It handles:
- Accumulator creation on first chunk for a new stream
- Continuation context: walks currentMessages backwards to find the last assistant message's parts/metadata for append-mode streams
- Chunk application via StreamAccumulator.applyChunk
- Replay suppression: replay=true chunks accumulate silently, replayComplete triggers a batch flush
- Done/error cleanup: final merge into messages, transition to idle
- Stream replacement: new streamId creates a fresh accumulator
- Clear: immediate transition to idle, no messages update

Refactor useAgentChat in react.tsx:
- Replace accumulatorRef + activeStreamIdRef (two refs) with a single streamStateRef holding the discriminated union
- CF_AGENT_USE_CHAT_RESPONSE handler: 95 lines of interleaved accumulator management reduced to ~30 lines of parse + dispatch
- CF_AGENT_STREAM_RESUMING fallback: manual accumulator creation replaced with broadcastTransition resume-fallback event
- CF_AGENT_CHAT_CLEAR: now resets broadcast state alongside messages
- Fix: body parse errors no longer prevent done/error handling (removed early return in catch block that skipped stream cleanup)

The module lives in agents/chat alongside StreamAccumulator and TurnQueue, with no React or WebSocket dependencies. Think can adopt it when it needs multi-tab broadcast support; the wire protocol is already aligned (CHAT_MESSAGE_TYPES).

Tests: 16 unit tests covering all transitions. All 405 ai-chat integration tests pass without modification.

Made-with: Cursor

* Strip Think to minimal core: single-session, no transport, no extensions coupling

Remove non-core features from Think to establish a clean foundation for building back up. The class goes from 950 lines to 720, and the package drops ~2500 lines of code.

What was removed from Think class:
- Multi-session API: getSessions, createSession, switchSession, deleteSession, renameSession, getCurrentSessionId, getSession, getHistory, getMessageCount, _sessionId field
- Extension coupling: getWorkspace(), _hostReadFile/Write/Delete/List methods, Workspace type import from @cloudflare/shell
- SessionManager dependency and tree-structured storage (branching, compaction, recursive CTE walks)

What replaces sessions:
- Inline single-session SQLite persistence with 6 private methods: _initStorage, _loadMessages, _appendMessage, _upsertMessage, _clearMessages, _deleteMessages
- Simple flat table: assistant_messages (id, role, content, created_at)
- No session_id column, no parent_id, no compaction tables
- Placeholder until Matt's Session API lands (PRs #1166, #1167, #1169)

What was deleted:
- src/transport.ts (AgentChatTransport): spoke stream-event/stream-done protocol that Think's server doesn't emit. Think speaks CF_AGENT protocol; clients should use useAgentChat or WebSocketChatTransport.
- src/session/storage.ts (367 lines): tree-structured SQLite storage
- src/session/index.ts (417 lines): SessionManager with branching, compaction, truncation utilities
- Package exports: ./transport, ./session removed from package.json and build entries

What stays unchanged:
- Core chat loop: getModel, getSystemPrompt, getTools, getMaxSteps, assembleContext, onChatMessage, onChatError
- CF_AGENT wire protocol (same as AIChatAgent)
- TurnQueue for turn serialization
- Abort/cancel/clear, maxPersistedMessages, configure/getConfig
- Sub-agent chat() RPC method, /get-messages HTTP endpoint
- Fibers (flag-controlled), multi-tab broadcast
- Extensions package exports (standalone, not coupled to Think class)
- Workspace tools and execute tool exports

Assistant example:
- Moved AgentChatTransport into examples/assistant/src/transport.ts (it's specific to the orchestrator relay pattern, not general-purpose)
- Updated server.ts to use getMessages() instead of removed APIs

Tests: 89 pass (down from 127). Removed 38 tests covering multi-session management, SessionManager internals, truncation utilities, and e2e session management. Core chat, error handling, abort, sanitization, row-size, streaming, persistence, clear, cancel, and agentic loop tests all pass without modification.

Made-with: Cursor

* Move ResumableStream to agents/chat and wire stream resumption into Think

Move ResumableStream from ai-chat to agents/chat as a shared primitive, alongside TurnQueue, StreamAccumulator, and broadcastTransition. Add resume protocol constants (STREAM_RESUMING, STREAM_RESUME_ACK, STREAM_RESUME_REQUEST, STREAM_RESUME_NONE) to CHAT_MESSAGE_TYPES.

This gives Think page-refresh survival for in-flight streams, the single biggest feature gap between Think and AIChatAgent for production use. When a user refreshes during streaming, the client reconnects, the server replays all buffered chunks from SQLite, then continues with live chunks.
ResumableStream move:
- Replace MessageType import with CHAT_MESSAGE_TYPES from protocol.ts
- Export ResumableStream and SqlTaggedTemplate from agents/chat barrel
- Update ai-chat imports (index.ts + tests/worker.ts) to use agents/chat
- Delete ai-chat/src/resumable-stream.ts (now in agents/chat)

Think resume wiring (~70 lines added):
- _resumableStream field initialized in onStart after storage
- onConnect wrapped: sends STREAM_RESUMING when active stream exists
- onClose wrapped: cleans up _pendingResumeConnections
- _handleProtocol: STREAM_RESUME_REQUEST (notify or NONE), STREAM_RESUME_ACK (remove from pending, replay chunks, handle orphaned)
- _streamResult: start() before loop, storeChunk() per chunk, complete() on success, markError() on error/finally
- _broadcastChat: new method that excludes _pendingResumeConnections from live broadcasts (used for MSG_CHAT_RESPONSE only)
- _notifyStreamResuming: adds connection to pending set, sends RESUMING
- _persistOrphanedStream: reconstructs partial assistant message from stored chunks via StreamAccumulator after DO hibernation
- _handleClear: clears resumable stream and pending connections

Protocol flow:
1. Stream starts → ResumableStream.start() tracks in SQLite
2. Each chunk → storeChunk() buffers, _broadcastChat() excludes pending
3. Client reconnects → onConnect sends STREAM_RESUMING
4. Client ACKs → replayChunks() sends stored chunks with replay: true
5. If orphaned (DO hibernated) → replayChunks returns streamId, _persistOrphanedStream reconstructs and persists the partial message
6. Stream ends → complete(), pendingResumeConnections cleared

What's NOT included (Think doesn't need these):
- Auto-continuation for client-side tools (no client tools yet)
- _awaitingStreamStartConnections (no deferred continuations)
- Tool approval / interaction machinery

All 405 ai-chat tests and 89 Think tests pass.
Made-with: Cursor

* Add client-side tool support to Think with debounce-based auto-continuation

Move ClientToolSchema and createToolsFromClientSchemas to agents/chat as shared primitives. Add TOOL_RESULT, TOOL_APPROVAL, MESSAGE_UPDATED protocol constants. Update ai-chat to re-export from agents/chat (no public API break).

Think now supports client-defined tools that execute in the browser:

Protocol handling:
- CF_AGENT_TOOL_RESULT: find tool part by toolCallId, update state to output-available (or output-error with errorText), persist, broadcast MESSAGE_UPDATED. Idempotent: skips tools already in output-available or output-denied state. Accepts results from input-available, approval-requested, and approval-responded states.
- CF_AGENT_TOOL_APPROVAL: transition to approval-responded (approved) or output-denied (rejected). Preserves existing approval data (e.g. approval.id for model providers that need it).
- clientTools parsed from chat request body, stored in _lastClientTools, passed to onChatMessage via ChatMessageOptions.clientTools.
- clientTools from CF_AGENT_TOOL_RESULT update stored tools (for reconnect scenarios where the client sends tools with the result).
- Explicit empty clientTools array clears stored tools; absent field leaves them unchanged.

Auto-continuation (debounce-based):
- When autoContinue is true on a tool result or approval, schedule a continuation turn after a 50ms debounce window.
- Multiple rapid tool results coalesce into a single continuation.
- _runAutoContinuation wraps in keepAliveWhile to prevent DO hibernation during long LLM calls.
- Continuation calls onChatMessage with the stored clientTools so the model can call client tools again in multi-step flows.

Default onChatMessage updated:
- Merges client tools via createToolsFromClientSchemas alongside server tools from getTools() and per-turn tools from options.tools.
- ChatMessageOptions extended with optional clientTools field so subclasses that override onChatMessage can access client schemas.

Clear cleanup:
- _handleClear clears _lastClientTools and cancels any pending auto-continuation timer.

Tests: 23 new tests in client-tools.test.ts covering:
- Tool result application (7): output-available, output-error, default errorText, idempotent guards for output-available/output-denied, applies from approval-requested/approval-responded states
- Tool approval (5): approve, reject, non-existent ID no-op, idempotent guard, preserves approval data
- Auto-continuation (4): autoContinue triggers continuation, without autoContinue no continuation, approval + autoContinue, rejection + autoContinue
- Client tool schemas (4): schemas from chat request, schemas from TOOL_RESULT, clear clears schemas, empty clientTools clears
- Broadcast and persistence (3): MESSAGE_UPDATED broadcast, tool state survives across instances, other tabs receive continuation stream

All 112 Think tests and 405 ai-chat tests pass.

Made-with: Cursor

* Add MCP integration, onConnect message push, update docs

Small features and documentation cleanup to finalize Think's core.

MCP integration:
- Add waitForMcpConnections field (boolean | { timeout: number }, default false). When enabled, Think waits for MCP server connections before calling onChatMessage, ensuring this.mcp.getAITools() returns the full set of MCP-discovered tools.
- Wired into _handleChatRequest before the onChatMessage call. Matches AIChatAgent's pattern.

onConnect message push:
- New WebSocket connections immediately receive CF_AGENT_CHAT_MESSAGES with the current message list. Clients no longer need to wait for the next broadcast or separately fetch /get-messages.
- Sent before the user's onConnect handler, after stream resume check.

README rewrite:
- Remove references to deleted features: AgentChatTransport, session management API, getWorkspace(), multi-session.
- Add documentation for new features: client tools, MCP integration, stream resumption, messages on connect, auto-continuation.
- Update quick start example (simpler, no workspace dependency).
- Update exports table (removed ./session and ./transport).
- Update production features list.
- Mark @cloudflare/shell as optional peer dependency (only needed for workspace tools, not core Think).

Design doc update:
- Architecture diagram: add resumable-stream.ts, client-tools.ts, updated protocol.ts description. Remove session/ and transport.ts from Think. Remove resumable-stream.ts from ai-chat.
- History: document all extractions (ResumableStream move, client tool primitives move, Think strip-down, MCP + onConnect additions).

package.json:
- Update description to match current scope.
- Mark @cloudflare/shell as optional peer dependency.

Made-with: Cursor

* Rewrite assistant example on Think, harden Think's chat pipeline

The assistant example was a ~3,000-line multi-agent orchestrator with a custom transport, codemode/shell execution, and streamdown rendering. Replace it with a ~700-line single-agent Think demo that exercises the same features users care about (workspace tools, MCP, server/client tools, tool approval, and stream resumption) without the orchestrator complexity.

examples/assistant:
- server.ts: MyAssistant extends Think<Env> (~150 lines). Overrides getModel, getSystemPrompt, getTools. Workspace tools from @cloudflare/shell, weather tool, getUserTimezone (client-side), calculate (needsApproval), MCP integration via waitForMcpConnections. Two @callable() methods for MCP management from the client.
- client.tsx: Standard useAgent + useAgentChat (~540 lines). Kumo UI with streaming text/reasoning parts, tool output, approval flow, MCP server panel, dark mode, error banner. Helper shouldShowStreamedTextPart() handles empty-text streaming parts. clearError() on send and clear to avoid stale error banners.
- Delete transport.ts (custom AgentChatTransport no longer needed).
- Remove @ai-sdk/react, @cloudflare/codemode, streamdown, @streamdown/code deps; add @cloudflare/ai-chat.
- Remove worker_loaders/LOADER binding, __filename define.
- Trim styles.css (drop streamdown theme classes).
- Update README to describe the new single-agent architecture.

packages/think/src/think.ts:
- waitForMcpConnections: boolean true now defaults to 10s timeout instead of undefined (prevents indefinite hangs when MCP servers are slow or unreachable).
- _handleChatRequest: reload this.messages after MCP wait and before calling onChatMessage so assembleContext() always sees fresh data.
- onChatMessage: validate assembleContext() result is non-empty and throw a descriptive error instead of letting streamText hit "messages must not be empty" from the provider.
- Doc comment fix for waitForMcpConnections default value.

Made-with: Cursor

* Fix broadcastTransition treating mid-stream errors as terminal

broadcastTransition transitioned to idle when event.error was true, even without event.done. Both Think (_streamResult) and AIChatAgent (_streamSSEReply) send mid-stream error chunks as {done: false, error: true} then continue processing; the stream eventually ends with a separate {done: true} message.

On the broadcast path (cross-tab observers), this caused:
1. Mid-stream error → transition to idle, flush accumulated content
2. Final done:true → state is idle, new empty accumulator created → mergeInto appends a spurious empty assistant message

Fix: only check event.done for the terminal transition. When the final message is {done: true, error: true} (terminal error), done still triggers idle correctly. When {done: false, error: true} (mid-stream error), the stream stays in observing and content is properly flushed on the eventual done:true.
Tests: update terminal error test to use done+error, add 4 new tests covering mid-stream error stays observing, mid-stream error followed by done produces single message, no duplicate after mid-stream error, and multiple mid-stream errors all stay observing.

Made-with: Cursor

* Add onStale callback to _queueAutoContinuation for defensive cleanup

When a chat clear advances the TurnQueue generation while an auto-continuation turn is queued, the turn is skipped (never executed). resetTurnState() already calls _clearAllAutoContinuationState(true) synchronously during the clear, so this is not a current bug, but adding the onStale callback ensures cleanup happens even if the generation advances through a path that doesn't go through resetTurnState.

Made-with: Cursor

* Fix stale turn in Think leaving client transport stream stuck

When _handleClear advances the TurnQueue generation while a chat turn is queued, enqueue returns { status: "stale" } and fn is never called. No MSG_CHAT_RESPONSE with done:true was sent for that requestId, leaving the client's WebSocketChatTransport ReadableStream open indefinitely and useChat status stuck at "submitted".

Fix: capture the TurnResult from enqueue and send a done:true response when the turn was stale. Only needed for _handleChatRequest (WebSocket path): chat() is RPC-based (no transport stream) and _runAutoContinuation is server-initiated (broadcast path, no transport stream waiting).

Mirrors AIChatAgent's onStale → _completeSkippedRequest pattern.

Made-with: Cursor

* Fix _upsertMessage reordering chat history via INSERT OR REPLACE

INSERT OR REPLACE deletes and reinserts the row, giving it a fresh created_at timestamp. Since _loadMessages orders by created_at ASC, updating an existing message (tool result, approval, persist after stream) moved it to the end of the conversation.

Use ON CONFLICT(id) DO UPDATE SET content instead, which updates in place and preserves the original created_at.
Made-with: Cursor

* Capture client tools before entering turn queue to prevent race

_lastClientTools was set from the request body before entering the turn queue but read inside the enqueued fn. During awaits inside fn (MCP wait, agentContext.run), the DO input gate can open and a concurrent _handleChatRequest can overwrite _lastClientTools, causing the first request's onChatMessage to receive the wrong client tool schemas.

Fix: capture _lastClientTools into a local before enqueueing. _lastClientTools is still updated (for auto-continuation which reads it later), but the turn uses its own snapshot.

Made-with: Cursor

* Reload messages in _runAutoContinuation before calling onChatMessage

Auto-continuation waits in the TurnQueue behind any active turn. By the time fn executes, this.messages can be stale: tool results applied via _applyToolResult update storage but the in-memory array is a snapshot from before the debounce timer fired. Without a reload, assembleContext() sends outdated context to the LLM, potentially missing tool results that arrived between scheduling and execution.

Matches _handleChatRequest which already reloads at the same point.

Made-with: Cursor

* Use _broadcastChat for all MSG_CHAT_RESPONSE sends in _handleChatRequest

The error handler and "no response" path used _broadcast instead of _broadcastChat, bypassing _pendingResumeConnections exclusions. Clients mid-resume (waiting for ACK) would receive these responses, and if the streamId differed from the resume stream, the broadcast state machine would discard the resume accumulator's content.

Made-with: Cursor
Summary
Multi-session registry built on the Session API from #1166. A single Durable Object can manage multiple independent chat sessions, each with its own messages, context blocks, and compaction history.
Stacked on #1166 — review that first.
Chainable Builder API
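As a rough illustration of the shape described in this PR (a chainable builder whose configuration is only applied on first use), here is a minimal self-contained sketch. It is not the SessionManager API: ToyManager, withContext, maxContextMessages, describe, initCount, and the table names are hypothetical stand-ins, and the setup step is an in-memory placeholder for the real table creation.

```typescript
// Hypothetical sketch: a chainable builder with lazy initialization.
// None of these names are taken from the real SessionManager.
interface PendingConfig {
  contextBlocks: string[];
  maxContextMessages: number | undefined;
}

class ToyManager {
  // Declared without initializers because the constructor never runs.
  private pending!: PendingConfig;
  private ready!: boolean;
  private tables!: string[];
  initCount!: number;

  private constructor() {
    throw new Error("use ToyManager.create()");
  }

  // Bypass the constructor and stash an empty pending config.
  static create(): ToyManager {
    const self = Object.create(ToyManager.prototype) as ToyManager;
    self.pending = { contextBlocks: [], maxContextMessages: undefined };
    self.ready = false;
    self.tables = [];
    self.initCount = 0;
    return self;
  }

  // Builder methods only record configuration and return `this`.
  withContext(label: string): this {
    this.pending.contextBlocks.push(label);
    return this;
  }

  maxContextMessages(limit: number): this {
    this.pending.maxContextMessages = limit;
    return this;
  }

  // Runs once, on first real use; stands in for table creation.
  private ensureReady(): void {
    if (this.ready) return;
    this.tables.push("sessions", "fts");
    this.initCount += 1;
    this.ready = true;
  }

  describe(): {
    tables: string[];
    contextBlocks: string[];
    maxContextMessages: number | undefined;
  } {
    this.ensureReady();
    return {
      tables: this.tables.slice(),
      contextBlocks: this.pending.contextBlocks.slice(),
      maxContextMessages: this.pending.maxContextMessages,
    };
  }
}

// Chaining records config; nothing is initialized until describe() is called.
const toy = ToyManager.create().withContext("memory").maxContextMessages(50);
console.log(toy.initCount);
console.log(toy.describe());
```

The point of the pattern is that building the object stays cheap and synchronous, while any storage setup is deferred to the first call that actually needs it.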
What's in the box
SessionManager (manager.ts)

Lifecycle: create(name, opts?), get(id), list(), delete(id), rename(id, name)
- assistant_sessions table with metadata: model, source, token counts, estimated cost, timestamps
- list() ordered by updated_at DESC
- delete() clears messages and FTS entries before removing the session row

Message convenience methods: all delegate to the underlying Session and call _touch() to update updated_at
- append(sessionId, message, parentId?)
- upsert(sessionId, message, parentId?): check-then-update, actually updates existing messages (not INSERT OR IGNORE which silently drops)
- appendAll(sessionId, messages, parentId?)
- deleteMessages(sessionId, messageIds): scoped to a specific session, not broadcast across all cached sessions
- clearMessages(sessionId)
- getHistory(sessionId, leafId?)

Branching
- fork(sessionId, atMessageId, newName): creates a new session with parentSessionId lineage, copies the message path up to atMessageId with fresh UUIDs, calls _touch() so the fork sorts correctly in list()
- getBranches(sessionId, messageId): convenience wrapper

Compaction
- needsCompaction(sessionId): delegates to Session.needsCompaction(maxContextMessages)
- addCompaction(sessionId, summary, fromId, toId) / getCompactions(sessionId)
- compactAndSplit(sessionId, summary, newName?): marks old session with end_reason = 'compaction', creates new session with summary as first message. Useful for "fresh start" compaction strategy

Usage tracking: addUsage(sessionId, inputTokens, outputTokens, cost) for cost monitoring

Cross-session search: search(query, { limit? }) uses the shared assistant_fts table
- tools() returns session_search tool for AI to search across all sessions

Lazy init pattern
Same pattern as Session.create(): SessionManager.create() uses Object.create(SessionManager.prototype) to bypass the constructor, storing pending config. _ensureReady() resolves on first use. The _ensureTable() method creates both assistant_sessions AND assistant_fts tables so cross-session search works even before any session's AgentSessionProvider has initialized.

multi-session-agent example
Full working example with:
- update_context tool
- session_search tool
- (compaction_ IDs, uses getCompactions()[0].fromMessageId for superseding overlays)

Design decisions for reviewers
- deleteMessages(sessionId, messageIds) takes a session ID: Devin flagged the original design which broadcast deletes across all in-memory cached sessions. This was wrong: uncached sessions were silently skipped, and it didn't call _ensureReady(). Now targets a specific session directly.
- fork() calls _touch(): Without this, forked sessions had their updated_at stuck at creation time, causing them to sort incorrectly in list() (which orders by updated_at DESC).
- upsert() uses check-then-update: Not INSERT OR IGNORE which silently drops updates to existing messages. Actually calls updateMessage() if the message already exists.
- Tool separation: session.tools() returns update_context (per-session context blocks), manager.tools() returns session_search (cross-session FTS). Combined with spread: { ...await session.tools(), ...manager.tools() }.
- FTS table in _ensureTable(): The manager creates the assistant_fts virtual table alongside assistant_sessions, so search() works even if no session's AgentSessionProvider has been initialized yet.
- No changeset: This is under experimental/, so the API will change between releases.

Stack
main

Test plan
16 DO-backed tests covering:
- session_search tool
- AgentContextProvider persistence
- deleteMessages targets specific session: deletes from one session, verifies other is unaffected
- fork updates updated_at: verifies timestamp, parent lineage, and copied message count
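
To make the fork behavior exercised by the last test concrete, the following is a minimal in-memory sketch of the steps described above: copy the message path up to atMessageId with fresh IDs, record the parent session, then touch the fork so it sorts first in list(). It assumes a simple parent-pointer message model; ToyRegistry, ToySession, ToyMessage, and the counter-based IDs and timestamps are hypothetical, and the real manager persists to SQLite instead.

```typescript
// Hypothetical in-memory model of fork(); not the real SessionManager.
interface ToyMessage {
  id: string;
  parentId: string | null;
  content: string;
}

interface ToySession {
  id: string;
  name: string;
  parentSessionId: string | null;
  updatedAt: number;
}

class ToyRegistry {
  private sessions = new Map<string, ToySession>();
  private messages = new Map<string, ToyMessage[]>();
  private clock = 0; // monotonic counter standing in for timestamps
  private nextId = 0; // stand-in for UUID generation

  private newId(prefix: string): string {
    this.nextId += 1;
    return `${prefix}-${this.nextId}`;
  }

  private touch(sessionId: string): void {
    const session = this.sessions.get(sessionId);
    if (session) {
      this.clock += 1;
      session.updatedAt = this.clock;
    }
  }

  create(name: string, parentSessionId: string | null = null): ToySession {
    this.clock += 1;
    const session: ToySession = {
      id: this.newId("s"),
      name,
      parentSessionId,
      updatedAt: this.clock,
    };
    this.sessions.set(session.id, session);
    this.messages.set(session.id, []);
    return session;
  }

  append(sessionId: string, content: string, parentId: string | null = null): ToyMessage {
    const list = this.messages.get(sessionId);
    if (!list) throw new Error(`unknown session: ${sessionId}`);
    const message: ToyMessage = { id: this.newId("m"), parentId, content };
    list.push(message);
    this.touch(sessionId);
    return message;
  }

  fork(sessionId: string, atMessageId: string, newName: string): ToySession {
    const source = this.messages.get(sessionId);
    if (!source) throw new Error(`unknown session: ${sessionId}`);
    const byId = new Map<string, ToyMessage>();
    for (const m of source) byId.set(m.id, m);

    // Walk parent pointers from atMessageId back to the root.
    const path: ToyMessage[] = [];
    let cursor = byId.get(atMessageId);
    if (!cursor) throw new Error(`unknown message: ${atMessageId}`);
    while (cursor) {
      path.unshift(cursor);
      cursor = cursor.parentId === null ? undefined : byId.get(cursor.parentId);
    }

    // Copy the path with fresh IDs, re-linking each copy to the previous one.
    const forked = this.create(newName, sessionId);
    const copies: ToyMessage[] = [];
    let parentId: string | null = null;
    for (const original of path) {
      const copy: ToyMessage = { id: this.newId("m"), parentId, content: original.content };
      copies.push(copy);
      parentId = copy.id;
    }
    this.messages.set(forked.id, copies);
    this.touch(forked.id); // so the fork sorts first in list()
    return forked;
  }

  // Most recently updated first, mirroring ORDER BY updated_at DESC.
  list(): ToySession[] {
    return Array.from(this.sessions.values()).sort((a, b) => b.updatedAt - a.updatedAt);
  }

  history(sessionId: string): ToyMessage[] {
    return (this.messages.get(sessionId) ?? []).slice();
  }
}
```

In this toy version the copy step writes messages directly rather than going through append, which is why the explicit touch at the end matters for ordering.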