Skip to content

Hooks spec (work in progress)#3055

Open
mike-inkeep wants to merge 1 commit intomainfrom
feat/hooks
Open

Hooks spec (work in progress)#3055
mike-inkeep wants to merge 1 commit intomainfrom
feat/hooks

Conversation

@mike-inkeep
Copy link
Copy Markdown
Contributor

No description provided.

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 7, 2026

⚠️ No Changeset found

Latest commit: 27f57c0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 7, 2026 9:08pm
agents-docs Ready Ready Preview, Comment Apr 7, 2026 9:08pm
agents-manage-ui Ready Ready Preview, Comment Apr 7, 2026 9:08pm

Request Review

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog bot commented Apr 7, 2026

TL;DR — Adds a technical specification for hook middleware: TypeScript functions that run at defined lifecycle points during agent generation (before tool args, after tool results, before sub-agent delegation). Hooks can inspect/mutate data, invoke MCP tools, read/write a conversation-scoped variable bag, write to the artifact store, and abort generation entirely. Includes a companion use-case catalog covering result truncation, credential injection, policy gates, output routing, and sub-agent context prefetch.

Key changes

  • Add SPECS.md — full hook middleware technical specification — Defines three hook points (before-args, after-result, before-delegation), TypeScript interfaces, database schema (hook_definitions + hooks tables), execution via SandboxExecutorFactory, variable bag storage, artifact store extensions, ProjectMcpAccessor for MCP access, multi-hook ordering semantics, and error handling policy.
  • Add USECASES.md — hook use-case catalog — Documents concrete scenarios across six categories: result-too-big transforms, sub-agent context prefetch, arg injection/enrichment, policy gates, output routing to artifacts, and composed multi-hook pipelines. Includes a "required mechanisms" reference table mapping each mechanism to its use cases.

Summary | 2 files | 1 commit | base: mainfeat/hooks


Hook lifecycle points and type contracts

Before: No mechanism to intercept or transform tool calls, tool results, or sub-agent delegations during the generation loop.
After: Three named hook points — before-args, after-result, before-delegation — each with typed context/result interfaces and action discriminated unions (proceed, block, abort).

Each hook point receives a context object with relevant metadata (ToolMeta, ConversationMeta, VariableBag, ProjectMcpAccessor) and returns a result union. before-args uniquely supports block (short-circuit with synthetic result); all three support abort (terminate generation entirely). Hooks execute in SandboxExecutorFactory using the same node22 sandbox infrastructure as function tools.

How does abort differ from block?

block is before-args-only: it injects a synthetic result string back to the LLM and lets reasoning continue. abort terminates the entire generation loop immediately — no further tool calls, LLM steps, or delegations. The reason string surfaces as the terminal message.

SPECS.md


Database schema and execution model

Before: No storage or execution infrastructure for hooks.
After: Two-table schema (hook_definitions for reusable logic, hooks for attachment/scope/ordering) with explicit scope enum, check constraints, per-hook onError policy, and SandboxConfig overrides.

hook_definitions stores executeCode + dependencies at project scope. hooks attaches a definition to a hook point with explicit scope ('project' | 'agent'), enforced by a check constraint tying agentId presence to scope value. Execution order is deterministic: project-level hooks first, then agent-level, ascending by order within each tier. Error handling is per-hook via onError: 'abort' | 'proceed' (default 'proceed').

SPECS.md


Variable bag and artifact store extensions

Before: No cross-hook mutable state; artifact store only supports tool-output artifacts keyed by toolCallId.
After: Conversation-scoped VariableBag (jsonb on conversations table) with get/set/delete/getAll; ArtifactAccessor with write, append, and read by caller-chosen string key.

Variable bag writes are batched in memory across hooks within a tool call and flushed once after completion. Artifact store gains hook-initiated writes with a source: 'hook' marker, append support, and key-based reads — enabling patterns like incremental report assembly and raw-content offloading from context.

SPECS.md


Use-case catalog

Before: No documentation of hook patterns or required platform mechanisms.
After: Six categories of use cases with concrete examples, plus a mechanism-to-use-case reference table.

Covers result truncation (PR diffs, SQL rows, log searches), sub-agent context prefetch (billing status, security docs, tenant feature flags), arg injection (credential vault, tenant scoping, timestamp normalization), policy gates (audit mode, budget guards, path scope), output routing (HTML extraction, report assembly, search deduplication), and composed multi-hook pipelines.

USECASES.md

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid foundation — the hook-point taxonomy, schema split mirroring functions/function_tools, and variable bag design are well thought out. There are a few internal contradictions and spec gaps worth resolving before implementation begins. The highest-priority items are the sandbox-vs-"no sandbox" contradiction and the missing before-argsblock action in USECASES.md.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow run | Using Claude Opus𝕏


Hooks are TypeScript functions registered at project configuration time by a project superuser. They execute during the agent generation loop at defined lifecycle points, with the ability to inspect and mutate data, invoke MCP tools, make external calls, read/write the variable bag, write to the artifact store, and terminate generation entirely.

Hook code runs in the same sandbox infrastructure used by function tools (`SandboxExecutorFactory`, `node22` runtime). Hook authors are project superusers — the execution model mirrors function tools: stored as `executeCode` + `dependencies` in the database, executed via `SandboxExecutorFactory` with a configurable `SandboxConfig`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contradiction with USECASES.md line 89. This paragraph says hooks run in the sandbox (SandboxExecutorFactory, node22), but USECASES.md line 89 says: "Hook authors are trusted; no sandbox." These are mutually exclusive execution models with different security postures. Which is canonical?

The sandbox model is safer and consistent with how function tools work today. If hooks genuinely need unrestricted Node.js access (e.g. for direct fetch or vault calls), the spec should explain how those capabilities are exposed within the sandbox — e.g. by injecting an http client or a vault accessor into the hook context, rather than abandoning the sandbox entirely.


type BeforeArgsResult =
| { action: 'proceed'; args?: Record<string, unknown> } // continue, optionally with mutated args
| { action: 'block'; result: string } // short-circuit: return synthetic result to LLM
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block action returns result: string, but tool results in the existing codebase are unknown (see AfterResultResult.result). If a hook blocks a tool call to inject a synthetic result, limiting to string means the LLM won't see structured JSON unless the hook author manually serializes it. Consider making this result: unknown for consistency, or document the rationale for restricting to string.

}

type AfterResultResult =
| { action: 'proceed'; result?: unknown } // continue, optionally with transformed result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no block action on AfterResultResult — only proceed and abort. This means an after-result hook can transform the result or kill the generation, but cannot suppress the result (i.e. replace it with nothing, or with a short summary while storing the full content in artifacts). The use cases in USECASES.md ("Output Routing") rely on exactly this: replacing a large tool result with a short summary string. As written, { action: 'proceed', result: 'summary string' } achieves this, so the omission may be intentional — but it's worth explicitly noting that result replacement is done via proceed + a new result, not via a separate block action, since before-args does have block.

- **`hooks`** — attachment table. Ties a `hookDefinitionId` to a hook point and execution context. Scope is declared explicitly via a non-nullable `scope` column — never inferred from `NULL`.

```
hook_definitions (tenantId, projectId, id)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hook_definitions table has executeCode and dependencies but is missing columns the existing functions table has: specifically inputSchema. If hook definitions are meant to be reusable across different hook attachments, a schema for the expected context shape could help with validation and UI. More concretely: should hook_definitions also carry name and description columns for discoverability in the management UI? The functions table doesn't have them (they live on function_tools), but hooks may need them on the definition itself since a single definition can be attached at multiple points.

scope: 'project' | 'agent' -- non-nullable; intent always explicit
agentId: varchar(256) NULLABLE -- required when scope = 'agent', must be null when scope = 'project'
hookPoint: 'before-args' | 'after-result' | 'before-delegation'
order: integer
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hooks table has no name or description column. Existing function_tools inherits ...uiProperties (name, description). Without a name, the management UI has no way to label hooks in a list — the user would see only IDs and hook points. Consider adding name (non-nullable) and optionally description.

interface SubAgentPayload {
systemPrompt?: string;
additionalContext?: string;
messages?: unknown[];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

messages?: unknown[] is very loosely typed. In the existing codebase, messages follow the OpenAI chat completions message shape. Using unknown[] here means hook authors have no type guidance for what they're injecting. Consider referencing the existing message type or at minimum noting the expected shape.

## Policy / Control Flow Gates

**Audit mode — block all write tools**
A user puts an agent in "audit mode." Any tool with a write side-effect (`create_issue`, `update_record`, `send_message`) is blocked by a hook. The agent can still read and reason; it just can't act. The hook returns `{ proceed: false, reason: "agent is in read-only mode" }` and the platform feeds that back as the tool result.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use case says the hook returns { proceed: false, reason: ... } but the spec defines the block action as { action: 'block', result: string }. The shapes don't match. The use case should use spec-consistent types: { action: 'block', result: 'agent is in read-only mode' }.

| Before-args hook point | Fires after the LLM generates a tool call but before the tool executes; can inspect and mutate arguments or block execution entirely | Credential injection, tenant scoping, timestamp normalization, path scope |
| After-result hook point | Fires after the tool returns its result but before that result enters the LLM's context window; can transform or replace the output | All result-too-big, all output routing |
| Before-delegation hook point | Fires when a supervisor is about to hand off to a sub-agent, before the sub-agent receives any input; can modify what the sub-agent starts with | All sub-agent prefetch cases |
| Result replacement (`proceed: false`) | The platform contract by which a hook short-circuits tool execution and injects an arbitrary string back to the LLM as if it were the tool's result | All gate cases, output routing short confirmations |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This row says "Result replacement (proceed: false)" but the spec uses { action: 'block', result: string } — there is no proceed: false in the defined types. This table should use spec-consistent terminology.

| External call capability | Hooks can make outbound network requests (HTTP, vault lookups, internal APIs) during execution — not just transform data already in memory | Prefetch cases, credential injection |
| MCP tool access from hooks | Hooks can invoke tools through the MCP protocol using the MCP SDK, not just raw HTTP — enabling structured tool calls with schema validation and the full MCP tool ecosystem from within a hook | Google Calendar context fetcher, any prefetch case using MCP-registered tools |
| Flow-level exit / abort | A hook can terminate the entire generation loop — no further steps execute. This is not scoped to a single tool call or delegation; it fully stops the running agent. Can be triggered from any hook point (before-args, after-result, before-delegation). | Calendar-gate context fetcher (no meetings → stop entirely), budget exceeded, policy violation |
| Programmatic hook | Hooks are TypeScript functions registered at config time by a project superuser. They run in-process with full Node.js access — conditionals, external calls, loops, MCP tool invocations — not just declarative transforms. Hook authors are trusted; no sandbox. | Budget guard, deduplication, calendar exit gate, credential injection |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Hook authors are trusted; no sandbox." — contradicts SPECS.md line 7 which says hooks run in SandboxExecutorFactory with a node22 runtime. This needs to be reconciled. See comment on SPECS.md line 7.

| Sub-agent delegation context injection | The ability to append content to the payload a sub-agent receives at startup, distinct from transforming a tool result | All prefetch cases |
| Multi-hook ordering / chaining | When multiple hooks apply to the same event, the platform executes them in a deterministic order, passing each hook's output as the next hook's input | Composed pipeline |
| Variable bag | A mutable key-value store scoped to the conversation, readable and writable by any hook, invisible to the LLM. Stored as a `jsonb` column on the `conversations` table. Task-scoped variable bags are a v2 consideration for cases where state should not bleed between turns. | Budget guard, deduplication, tenant ID propagation |
| MCP tool access from hooks | Hooks have access to all MCP tools configured on the project (not scoped to the invoking agent). The hook author is a project superuser and can invoke any registered MCP tool. Requires a project-level MCP accessor distinct from the per-agent `AgentMcpManager`. | Google Calendar context fetcher, any prefetch case using MCP-registered tools |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate row: "MCP tool access from hooks" appears twice in this table (lines 87 and 94). The descriptions are slightly different — line 87 emphasizes MCP SDK structured tool calls, line 94 emphasizes project-level scoping. Merge these into one row.

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(8) Total Issues | Risk: Medium

This is a well-structured spec for a powerful middleware system. The core design is solid — the three hook points (before-args, after-result, before-delegation) cover the key lifecycle moments, and the discriminated union action vocabulary is clean. However, there are several document inconsistencies and architectural considerations worth addressing before implementation.


🟠⚠️ Major (4) 🟠⚠️

🟠 1) SPECS.md:115-135 Schema design diverges from established functions/function_tools pattern

Issue: The spec proposes hook_definitions (project-scoped) + hooks (attachment table with scope column and CHECK constraint), but the existing functions/function_tools pattern uses separate tables for different scopes rather than a scope column with conditional nullability.

Why: This creates two competing patterns for the same concept (reusable code definitions attached to execution contexts). Future maintainers will need to understand both patterns, and there's no guidance on which to use for new features.

Fix: Either (1) align with the existing pattern by creating separate project_hooks and agent_hooks tables that both reference hook_definitions, or (2) explicitly document why the scope-column approach is intentionally chosen as a better pattern.

Refs:


🟠 2) SPECS.md:149-178 Variable bag on conversations table creates schema coupling

Issue: The spec proposes adding a variable_bag jsonb column directly to the conversations table. This embeds hook-specific state into a core runtime table, creating tight coupling between the hooks feature and the fundamental conversation model.

Why: This is a one-way-door schema decision. If variable bag semantics evolve (versioning, TTL, size limits, audit logging), modifying an embedded column is harder than evolving a separate table. The v2 consideration for task-scoped bags would require duplicating this column on tasks.

Fix: Consider a separate conversation_variable_bags table with FK to conversations. This keeps the conversations schema clean and provides a clear pattern when task-scoped bags are added.


🟠 3) SPECS.md:182-206 Artifact store extensions lack migration strategy

Issue: The spec proposes extending artifacts with hook-initiated writes using caller-chosen keys and a source: 'hook' marker, but doesn't address how this coexists with the existing ledgerArtifacts structure (where toolCallId is a primary lookup dimension and there's a unique constraint on taskId + contextId + name).

Why: Questions need answers: What is toolCallId for hook-written artifacts? How does the unique constraint on name interact with caller-chosen keys? Can hooks and tools write to the same key?

Fix: Specify exact column mappings for hook-written artifacts: Is toolCallId null? Is the caller-chosen key stored in name? Add the source column to the schema description (currently only mentioned in prose). Consider whether hook keys should be namespaced (e.g., hook:{key}) to avoid collisions.


🟠 4) SPECS.md:30-33 Asymmetric capabilities across hook points may confuse authors

Issue: before-args supports three actions (proceed, block, abort), while after-result and before-delegation only support two (proceed, abort). The block action (inject synthetic result, let LLM continue) is only available in before-args. This asymmetry is not explained.

Why: Authors wanting to do result-replacement in after-result might expect a block action. The current spec allows transformation via { action: 'proceed'; result: ... } but doesn't clarify why block doesn't exist there.

Fix: Add a brief rationale note explaining the asymmetry: block makes sense for before-args because the tool hasn't run yet, whereas after-result already has a result to transform.

Inline Comments:

  • 🟠 Major: USECASES.md:89 Sandbox contradiction with SPECS.md
  • 🟠 Major: USECASES.md:47 Incorrect return shape { proceed: false } vs { action: 'block' }
  • 🟠 Major: USECASES.md:84 Incorrect terminology proceed: false

🟡 Minor (3) 🟡

🟡 1) SPECS.md Filename should be SPEC.md (singular)

Issue: The main specification file is named SPECS.md (plural) while all other specs in the repository use SPEC.md (singular).

Why: Inconsistent naming makes discovery patterns less reliable.

Fix: Rename specs/2026-04-07-agent-hooks/SPECS.md to specs/2026-04-07-agent-hooks/SPEC.md.

Refs:


🟡 2) SPECS.md:1-10 Missing standard spec metadata header

Issue: The spec lacks the standard metadata header (Status, Created, Author, Links) that other specs include. This spec jumps directly into 'Overview' without metadata.

Why: Standard metadata headers enable tracking spec status, ownership, and related documents — especially important for a "work in progress" spec.

Fix: Add a metadata header:

# Hook Middleware — Technical Specification

**Status:** Work in Progress  
**Created:** 2026-04-07  
**Author:** [author]

---

🟡 3) SPECS.md:172-178 Variable bag write batching + abort semantics unclear

Issue: Writes are batched and flushed after tool call completes, but any hook can abort mid-chain. The spec doesn't specify whether writes from hooks that ran before the abort are persisted or discarded.

Why: Affects hook authors writing policy gates. If a budget-guard hook increments a counter and then aborts, should that counter persist? Both behaviors are defensible but lead to different correctness properties.

Fix: Specify abort semantics: either (1) all writes up to and including the aborting hook are persisted, or (2) on abort, no writes from the current tool call are persisted.

Inline Comments:

  • 🟡 Minor: USECASES.md:93-94 Duplicate MCP table entry
  • 🟡 Minor: USECASES.md:6 Hook naming inconsistency (AfterToolCall vs after-result)

💭 Consider (1) 💭

💭 1) SPECS.md:58-77 Consider after-delegation hook point for symmetry

Issue: The spec defines before-delegation but no after-delegation for when a sub-agent returns control. The existing tool hook points follow a symmetric before/after pattern.

Why: Limits use cases like logging delegation round-trip time, validating sub-agent results, or transforming returned context. May lead to requests for this hook in the future.

Fix: Either add after-delegation in v1 for symmetry, or explicitly document why delegation doesn't need an after-hook (perhaps because the sub-agent's final message goes through normal message flow).


💡 APPROVE WITH SUGGESTIONS

Summary: This is a thoughtful spec for a valuable feature. The core hook model is well-designed, and the use cases demonstrate clear value. The main concerns are: (1) document consistency issues between SPECS.md and USECASES.md that would confuse implementers, (2) schema design decisions that diverge from existing patterns, and (3) some missing details about artifact/variable bag interactions. None of these are blockers for a WIP spec, but should be addressed before implementation begins.


Discarded (4)
Location Issue Reason Discarded
SPECS.md:209-225 ProjectMcpAccessor creates parallel MCP management layer Valid architectural observation but explicitly addressed in spec's Open Questions section — intentional design choice with documented rationale
SPECS.md:241-265 Interface naming 'Meta' suffix not used elsewhere Minor naming preference; 'Meta' is a reasonable choice for lightweight data-carrying types vs '*Context' for dependency-carrying types
SPECS.md:127 onError enum values consistent Not an issue — reviewer confirmed this is consistent
SPECS.md:157-159 Column naming snake_case Not an issue — reviewer confirmed this matches existing patterns
Reviewers (4)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-architecture 8 3 1 0 1 0 3
pr-review-product 6 1 0 0 2 0 0
pr-review-consistency 10 2 0 0 1 0 1
pr-review-docs 8 1 0 0 1 0 0
Total 32 7 1 0 5 0 4

Note: Multiple reviewers flagged the same issues (sandbox contradiction, proceed: false terminology, duplicate MCP entry) — findings were deduplicated and consolidated.

| External call capability | Hooks can make outbound network requests (HTTP, vault lookups, internal APIs) during execution — not just transform data already in memory | Prefetch cases, credential injection |
| MCP tool access from hooks | Hooks can invoke tools through the MCP protocol using the MCP SDK, not just raw HTTP — enabling structured tool calls with schema validation and the full MCP tool ecosystem from within a hook | Google Calendar context fetcher, any prefetch case using MCP-registered tools |
| Flow-level exit / abort | A hook can terminate the entire generation loop — no further steps execute. This is not scoped to a single tool call or delegation; it fully stops the running agent. Can be triggered from any hook point (before-args, after-result, before-delegation). | Calendar-gate context fetcher (no meetings → stop entirely), budget exceeded, policy violation |
| Programmatic hook | Hooks are TypeScript functions registered at config time by a project superuser. They run in-process with full Node.js access — conditionals, external calls, loops, MCP tool invocations — not just declarative transforms. Hook authors are trusted; no sandbox. | Budget guard, deduplication, calendar exit gate, credential injection |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Sandbox contradiction with SPECS.md

Issue: This line states "Hook authors are trusted; no sandbox" but SPECS.md explicitly states hooks run via SandboxExecutorFactory with the same sandbox infrastructure as function tools.

Why: This is a critical security/architecture contradiction. Readers cannot determine the actual trust model and security posture of hooks.

Fix:

Suggested change
| Programmatic hook | Hooks are TypeScript functions registered at config time by a project superuser. They run in-process with full Node.js access — conditionals, external calls, loops, MCP tool invocations — not just declarative transforms. Hook authors are trusted; no sandbox. | Budget guard, deduplication, calendar exit gate, credential injection |
| Programmatic hook | Hooks are TypeScript functions registered at config time by a project superuser. They run in the same sandbox infrastructure as function tools (`SandboxExecutorFactory`, `node22` runtime) — conditionals, external calls, loops, MCP tool invocations — not just declarative transforms. | Budget guard, deduplication, calendar exit gate, credential injection |

Refs:

## Policy / Control Flow Gates

**Audit mode — block all write tools**
A user puts an agent in "audit mode." Any tool with a write side-effect (`create_issue`, `update_record`, `send_message`) is blocked by a hook. The agent can still read and reason; it just can't act. The hook returns `{ proceed: false, reason: "agent is in read-only mode" }` and the platform feeds that back as the tool result.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Incorrect return shape contradicts SPECS.md

Issue: Uses { proceed: false, reason: ... } but SPECS.md defines the action vocabulary as { action: 'block'; result: string }. The field names and structure don't match the actual API contract.

Why: Hook authors following this use case would implement the wrong return shape, causing runtime failures.

Fix:

Suggested change
A user puts an agent in "audit mode." Any tool with a write side-effect (`create_issue`, `update_record`, `send_message`) is blocked by a hook. The agent can still read and reason; it just can't act. The hook returns `{ proceed: false, reason: "agent is in read-only mode" }` and the platform feeds that back as the tool result.
A user puts an agent in "audit mode." Any tool with a write side-effect (`create_issue`, `update_record`, `send_message`) is blocked by a hook. The agent can still read and reason; it just can't act. The hook returns `{ action: 'block', result: 'agent is in read-only mode' }` and the platform feeds that back as the tool result.

Refs:

  • SPECS.md:30-33 — defines BeforeArgsResult with action discriminator

| Before-args hook point | Fires after the LLM generates a tool call but before the tool executes; can inspect and mutate arguments or block execution entirely | Credential injection, tenant scoping, timestamp normalization, path scope |
| After-result hook point | Fires after the tool returns its result but before that result enters the LLM's context window; can transform or replace the output | All result-too-big, all output routing |
| Before-delegation hook point | Fires when a supervisor is about to hand off to a sub-agent, before the sub-agent receives any input; can modify what the sub-agent starts with | All sub-agent prefetch cases |
| Result replacement (`proceed: false`) | The platform contract by which a hook short-circuits tool execution and injects an arbitrary string back to the LLM as if it were the tool's result | All gate cases, output routing short confirmations |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Incorrect terminology proceed: false

Issue: Uses "Result replacement (proceed: false)" but SPECS.md defines this as { action: 'block' }. Same issue as line 47.

Why: Consistent terminology is critical for implementers and users to understand the API contract.

Fix:

Suggested change
| Result replacement (`proceed: false`) | The platform contract by which a hook short-circuits tool execution and injects an arbitrary string back to the LLM as if it were the tool's result | All gate cases, output routing short confirmations |
| Result replacement (`action: 'block'`) | The platform contract by which a hook short-circuits tool execution and injects an arbitrary string back to the LLM as if it were the tool's result | All gate cases, output routing short confirmations |

Refs:

Comment on lines +93 to +94
| Variable bag | A mutable key-value store scoped to the conversation, readable and writable by any hook, invisible to the LLM. Stored as a `jsonb` column on the `conversations` table. Task-scoped variable bags are a v2 consideration for cases where state should not bleed between turns. | Budget guard, deduplication, tenant ID propagation |
| MCP tool access from hooks | Hooks have access to all MCP tools configured on the project (not scoped to the invoking agent). The hook author is a project superuser and can invoke any registered MCP tool. Requires a project-level MCP accessor distinct from the per-agent `AgentMcpManager`. | Google Calendar context fetcher, any prefetch case using MCP-registered tools |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Duplicate table entry

Issue: "MCP tool access from hooks" appears twice in this table (also at line 87) with slightly different wording.

Why: Duplication creates noise and suggests incomplete editing. Readers may wonder if these are different capabilities.

Fix:

Suggested change
| Variable bag | A mutable key-value store scoped to the conversation, readable and writable by any hook, invisible to the LLM. Stored as a `jsonb` column on the `conversations` table. Task-scoped variable bags are a v2 consideration for cases where state should not bleed between turns. | Budget guard, deduplication, tenant ID propagation |
| MCP tool access from hooks | Hooks have access to all MCP tools configured on the project (not scoped to the invoking agent). The hook author is a project superuser and can invoke any registered MCP tool. Requires a project-level MCP accessor distinct from the per-agent `AgentMcpManager`. | Google Calendar context fetcher, any prefetch case using MCP-registered tools |

Refs:

## Result Too Big

**PR diff → filenames + line counts**
A tool returns a full PR diff, 40k tokens. The agent only needs changed file names and line counts to decide what to read next. Hook: `AfterToolCall(get_pull_request_diff)` → JMESpath `files[*].{name: filename, additions: additions, deletions: deletions}`. Context impact: 40k → ~200 tokens.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Hook naming inconsistency

Issue: Uses AfterToolCall(get_pull_request_diff) but SPECS.md names the hook point after-result, not AfterToolCall.

Why: Readers searching for AfterToolCall won't find it in SPECS.md, causing confusion about whether this is a different concept.

Fix:

Suggested change
A tool returns a full PR diff, 40k tokens. The agent only needs changed file names and line counts to decide what to read next. Hook: `AfterToolCall(get_pull_request_diff)` → JMESpath `files[*].{name: filename, additions: additions, deletions: deletions}`. Context impact: 40k → ~200 tokens.
A tool returns a full PR diff, 40k tokens. The agent only needs changed file names and line counts to decide what to read next. Hook: `after-result` for `get_pull_request_diff` → JMESpath `files[*].{name: filename, additions: additions, deletions: deletions}`. Context impact: 40k → ~200 tokens.

Refs:

@github-actions github-actions bot deleted a comment from claude bot Apr 7, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 7, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

@github-actions
Copy link
Copy Markdown
Contributor

This pull request has been automatically marked as stale because it has not had recent activity.
It will be closed in 7 days if no further activity occurs.

If this PR is still relevant:

  • Rebase it on the latest main branch
  • Add a comment explaining its current status
  • Request a review if it's ready

Thank you for your contributions!

@github-actions github-actions bot added the stale label Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant