Skip to content

Add $select filtering and ref-aware schemas for tool chaining#3056

Merged
tim-inkeep merged 8 commits intomainfrom
feature/tool-chaining-selector
Apr 9, 2026
Merged

Add $select filtering and ref-aware schemas for tool chaining#3056
tim-inkeep merged 8 commits intomainfrom
feature/tool-chaining-selector

Conversation

@tim-inkeep
Copy link
Copy Markdown
Contributor

Summary

  • $select JMESPath filtering: Tool chaining references ($tool, $artifact) now support a $select field that filters data at resolution time using JMESPath expressions, enabling extraction of specific fields before they reach the
    downstream tool
  • Ref-aware tool schemas: Every tool parameter's JSON schema is now wrapped in anyOf: [originalType, refObject] so the LLM structurally sees that { "$tool": "..." } objects are valid inputs — not just prompt-guided. Includes a
    baseInputSchema for post-resolution validation against the original types
  • Stronger tool chaining prompts: Rewritten guidance with decision tree, numbered reference types, mandatory language, and reinforced schema-aware framing
  • Internal field sanitization: _structureHints and _toolCallId stripped from OTEL trace spans via ToolResultSanitizer SpanProcessor; stripInternalFields utility shared across ArtifactService and trace output
  • Better error messages: When $select resolves to the wrong type (e.g., object where string expected), the error now tells the LLM exactly what happened and suggests drilling deeper with specific field paths

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 7, 2026

🦋 Changeset detected

Latest commit: a31de87

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages
Name Type
@inkeep/agents-api Patch
@inkeep/agents-manage-ui Patch
@inkeep/agents-cli Patch
@inkeep/agents-core Patch
@inkeep/agents-email Patch
@inkeep/agents-mcp Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-api Ready Ready Preview, Comment Apr 8, 2026 8:52pm
agents-docs Ready Ready Preview, Comment Apr 8, 2026 8:52pm
agents-manage-ui Ready Ready Preview, Comment Apr 8, 2026 8:52pm

Request Review

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog bot commented Apr 7, 2026

TL;DR — Adds a $select sentinel key that lets agents pass a JMESPath expression alongside tool-chaining references ($tool, $artifact) to extract specific fields before the next tool executes, eliminating the need for intermediate extraction steps. Tool input schemas are now rewritten at registration time to accept ref objects on every parameter, and prompt guidance is consolidated around a decision-tree format that steers the LLM toward correct chaining behavior.

Key changes

  • New $select JMESPath filtering — tool-chaining references now accept an optional $select field that applies a JMESPath expression to resolved data before passing it downstream, replacing the multi-step extract-then-chain pattern.
  • Ref-aware schema transformation — a new makeRefAwareJsonSchema function rewrites every tool parameter's JSON Schema to accept { $tool, $select } or { $artifact, $tool, $select } ref objects via anyOf, so the LLM sees chaining as a first-class option in the schema itself.
  • Dual-schema validation — tools now carry a baseInputSchema alongside the ref-aware inputSchema; post-resolution validation runs against the original schema so ref sentinels don't cause false validation failures.
  • Rewritten tool-chaining prompt guidance — the verbose multi-paragraph chaining instructions in PromptConfig are replaced with a compact decision tree, and all artifact/tool descriptions consistently steer toward chaining over get_reference_artifact.
  • select-filter utility with security hardening — a new module provides applySelector (JMESPath with sanitization, caching, length limits, and injection pattern detection) and stripInternalFields (removes _-prefixed keys from tool results).
  • Internal field stripping in traces and session events_structureHints and _toolCallId are stripped from ai.toolCall.result span attributes via a new ToolResultSanitizer span processor, and from session event recordings.
  • Prompt templates inlined — XML prompt templates previously imported from separate files are inlined as string constants in PromptConfig.ts.

Summary | 18 files | 2 commits | base: mainfeature/tool-chaining-selector


$select filtering and applySelector utility

Before: Extracting a specific field from a prior tool result required an intermediate extraction tool call; the only chaining sentinels were $tool and $artifact.
After: A $select JMESPath expression can be added to any chaining reference ({ "$tool": "call_id", "$select": "items[0].name" }), and the system resolves it server-side before the downstream tool executes.

The new applySelector function sanitizes LLM-generated JMESPath (fixing double-quote comparisons, tilde-contains patterns), validates against injection patterns and length limits, auto-strips the result. prefix for consistency with _structureHints, and throws ToolChainResolutionError with a shape summary when the expression matches nothing.

select-filter.ts · artifact-syntax.ts · ArtifactParser.ts · ArtifactService.ts


Ref-aware schema transformation

Before: Tool schemas only accepted literal values; the LLM had to rely purely on prompt instructions to know it could pass chaining objects.
After: Every tool parameter's JSON Schema is rewritten with anyOf alternatives that explicitly accept { $artifact, $tool, $select } and { $tool, $select } ref objects, making chaining structurally valid at the schema level.

makeRefAwareJsonSchema recursively walks the schema tree and wraps each value-position node with a withToolRef union. A separate makeBaseInputSchema preserves the original schema for post-resolution validation so that ref objects don't fail the base type check. Both function tools and MCP tools use this transformation at registration time.

ref-aware-schema.ts · function-tools.ts · mcp-tools.ts · agent-types.ts


Post-resolution validation and error messages

Before: Post-resolution validation used the ref-aware parameters schema, which could false-positive on resolved data; error messages were generic.
After: Validation runs against baseInputSchema (the original schema) and error messages explain the likely $select type mismatch with actionable guidance to drill deeper into _structureHints.terminalPaths.

tool-wrapper.ts


Prompt and guidance rewrite

Before: Tool chaining instructions were verbose, example-heavy prose spanning ~70 lines with separate extract-then-chain workflow guidance.
After: Instructions use a compact decision tree format, $select is integrated into every reference type, and all artifact schema descriptions consistently discourage get_reference_artifact for data piping in favor of tool chaining.

System prompt templates are inlined as string constants, the get_reference_artifact description steers users away from using it for data transfer, and toolChainingGuidance is added to _structureHints in tool results with ready-to-use reference patterns.

PromptConfig.ts · default-tools.ts · tool-result.ts


Internal field stripping in traces and session events

Before: _structureHints and _toolCallId leaked into OTEL span attributes and session event recordings, bloating traces.
After: A ToolResultSanitizer span processor strips _-prefixed keys from ai.toolCall.result attributes, and stripInternalFields is applied to session event outputs.

instrumentation.ts · tool-wrapper.ts


Tests

New test suites cover ref-aware schema transformation (152 lines), $select resolution through ArtifactParser.resolveArgs (190 lines), and the applySelector/sanitizeJMESPathSelector utility functions (148 lines). Existing PromptConfig artifact schema tests are updated for the "TOOL CHAINING" terminology.

ref-aware-schema.test.ts · resolveArgs-select.test.ts · select-filter.test.ts · PromptConfig.artifactSchema.test.ts

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog bot commented Apr 7, 2026

TL;DR — Adds $select (JMESPath) filtering to tool chaining so agents can extract specific fields when piping data between tools, instead of passing entire payloads or requiring intermediate extraction steps. Tool input schemas are now rewritten at registration time to structurally accept { "$tool": "...", "$select": "..." } reference objects on every parameter, and post-resolution validation uses the original schema to produce actionable error messages when a $select resolves to the wrong type.

Key changes

  • Add $select sentinel and applySelector utility — new JMESPath-based field extraction at tool-chaining resolution time, with security hardening (length limits, injection pattern blocklist, selector sanitization)
  • Add ref-aware-schema.ts for schema transformation — rewrites every tool parameter's JSON Schema into an anyOf union that accepts either the original type or a { $tool, $artifact, $select } reference object
  • Dual-schema validation with baseInputSchema — tools now carry both the ref-aware schema (for the LLM) and the original base schema (for post-resolution validation), with descriptive error messages guiding $select path corrections
  • Rewrite tool chaining prompt guidance — replaces verbose prose with a compact decision-tree format, adds $select JMESPath pattern examples, and steers the LLM away from get_reference_artifact for tool-to-tool data flow
  • Strip internal fields from traces and events — new ToolResultSanitizer span processor and stripInternalFields utility remove _structureHints/_toolCallId from telemetry attributes and stored results
  • Inject toolChainingGuidance into _structureHints — every tool result now includes inline guidance with example $select patterns, sequencing rules, and forbidden-action reminders directly in the structure hints
  • Centralize sentinel constants in artifact-syntax.ts — extracts $artifact, $tool, $select, artifact tag names, and tool names into typed constants used across prompts, schemas, and resolution code
  • Export DANGEROUS_PATTERNS from agents-core — re-exports the JMESPath injection blocklist from signature-validation.ts so select-filter.ts can share the same pattern set
  • Add four test suitesref-aware-schema.test.ts, resolveArgs-select.test.ts, select-filter.test.ts, and PromptConfig.artifactSchema.test.ts covering schema transformation, end-to-end resolution, JMESPath filtering with security edge cases, and artifact schema rendering

Summary | 22 files | 8 commits | base: mainfeature/tool-chaining-selector


JMESPath $select filtering for tool chaining

Before: Passing a subset of a tool result to the next tool required an intermediate extraction step — the agent had to call an extract tool, then reference that call's output.
After: Agents pass { "$tool": "call_id", "$select": "items[0].title" } directly and the system resolves the JMESPath expression at resolution time, before the downstream tool executes.

applySelector handles the full lifecycle: strips the result. prefix that _structureHints selectors include, sanitizes common LLM JMESPath syntax errors (double-quoted comparisons, tilde-contains), validates against injection patterns and length limits, and returns an actionable ToolChainResolutionError with a summarizeShape preview when the expression matches nothing.

What security hardening does the selector have?

Expressions are checked against a shared DANGEROUS_PATTERNS blocklist (re-exported from agents-core/utils/signature-validation) covering __proto__, constructor, eval(), and template literal injection. They are also capped at 1,000 characters. A module-level selector cache (bounded at 1,000 entries) avoids repeated sanitization of the same expression.

select-filter.ts · ArtifactParser.ts · artifact-syntax.ts · signature-validation.ts


Ref-aware schema transformation

Before: Tool input schemas accepted only literal values; the LLM had to rely on prompt instructions alone to know it could pass { "$tool": "..." } reference objects.
After: Every tool parameter's JSON Schema is rewritten into an anyOf [originalType, refObject] union at registration time, so the model sees tool chaining as a first-class option in the schema itself.

makeRefAwareJsonSchema recursively walks JSON Schema nodes — properties, items, additionalProperties, patternProperties, and conditional keywords (if/then/else) — and wraps each value-position node with withToolRef. The ref union includes both artifact refs ($artifact + $tool + optional $select) and tool-only refs ($tool + optional $select), each with structured descriptions from TOOL_CHAINING_SCHEMA_DESCRIPTIONS.

Schema construction is wrapped in try/catch at both the function-tool and MCP-tool registration sites — if ref-aware schema generation fails, the tool falls back to the original schema and logs a warning.

ref-aware-schema.ts · function-tools.ts · mcp-tools.ts


Dual-schema post-resolution validation

Before: After resolving $tool/$artifact references, the resolved args were validated against the same ref-aware schema the LLM used — which always passed because the anyOf accepted any shape.
After: Tools carry a baseInputSchema (the original, un-transformed schema) used exclusively for post-resolution validation. Error messages now include per-key type diagnostics and suggest drilling deeper in the $select path.

The baseInputSchema is built via makeBaseInputSchema (a thin z.fromJSONSchema wrapper) and attached to the tool definition via Object.assign. tool-wrapper.ts picks baseInputSchema over parameters for validation when present, and guards against non-object resolvedArgs when building mismatch diagnostics.

tool-wrapper.ts · agent-types.ts


Prompt and guidance rewrite

Before: Tool chaining guidance was ~70 lines of verbose prose with multiple wrong/right examples and an intermediate extraction step workflow.
After: Guidance is a compact decision tree ("Does the data come from a prior tool call? → Is it an artifact? → Do you need the full output? → Do you need a specific field?") with concise reference-type descriptions and JMESPath pattern examples.

The system prompt, tool template, artifact template, and retrieval guidance consistently steer the LLM away from get_reference_artifact for tool-to-tool data flow and toward $select for field extraction. toolChainingGuidance is also injected into _structureHints on tool results via tool-result.ts, giving the LLM inline reminders with ready-to-use $select patterns at every tool call boundary.

PromptConfig.ts · tool-result.ts · default-tools.ts


Internal field stripping from traces and events

Before: _structureHints and _toolCallId appeared in telemetry span attributes, stored tool results, and streamed events — bloating traces and leaking internal context.
After: A ToolResultSanitizer span processor strips underscore-prefixed keys from ai.toolCall.result attributes, and stripInternalFields is applied in ArtifactService.getToolResultFull, tool-wrapper event recording, and telemetry.

ArtifactService now routes all tool result access through a shared getToolResultFull method that applies stripInternalFields once, ensuring createArtifact, getToolResultRaw, and $select resolution all operate on the same cleaned data.

instrumentation.ts · ArtifactService.ts · tool-wrapper.ts

Pullfrog  | View workflow run | Triggered by Pullfrog | Using Claude Opus𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid feature — the $select JMESPath filtering, ref-aware schemas, and security validation are well-designed. Two issues warrant attention before merging: the dead template files that should be deleted, and the duplicated sanitizeJMESPathSelector + JMESPath usage in ArtifactService that the new select-filter.ts was explicitly extracted to replace. One minor runtime safety issue in the error path.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow run | Using Claude Opus𝕏

import dataComponentTemplate from '../../../../../../templates/v1/shared/data-component.xml?raw';
import dataComponentsTemplate from '../../../../../../templates/v1/shared/data-components.xml?raw';

const systemPromptTemplate =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The six XML template files under agents-api/templates/v1/ are now dead code — they're no longer imported from anywhere. The inlined constants here contain updated content (tool chaining guidance, $select references) that diverges from the on-disk files. Delete the templates/v1/ directory so there's no stale source of confusion.

* Sanitize JMESPath selector to fix common LLM syntax issues.
* Extracted from ArtifactService for reuse in $select filtering.
*/
export function sanitizeJMESPathSelector(selector: string): string {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArtifactService still has its own private sanitizeJMESPathSelector (line 854) with identical logic and a separate static selectorCache. The comment on line 31 here says "Extracted from ArtifactService for reuse" but the original was never removed. This means two independent caches, two copies of the same four regexes, and — critically — ArtifactService's direct jmespath.search() calls (lines 225, 1048) bypass the security validation (DANGEROUS_PATTERNS, MAX_EXPRESSION_LENGTH) that applySelector enforces.

Consolidate: have ArtifactService import sanitizeJMESPathSelector (and ideally applySelector) from this module, then remove its private method and static cache.

Comment on lines +200 to +201
if (!validation.success) {
const mismatchDetails = Object.entries(resolvedArgs as Record<string, unknown>)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object.entries(resolvedArgs as Record<string, unknown>) is unsafe if resolvedArgs is a primitive. When the entire args object is a single $tool + $select ref (e.g., { "$tool": "call_x", "$select": "items[0].name" }), resolveArgs returns the selected value directly — which could be a string, number, or array. Calling Object.entries on a string iterates its character indices, producing misleading error output.

Guard with a type check:

const mismatchDetails = resolvedArgs && typeof resolvedArgs === 'object' && !Array.isArray(resolvedArgs)
  ? Object.entries(resolvedArgs as Record<string, unknown>)
      .map(([key, val]) => { ... })
      .join(', ')
  : `resolved to ${Array.isArray(resolvedArgs) ? 'array' : typeof resolvedArgs}`;

baseInputSchema: ReturnType<typeof z.fromJSONSchema> | undefined;
} {
try {
const rawJson = z.toJSONSchema(inputSchema as z.ZodType) as Record<string, unknown>;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

z.toJSONSchema(inputSchema as z.ZodType) assumes inputSchema is a Zod schema, but MCP tool schemas from the protocol are plain JSON Schema objects — they're only converted to Zod via z.fromJSONSchema at registration time. If inputSchema is already a Zod type this works, but if it's a raw JSON Schema object, z.toJSONSchema will fail or produce incorrect output.

The function-tools path (function-tools.ts:89-106) correctly uses functionData.inputSchema (raw JSON Schema) directly with makeRefAwareJsonSchema. Verify this path handles both Zod and raw JSON Schema inputs, or add a type guard.

}
}
if (changed) {
(span as any).attributes['ai.toolCall.result'] = JSON.stringify(parsed);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutating span.attributes via (span as any).attributes[...] after onEnd relies on the span object being mutable at this point. ReadableSpan is — by design — read-only. This works today because the JS object is still writable, but it's fragile: future OTEL SDK versions could freeze the attributes or use a different internal representation. Consider whether this sanitization should happen earlier (e.g., in a custom SpanExporter wrapper or before the span is finalized) for long-term stability.

description: TOOL_CHAINING_SCHEMA_DESCRIPTIONS.SELECT,
},
},
additionalProperties: true,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

additionalProperties: true on both ref schema variants means the LLM can include arbitrary extra keys alongside $tool/$artifact/$select without validation failure. This is intentional (the LLM might include other tool params in the same object), but it also means typos in sentinel keys (e.g., $selec) will silently pass validation. Just confirming this is the desired tradeoff — if so, no action needed.

@github-actions github-actions bot deleted a comment from claude bot Apr 7, 2026
@itoqa
Copy link
Copy Markdown

itoqa bot commented Apr 7, 2026

Ito Test Report ✅

8 test cases ran. 1 additional finding, 7 passed.

Overall, the unified QA run covered 8 verifiable test cases with 7 passes and 1 failure (plus one additional code-review-only check with no customer-facing pass/fail assertions). The strongest findings were that tool-chaining selector behavior is largely correct and robust (primitive extraction via $tool+$select, stripping internal underscore fields, expected ToolChainResolutionError handling for no-match/malformed/type-mismatch selectors, stable rapid re-submits with distinct chat IDs despite provider credential errors, and rejection of dangerous proto selectors), but a high-severity defect remains where POST /run/api/chat can throw a 500 “Context validation failed” when agent context is unresolved because validation dereferences a missing agent before selector sanitization.

✅ Passed (7)
Category Summary Screenshot
Adversarial Dangerous selector token __proto__ is correctly rejected by selector validation logic. ADV-1
Edge Confirmed expected no-match $select behavior raises ToolChainResolutionError with actionable guidance. N/A
Edge Confirmed malformed JMESPath selectors are deterministically converted to ToolChainResolutionError. N/A
Edge Confirmed post-resolution type mismatch handling rejects invalid resolved values with targeted remediation guidance. N/A
Edge Rapid concurrent submits returned isolated HTTP 200 responses with distinct chat IDs; failures were provider-auth errors, not chaining-state corruption. EDGE-5
Logic Code inspection and targeted verification confirm $tool + $select resolves primitive values correctly for strict downstream parameters. LOGIC-1
Logic Confirmed non-bug: underscore-prefixed chaining internals are stripped before downstream resolution; targeted regression suite passed. LOGIC-4
ℹ️ Additional Findings (1)

These findings are unrelated to the current changes but were observed during testing.

Category Summary Screenshot
Adversarial ⚠️ run/api/chat returns 500 before selector sanitization when agent context is unresolved. ADV-2
⚠️ run/api/chat returns 500 before selector sanitization when agent context is unresolved
  • What failed: The request fails with 500 Context validation failed before selector sanitization can execute; expected behavior is a deterministic validation response (or a clean not-found/bad-request) instead of an internal error.
  • Impact: Requests can fail with internal errors on mis-resolved agent contexts, blocking security-path tests and causing unstable API behavior. Users and automation receive non-actionable 500 responses instead of actionable validation/not-found errors.
  • Steps to reproduce:
    1. Call POST /run/api/chat with tenant/project/agent headers where the target agent context is not resolvable.
    2. Submit a prompt that includes chained selector payloads such as ${evil_payload} or eval().
    3. Observe that the response fails with 500 Context validation failed before selector sanitization behavior is evaluated.
  • Stub / mock context: The run used non-production bypass authentication and also exercised fallback agent/header combinations while probing the route. No synthetic success responses were injected, so the observed 500 reflects server-side request handling under that setup.
  • Code analysis: I reviewed run-route and context-validation flow. Context validation runs before route-level agent checks, and validateHeaders dereferences agent.contextConfig without guarding for a missing agent, which can throw and be transformed into a generic internal 500.
  • Why this is likely a bug: A missing-agent path can trigger an avoidable null/undefined dereference in production request handling, producing an internal 500 instead of a controlled client error.

Relevant code:

agents-api/src/domains/run/routes/chatDataStream.ts (lines 107-110)

// Apply context validation middleware
app.use('/chat', contextValidationMiddleware);

app.openapi(chatDataStreamRoute, async (c) => {

agents-api/src/domains/run/context/validation.ts (lines 284-287)

const agent = project.agents[agentId];

const contextConfig = agent.contextConfig;
logger.info({ contextConfig }, 'Context config found');

agents-api/src/domains/run/context/validation.ts (lines 436-446)

} catch (error) {
  logger.error(
    {
      error: error instanceof Error ? error.message : 'Unknown error',
    },
    'Context validation middleware error'
  );
  throw createApiError({
    code: 'internal_server_error',
    message: 'Context validation failed',
  });
}

Commit: 485de90

View Full Run


Tell us how we did: Give Ito Feedback

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(6) Total Issues | Risk: Medium

🟠⚠️ Major (4) 🟠⚠️

Inline Comments:

  • 🟠 Major: select-filter.ts:9-18 JSDoc comment misplaced — describes stripInternalFields but precedes clearSelectorCache
  • 🟠 Major: select-filter.ts:24-31 DANGEROUS_PATTERNS duplicated from agents-core — should import from @inkeep/agents-core/utils/jmespath-utils
  • 🟠 Major: function-tools.ts:89-94 Silent schema building failure — no logging when baseInputSchema creation fails
  • 🟠 Major: mcp-tools.ts:28-33 Silent schema transformation failure — catch block swallows all errors with no logging

🟡 Minor (2) 🟡

Inline Comments:

  • 🟡 Minor: instrumentation.ts:100 Empty catch block silently swallows JSON parsing errors
  • 🟡 Minor: instrumentation.ts:97 Mutating ReadableSpan attributes violates OpenTelemetry contract

💭 Consider (3) 💭

💭 1) select-filter.ts Missing unit tests for stripInternalFields

Issue: The stripInternalFields function (lines 13-18) has no unit tests despite being used in critical paths (tool-wrapper.ts, ArtifactService.ts).
Why: Regressions could cause internal fields like _structureHints to leak into stored tool results or traces.
Fix: Add tests to select-filter.test.ts:

describe('stripInternalFields', () => {
  it('should remove underscore-prefixed keys', () => {
    expect(stripInternalFields({ _internal: 1, visible: 2 })).toEqual({ visible: 2 });
  });
  it('should return non-objects unchanged', () => {
    expect(stripInternalFields('string')).toBe('string');
    expect(stripInternalFields([1, 2])).toEqual([1, 2]);
  });
});

💭 2) resolveArgs-select.test.ts Test mocks getToolResultRaw but implementation calls getToolResultFull for $select

Issue: Tests at line 46 mock getToolResultRaw for $select scenarios, but the implementation (ArtifactParser.ts:259-262) calls getToolResultFull when $select is present. The test passes only because the helper falls back.
Why: This masks potential bugs if the two methods diverge.
Fix: Explicitly mock getToolResultFull when testing $select scenarios.

💭 3) tool-wrapper.ts:207-228 Missing integration test for post-resolution validation error path

Issue: The enhanced error message for type mismatches after $select resolution has no test coverage.
Why: This is a key user-facing error path for LLM self-correction.

🧹 While You're Here (1) 🧹

🧹 1) agents-api/templates/v1/ Dead template files should be deleted

Issue: Six XML template files still exist in agents-api/templates/v1/ but are no longer imported anywhere — the templates were inlined as string constants in PromptConfig.ts.
Why: These are now dead code that could cause confusion. Grep confirms no TypeScript files import from templates/v1.
Fix: Delete the agents-api/templates/v1/ directory.
Refs: Pullfrog review comment


💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-designed feature that adds $select JMESPath filtering and ref-aware schemas for tool chaining. The core implementation is solid with good security validation and test coverage. The main actionable items are: (1) add logging to the silent catch blocks in schema building to aid debugging, (2) import DANGEROUS_PATTERNS from agents-core instead of duplicating, and (3) clean up the dead template files. None of these are blocking, but addressing the logging gaps would significantly improve debuggability.

Discarded (5)
Location Issue Reason Discarded
mcp-tools.ts:24 z.toJSONSchema cast may fail for non-Zod schemas Handled correctly by try-catch fallback; the catch block returns original schema
instrumentation.ts:81 onStart param should be Span not ReadableSpan No runtime impact — method body is empty
agent-types.ts:260 baseInputSchema type duplicates parameters type shape Minor DRY violation but localized and simple
select-filter.ts:133 Error message may leak data structure info Intentional for LLM feedback; acceptable for use case
ref-aware-schema.ts:56 additionalProperties: true allows typos in sentinel keys Intentional tradeoff confirmed — allows other tool params in same object
Reviewers (5)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-standards 3 0 0 0 2 0 1
pr-review-appsec 4 0 0 0 2 0 2
pr-review-tests 6 0 3 0 0 0 0
pr-review-errors 5 0 0 0 2 0 2
pr-review-types 5 0 0 0 0 0 5
Total 23 0 3 1 6 0 10

Note: Many type-related findings were discarded as they are either handled by existing fallbacks or are minor expressiveness issues with no runtime impact.

Comment on lines +9 to +18
export function clearSelectorCache(): void {
selectorCache.clear();
}

export function stripInternalFields<T>(data: T): T {
if (data && typeof data === 'object' && !Array.isArray(data)) {
return Object.fromEntries(Object.entries(data).filter(([key]) => !key.startsWith('_'))) as T;
}
return data;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: JSDoc comment misplaced — describes wrong function

Issue: The JSDoc block at lines 4-8 describes "Strip internal fields (prefixed with _) from a tool result object" but immediately precedes clearSelectorCache(), not stripInternalFields().

Why: This creates confusing documentation — developers reading this code will think clearSelectorCache strips internal fields when it actually clears the selector cache. The actual stripInternalFields function at line 13 has no documentation.

Fix:

Suggested change
export function clearSelectorCache(): void {
selectorCache.clear();
}
export function stripInternalFields<T>(data: T): T {
if (data && typeof data === 'object' && !Array.isArray(data)) {
return Object.fromEntries(Object.entries(data).filter(([key]) => !key.startsWith('_'))) as T;
}
return data;
}
export function clearSelectorCache(): void {
selectorCache.clear();
}
/**
* Strip internal fields (prefixed with _) from a tool result object.
* Removes _structureHints, _toolCallId, etc. that are added for LLM context
* but should not appear in traces, stored results, or downstream data.
*/
export function stripInternalFields<T>(data: T): T {

Refs:

Comment on lines +24 to +31
const DANGEROUS_PATTERNS: RegExp[] = [
/\$\{.*\}/,
/eval\s*\(/,
/function\s*\(/,
/constructor/,
/prototype/,
/__proto__/,
];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: DANGEROUS_PATTERNS duplicated from agents-core

Issue: This array duplicates the exact same patterns defined in packages/agents-core/src/utils/jmespath-utils.ts (lines 191-198), which is already exported.

Why: Maintaining two copies of security-critical patterns creates drift risk. If one is updated and the other isn't, security gaps could emerge. The core package also exports validateJMESPathSecure() which could be used directly.

Fix: Import from agents-core instead of redefining:

import { DANGEROUS_PATTERNS } from '@inkeep/agents-core/utils/jmespath-utils';

Or consider using validateJMESPathSecure() from the same module which combines length + pattern + compile validation.

Refs:

Comment on lines +89 to +94
if (functionData.inputSchema) {
try {
baseInputSchema = makeBaseInputSchema(functionData.inputSchema);
} catch {
// baseInputSchema stays undefined — post-resolution validation will be skipped
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Silent schema building failure — no logging when baseInputSchema creation fails

Issue: When makeBaseInputSchema() throws, the error is silently swallowed. Post-resolution validation is silently disabled for this tool with no indication to operators.

Why: If schema parsing fails for a tool, that tool will accept any arguments after reference resolution, potentially causing downstream failures that are harder to debug. Operators have no visibility into which tools have disabled validation.

Fix: Add a warning log so operators know which tools have disabled post-resolution validation:

Suggested change
if (functionData.inputSchema) {
try {
baseInputSchema = makeBaseInputSchema(functionData.inputSchema);
} catch {
// baseInputSchema stays undefined — post-resolution validation will be skipped
}
try {
baseInputSchema = makeBaseInputSchema(functionData.inputSchema);
} catch (schemaError) {
logger.warn(
{
functionToolName: functionToolDef.name,
schemaError: schemaError instanceof Error ? schemaError.message : String(schemaError),
},
'Failed to build base input schema; post-resolution validation will be skipped'
);
}

Refs:

Comment on lines +28 to +33
} catch {
return {
refAwareInputSchema: inputSchema as ReturnType<typeof z.fromJSONSchema>,
baseInputSchema: undefined,
};
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Silent schema transformation failure — catch block swallows all errors

Issue: When z.toJSONSchema(), makeBaseInputSchema(), or makeRefAwareJsonSchema() throws, the error is silently swallowed with no logging. The tool loses ref-aware schema capabilities and post-resolution validation with no indication.

Why: This could hide type mismatches, invalid schemas, or other issues. MCP tools failing schema transformation will behave differently than expected with no trace or log to diagnose.

Fix: Add logging to indicate when schema transformation fails:

Suggested change
} catch {
return {
refAwareInputSchema: inputSchema as ReturnType<typeof z.fromJSONSchema>,
baseInputSchema: undefined,
};
}
} catch (schemaError) {
logger.warn(
{
schemaError: schemaError instanceof Error ? schemaError.message : String(schemaError),
},
'Failed to build ref-aware schema for MCP tool; falling back to original schema'
);
return {
refAwareInputSchema: inputSchema as ReturnType<typeof z.fromJSONSchema>,
baseInputSchema: undefined,
};
}

Refs:

Comment thread agents-api/src/instrumentation.ts Outdated
(span as any).attributes['ai.toolCall.result'] = JSON.stringify(parsed);
}
}
} catch {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Empty catch block silently swallows JSON parsing errors

Issue: When span.attributes['ai.toolCall.result'] contains malformed JSON, the catch block silently swallows the error with no logging.

Why: Developers debugging trace issues have no visibility into parsing failures. The span processor appears to work correctly when it may be silently failing on certain spans.

Fix: Add a comment acknowledging this is expected behavior, or add debug-level logging:

Suggested change
} catch {}
} catch {
// Expected for non-JSON tool results - leave unchanged
}

Refs:

}
}
if (changed) {
(span as any).attributes['ai.toolCall.result'] = JSON.stringify(parsed);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Mutating ReadableSpan attributes violates OpenTelemetry contract

Issue: ReadableSpan is designed to be read-only by the OpenTelemetry interface, but this code mutates its attributes via an as any cast.

Why: While this works in practice with the current SDK, it relies on implementation details rather than the public API. This could break if OpenTelemetry changes how spans are frozen. The type annotation on onStart also uses ReadableSpan when it should be Span.

Fix: Consider adding a comment acknowledging the contract violation and why it's necessary:

// Note: Mutating ReadableSpan attributes is technically a contract violation,
// but necessary for sanitization before export. This is a common pattern.
// If OTEL SDK changes break this, consider using a custom SpanExporter instead.

Alternatively, move sanitization to onStart where the span is still mutable, though this would require detecting which spans are tool calls before execution completes.

Refs:

@github-actions github-actions bot deleted a comment from claude bot Apr 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 8, 2026

Preview URLs

Use these stable preview aliases for testing this PR:

These point to the same Vercel preview deployment as the bot comment, but they stay stable and easier to find.

Raw Vercel deployment URLs

@itoqa
Copy link
Copy Markdown

itoqa bot commented Apr 8, 2026

Ito Test Report ✅

14 test cases ran. 14 passed.

The unified run passed all 14 of 14 test cases with zero failures, confirming stable non-production behavior after local environment/auth remediation and deterministic project-agent seeding. Most importantly, core run-domain chat streaming (including OpenAI-style follow-ups) and durable execution reconnect worked end-to-end, selector-chain functionality and hardening checks behaved as designed (nested and optional $select, reference/schema handling, controlled errors, and proto blocking), and resilience flows across rapid desktop actions, mobile viewport use, and refresh/back-forward interruptions remained recoverable and usable.

✅ Passed (14)
Category Summary Screenshot
Adversarial Selector guard behavior for __proto__ is implemented and verified in code/tests; no app bug confirmed. ADV-1
Adversarial Refresh plus back/forward interruption path recovered and allowed follow-up interaction in the playground. ADV-5
Edge result. prefix normalization is handled automatically during selector evaluation. EDGE-1
Edge Controlled no-match selector error is implemented and validated; no application bug confirmed. EDGE-2
Edge Invalid JMESPath syntax is handled as a controlled ToolChainResolutionError; no application bug confirmed. EDGE-3
Edge Wrong-type selector mismatch guidance exists in runtime validation and behaves as expected. EDGE-4
Edge Rapid submit/clear/resubmit flow remained usable after environment recovery, with successful post-clear submission. EDGE-5
Edge Mobile playground flow remained functional at 390x844, including prompt and follow-up submission after clearing chat. EDGE-6
Logic Selector resolution for nested tool fields is implemented and behaving as expected. LOGIC-1
Logic Artifact and tool references with optional $select resolve correctly. LOGIC-2
Logic Ref-aware schema accepts reference objects in primitive parameter positions as designed. LOGIC-3
Happy-path /run/api/chat returned a successful SSE stream (start, text, completion) after correcting local setup prerequisites. ROUTE-1
Happy-path OpenAI-style chat completions streamed successfully for initial and follow-up requests using the same conversationId. ROUTE-2
Happy-path Durable execution returned a workflow run id and reconnect streaming returned SSE payloads successfully. ROUTE-3

Commit: c929c40

View Full Run


Tell us how we did: Give Ito Feedback

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(0) Total Issues | Risk: Low

This is a delta review scoped to changes since the prior automated review (commit c929c408a5b1).

Prior Review Issues — All Addressed ✅

The PR author addressed all actionable items from the prior review in commits 3d2371718 and 4e20019ad:

Prior Issue Status Resolution
🟠 Major: JSDoc comment misplaced in select-filter.ts ✅ Fixed JSDoc now correctly precedes stripInternalFields
🟠 Major: DANGEROUS_PATTERNS duplicated from agents-core ✅ Fixed Now imports from @inkeep/agents-core/utils/signature-validation
🟠 Major: Silent schema building failure in function-tools.ts ✅ Fixed Added logger.warn with tool name and error details
🟠 Major: Silent schema transformation failure in mcp-tools.ts ✅ Fixed Added logger.warn with error details
🟡 Minor: Empty catch block in instrumentation.ts ✅ Fixed Added explanatory comment: "Expected for non-JSON tool results"
🟡 Minor: Mutating ReadableSpan attributes ⚪ Not addressed Acceptable — this is a known OTEL pattern
🧹 Dead template files should be deleted ✅ Resolved Template files are now USED — PromptConfig.ts imports them instead of inlining

Merged Main Changes

The delta also includes several commits merged from main that are outside this PR's scope:

  • feat(artifacts): binary attachment artifacts and blob-backed artifact hydration (#2745)
  • fix: key durable approval decisions by toolCallId instead of toolName (#3062)
  • fix: handle SSE failure when streaming approval request (#3064)
  • refactor(test): create reusable logger mock factory (#3071)
  • refactor(agents-core): remove logger DI from agentFull and projectFull (#3072)

These are independent changes already reviewed in their respective PRs.

💭 Consider (Carried Forward)

The following items from the prior review remain as optional improvements:

💭 1) select-filter.ts Missing unit tests for stripInternalFields

The stripInternalFields function has no unit tests. This is a nice-to-have for completeness but not blocking.

💭 2) resolveArgs-select.test.ts Test mocks alignment

Tests mock getToolResultRaw for $select scenarios, but implementation calls getToolResultFull when $select is present. Works due to fallback but could be cleaner.


✅ APPROVE

Summary: Excellent work addressing the prior review feedback! All 4 Major issues and 1 Minor issue have been resolved. The template files are now properly imported rather than inlined, which also resolves the "dead files" concern. The $select JMESPath filtering feature is well-implemented with proper security validation, error handling, and comprehensive test coverage. Ready to merge. 🚀

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator 7 0 2 0 0 0 0
Total 7 0 2 0 0 0 0

Note: Delta review — sub-reviewers not dispatched since prior issues were addressed.

@itoqa
Copy link
Copy Markdown

itoqa bot commented Apr 8, 2026

Ito Test Report ✅

10 test cases ran. 1 additional finding, 9 passed.

Overall, the unified QA run completed 10 test cases with 9 passing and 1 failing, with the only confirmed defect being a medium-severity conversation continuity bug in the Manage UI playground where refresh/back-forward/multi-tab recovery during streaming rotates the conversation ID and loses in-progress thread context. All other critical paths were validated as expected—including /run/v1/chat/completions tool-chaining, correct $select behavior (full-payload passthrough when absent, result.-prefix normalization, and actionable type-mismatch guidance), and robust selector hardening against dangerous, overlong, and malformed inputs—while rapid-submit recovery and mobile chained-chat usability remained stable.

✅ Passed (9)
Category Summary Screenshot
Adversarial Dangerous selectors are rejected before execution; no exploitable injection path found. ADV-1
Adversarial Selectors over 1000 characters are rejected with deterministic validation. ADV-2
Adversarial Malformed JMESPath returns a controlled error and follow-up resolution remains stable. ADV-3
Edge Rapid-submit stress remained recoverable: burst submit did not create duplicate run requests, and chat controls recovered after Clear. EDGE-3
Edge Mobile chained chat completed successfully on iPhone-sized viewport and controls remained usable after streaming. EDGE-4
Logic No-$select references correctly preserve full tool/artifact payloads based on code-path verification and regression coverage. LOGIC-1
Logic result.-prefixed selectors normalize correctly and resolve like unprefixed selectors based on code-path verification and regression coverage. LOGIC-2
Logic Type-mismatch selector guidance exists and is actionable; no product bug confirmed for this case. LOGIC-3
Happy-path Verified non-bug: /run/v1/chat/completions and $select resolution passed targeted re-validation. ROUTE-1
ℹ️ Additional Findings (1)

These findings are unrelated to the current changes but were observed during testing.

Category Summary Screenshot
Adversarial 🟠 After mid-stream refresh and multi-tab navigation, conversation continuity was lost even though the UI stayed responsive. ADV-5
🟠 Unusual journey stress: refresh/back-forward and multi-tab retry during stream
  • What failed: The interface remained usable, but the in-progress conversation was not preserved across refresh; expected behavior for this stress case is continuity within the same conversation after recovery actions.
  • Impact: Users can lose active conversation context during refresh recovery flows and cannot reliably continue the same thread. This undermines trust in resilience for long or critical in-progress runs.
  • Steps to reproduce:
    1. Open an agent's Try it panel and send a prompt that starts streaming.
    2. Refresh the tab before the response fully completes.
    3. Open a second tab for the same agent page and perform a back/forward cycle.
    4. Return to the original tab and send a follow-up prompt.
    5. Observe that the prior in-progress conversation context is not preserved.
  • Stub / mock context: The run used a localhost development session shortcut for playground authentication, but this scenario executed chat interactions against the local app without route-level response stubs for the continuity behavior under test.
  • Code analysis: I reviewed the playground lifecycle and conversation wiring in agents-manage-ui. The playground unmount cleanup always resets the conversation ID, the reset action generates a brand-new ID, and the embedded chat is hard-bound to that ID; this directly explains why refresh breaks continuity even when the rest of the UI recovers.
  • Why this is likely a bug: Refresh/unmount deterministically rotates the conversation ID, so continuity loss is a direct result of production UI state logic rather than ambiguous test behavior.

Relevant code:

agents-manage-ui/src/components/agent/playground/playground.tsx (lines 52-55)

useEffect(() => {
  // when the playground is closed the chat widget is unmounted so we need to reset the conversation id
  return () => resetPlaygroundConversationId();
}, []);

agents-manage-ui/src/features/agent/state/use-agent-store.ts (lines 244-246)

resetPlaygroundConversationId() {
  set({ playgroundConversationId: generateId() });
},

agents-manage-ui/src/components/agent/playground/chat-widget.tsx (lines 240-243)

isChatHistoryButtonVisible: false,
isViewOnly: hasHeadersError,
conversationIdOverride: conversationId,
baseUrl: PUBLIC_INKEEP_AGENTS_API_URL,

Commit: a31de87

View Full Run


Tell us how we did: Give Ito Feedback

@tim-inkeep tim-inkeep added this pull request to the merge queue Apr 9, 2026
Merged via the queue into main with commit 4e0f7c4 Apr 9, 2026
26 checks passed
@tim-inkeep tim-inkeep deleted the feature/tool-chaining-selector branch April 9, 2026 19:13
inkeep bot added a commit that referenced this pull request Apr 9, 2026
Documents the new $select parameter for tool chaining references
($tool and $artifact) introduced in PR #3056. Covers JMESPath syntax,
common patterns, and automatic result. prefix stripping.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant