@codyde
Contributor

@codyde codyde commented Oct 1, 2025

Summary

Adds Sentry tracing instrumentation for the @anthropic-ai/claude-agent-sdk (Claude Code Agent SDK) following OpenTelemetry Semantic Conventions for Generative AI.

This integration enables AI monitoring for Claude Code agents with comprehensive telemetry:

  • Agent invocation spans (gen_ai.invoke_agent)
  • LLM chat spans (gen_ai.chat)
  • Tool execution spans (gen_ai.execute_tool)
  • Token usage tracking (including cache metrics)
  • Model info and session tracking
  • Optional input/output recording (respects sendDefaultPii)

Implementation

Uses automatic OpenTelemetry instrumentation via import-in-the-middle hooks - the same pattern as other AI integrations (Anthropic, OpenAI, Vercel AI, etc.). When the integration is added, it automatically patches the query function from @anthropic-ai/claude-agent-sdk.

Important: Sentry must be initialized before importing @anthropic-ai/claude-agent-sdk for auto-instrumentation to work.
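The ordering requirement can be pictured with a simplified model. This is only a sketch: it does not use import-in-the-middle, and `sdkModule`, `patchQuery`, and `calls` are hypothetical stand-ins. The point it shows is that a reference obtained before patching bypasses the wrapper, which is analogous to importing the SDK before Sentry is initialized.

```typescript
type QueryFn = (prompt: string) => string;

// Hypothetical stand-in for the SDK module's exports object.
const sdkModule: { query: QueryFn } = {
  query: (prompt) => `response to ${prompt}`,
};

// Records which prompts went through the patched function.
const calls: string[] = [];

// Replace the export with a wrapper that delegates to the original.
function patchQuery(mod: { query: QueryFn }): void {
  const original = mod.query;
  mod.query = (prompt) => {
    calls.push(prompt); // a real integration would start a span here
    return original(prompt);
  };
}

// Reference captured BEFORE patching (like importing the SDK too early).
const earlyQuery = sdkModule.query;

patchQuery(sdkModule);

// Reference captured AFTER patching.
const lateQuery = sdkModule.query;

earlyQuery('a'); // bypasses the wrapper, nothing is recorded
lateQuery('b'); // goes through the wrapper and is recorded
```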

Usage

// Initialize Sentry FIRST
import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: 'your-dsn',
  integrations: [
    Sentry.claudeCodeAgentSdkIntegration({
      recordInputs: true,
      recordOutputs: true,
      agentName: 'my-coding-assistant', // optional
    }),
  ],
});

// THEN import the SDK - it will be automatically instrumented!
import { query } from '@anthropic-ai/claude-agent-sdk';

// Use query as normal - spans are created automatically
for await (const message of query({
  prompt: 'Hello',
  options: { model: 'claude-sonnet-4-20250514' },
})) {
  console.log(message);
}

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| recordInputs | boolean | sendDefaultPii | Whether to record prompt messages |
| recordOutputs | boolean | sendDefaultPii | Whether to record response text, tool calls, and outputs |
| agentName | string | 'claude-code' | Custom agent name for span identification |
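
As a rough sketch of how these defaults could be resolved (not necessarily how the PR implements it), explicit options win and unset recording flags fall back to `sendDefaultPii`. `resolveOptions` and `ResolvedOptions` are hypothetical names; the option names and defaults come from the table above.

```typescript
interface ClaudeCodeOptions {
  recordInputs?: boolean;
  recordOutputs?: boolean;
  agentName?: string;
}

type ResolvedOptions = Required<ClaudeCodeOptions>;

// Hypothetical helper: fill in defaults for anything the user left unset.
function resolveOptions(options: ClaudeCodeOptions, sendDefaultPii: boolean): ResolvedOptions {
  return {
    recordInputs: options.recordInputs ?? sendDefaultPii,
    recordOutputs: options.recordOutputs ?? sendDefaultPii,
    agentName: options.agentName ?? 'claude-code',
  };
}

// No explicit flags and sendDefaultPii off: nothing is recorded.
const defaults = resolveOptions({}, false);

// An explicit flag overrides the sendDefaultPii fallback.
const explicit = resolveOptions({ recordInputs: true, agentName: 'my-coding-assistant' }, false);
```

Using `??` rather than `||` keeps an explicit `recordInputs: false` from being overridden when `sendDefaultPii` is true.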

Captured Telemetry

Span Hierarchy

invoke_agent claude-code
├── chat claude-sonnet-4-20250514
│   ├── execute_tool Read
│   └── execute_tool Bash
├── chat claude-sonnet-4-20250514
│   └── execute_tool WebSearch
└── ...

Attributes (OpenTelemetry GenAI Semantic Conventions)

  • gen_ai.system: anthropic
  • gen_ai.operation.name: invoke_agent | chat | execute_tool
  • gen_ai.agent.name: Custom or claude-code
  • gen_ai.request.model: Model identifier
  • gen_ai.request.available_tools: Available tools from system init
  • gen_ai.response.id: Response/session ID
  • gen_ai.response.model: Actual model used
  • gen_ai.response.finish_reasons: Stop reason
  • gen_ai.response.text: Response text (when recordOutputs: true)
  • gen_ai.response.tool_calls: Tool calls made (when recordOutputs: true)
  • gen_ai.tool.name: Tool name (e.g., Read, Bash, WebSearch)
  • gen_ai.tool.type: function | extension | datastore
  • gen_ai.tool.input: Tool input (when recordInputs: true)
  • gen_ai.tool.output: Tool output (when recordOutputs: true)
  • gen_ai.usage.input_tokens: Input token count
  • gen_ai.usage.output_tokens: Output token count
  • gen_ai.usage.cache_creation_input_tokens: Cache creation tokens
  • gen_ai.usage.cache_read_input_tokens: Cache read tokens
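
A minimal sketch of how the four usage attributes above might be totalled across several chat turns, for example to report a sum on the invoke_agent span. It assumes each assistant message carries a usage object with Anthropic-style field names (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`); `Usage` and `accumulateUsage` are hypothetical names.

```typescript
// Assumed shape of a per-message usage object (all fields optional).
interface Usage {
  input_tokens?: number;
  output_tokens?: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

// Hypothetical helper: sum usage across turns, keyed by span attribute name.
function accumulateUsage(usages: Usage[]): Record<string, number> {
  const totals = {
    'gen_ai.usage.input_tokens': 0,
    'gen_ai.usage.output_tokens': 0,
    'gen_ai.usage.cache_creation_input_tokens': 0,
    'gen_ai.usage.cache_read_input_tokens': 0,
  };
  for (const usage of usages) {
    totals['gen_ai.usage.input_tokens'] += usage.input_tokens ?? 0;
    totals['gen_ai.usage.output_tokens'] += usage.output_tokens ?? 0;
    totals['gen_ai.usage.cache_creation_input_tokens'] += usage.cache_creation_input_tokens ?? 0;
    totals['gen_ai.usage.cache_read_input_tokens'] += usage.cache_read_input_tokens ?? 0;
  }
  return totals;
}

const totals = accumulateUsage([
  { input_tokens: 10, output_tokens: 5, cache_read_input_tokens: 100 },
  { input_tokens: 20, output_tokens: 7 },
]);
```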

Tool Type Classification

  • Function tools (client-side execution): Bash, Read, Write, Edit, Glob, Grep, Task, TodoWrite, NotebookEdit, SlashCommand, AskUserQuestion, Skill, etc.
  • Extension tools (external APIs): WebSearch, WebFetch, ListMcpResources, ReadMcpResource
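
The classification above could be sketched as a small lookup that treats the listed external-API tools as `extension` and everything else as `function`. `classifyTool` and `EXTENSION_TOOLS` are hypothetical names, the default-to-function behavior is an assumption, and the `datastore` type is not modeled here.

```typescript
type ToolType = 'function' | 'extension';

// Tools listed above as calling external APIs.
const EXTENSION_TOOLS: ReadonlySet<string> = new Set([
  'WebSearch',
  'WebFetch',
  'ListMcpResources',
  'ReadMcpResource',
]);

// Hypothetical helper: anything not known to be an extension tool is
// assumed to run client-side and is classified as a function tool.
function classifyTool(toolName: string): ToolType {
  return EXTENSION_TOOLS.has(toolName) ? 'extension' : 'function';
}

const bashType = classifyTool('Bash');
const searchType = classifyTool('WebSearch');
```

Keeping only the shorter extension list means newly added client-side tools are classified without any code change.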

Manual Instrumentation

For advanced use cases where auto-instrumentation is not suitable, patchClaudeCodeQuery is exported for manual patching:

import { patchClaudeCodeQuery } from '@sentry/node';
import { query } from '@anthropic-ai/claude-agent-sdk';

const instrumentedQuery = patchClaudeCodeQuery(query, {
  recordInputs: true,
  recordOutputs: true,
  agentName: 'my-agent',
});

Testing

Includes comprehensive integration tests covering:

  • Basic agent invocation with default PII settings
  • Input/output recording with sendDefaultPii: true
  • Custom recordInputs/recordOutputs options
  • Tool execution spans (function and extension types)
  • Error handling and span status

@codyde codyde marked this pull request as draft October 1, 2025 23:17
@github-actions
Contributor

github-actions bot commented Oct 1, 2025

size-limit report 📦

| Path | Size | % Change | Change |
| --- | --- | --- | --- |
| @sentry/browser | 25.2 kB | - | - |
| @sentry/browser - with treeshaking flags | 23.71 kB | - | - |
| @sentry/browser (incl. Tracing) | 42.02 kB | - | - |
| @sentry/browser (incl. Tracing, Profiling) | 46.66 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) | 80.63 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 70.28 kB | - | - |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 85.32 kB | - | - |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 97.53 kB | - | - |
| @sentry/browser (incl. Feedback) | 41.92 kB | - | - |
| @sentry/browser (incl. sendFeedback) | 29.89 kB | - | - |
| @sentry/browser (incl. FeedbackAsync) | 34.89 kB | - | - |
| @sentry/browser (incl. Metrics) | 26.31 kB | - | - |
| @sentry/browser (incl. Logs) | 26.46 kB | - | - |
| @sentry/browser (incl. Metrics & Logs) | 27.11 kB | - | - |
| @sentry/react | 26.93 kB | - | - |
| @sentry/react (incl. Tracing) | 44.26 kB | - | - |
| @sentry/vue | 29.64 kB | - | - |
| @sentry/vue (incl. Tracing) | 43.82 kB | - | - |
| @sentry/svelte | 25.22 kB | - | - |
| CDN Bundle | 27.78 kB | - | - |
| CDN Bundle (incl. Tracing) | 42.83 kB | - | - |
| CDN Bundle (incl. Tracing, Logs, Metrics) | 43.65 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) | 79.53 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 84.97 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) | 85.89 kB | - | - |
| CDN Bundle - uncompressed | 81.27 kB | - | - |
| CDN Bundle (incl. Tracing) - uncompressed | 126.81 kB | - | - |
| CDN Bundle (incl. Tracing, Logs, Metrics) - uncompressed | 129.65 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 243.35 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 256.15 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback, Logs, Metrics) - uncompressed | 258.96 kB | - | - |
| @sentry/nextjs (client) | 46.62 kB | - | - |
| @sentry/sveltekit (client) | 42.39 kB | - | - |
| @sentry/node-core | 51.9 kB | - | - |
| @sentry/node | 166.93 kB | +0.9% | +1.48 kB 🔺 |
| @sentry/node - without tracing | 93.66 kB | - | - |
| @sentry/aws-serverless | 109.16 kB | - | - |

View base workflow run

@github-actions
Contributor

github-actions bot commented Oct 1, 2025

node-overhead report 🧳

Note: This is a synthetic benchmark with a minimal express app and does not necessarily reflect the real-world performance impact in an application.

| Scenario | Requests/s | % of Baseline | Prev. Requests/s | Change % |
| --- | --- | --- | --- | --- |
| GET Baseline | 11,409 | - | 8,944 | +28% |
| GET With Sentry | 1,913 | 17% | 1,786 | +7% |
| GET With Sentry (error only) | 7,576 | 66% | 6,223 | +22% |
| POST Baseline | 1,164 | - | 1,206 | -3% |
| POST With Sentry | 577 | 50% | 597 | -3% |
| POST With Sentry (error only) | 1,025 | 88% | 1,064 | -4% |
| MYSQL Baseline | 3,972 | - | 3,304 | +20% |
| MYSQL With Sentry | 463 | 12% | 513 | -10% |
| MYSQL With Sentry (error only) | 3,277 | 83% | 2,667 | +23% |

View base workflow run

@codyde codyde changed the title feat(node): Add Claude Code Agent SDK instrumentation feat(agent-monitoring): Add Claude Code Agent SDK instrumentation Oct 2, 2025
@codyde codyde changed the title feat(agent-monitoring): Add Claude Code Agent SDK instrumentation feat(javascript): Add Claude Code Agent SDK instrumentation Oct 2, 2025
@RulaKhaled RulaKhaled assigned RulaKhaled and unassigned RulaKhaled Oct 6, 2025
@RulaKhaled RulaKhaled self-requested a review October 6, 2025 08:17
Member

@RulaKhaled RulaKhaled left a comment

Thanks for working on this! For the first pass, the biggest lift here is to patch the functions we need automatically instead of asking users to import a patched method. After that, we can move on to tackling the other TODOs you have.

@codyde
Contributor Author

codyde commented Oct 6, 2025

Thanks SO much for all these. I'll get started on them.

I tried REALLY hard to figure out how to hook into the existing query, and I couldn't get it to work no matter what I tried. I'll chat with you in Slack about it, but I'd love some advice or guidance. I tried a bunch of different angles, but each time I ran into what were effectively timing issues where we couldn't hook in fast enough. It felt like a limitation of how Claude Code's SDK works, but it could be a total skill issue on my side.

@RulaKhaled
Member

Hello @codyde, are you still working on this? If not, let's close this; we're trying to clean up stale PRs.

@codyde
Contributor Author

codyde commented Oct 20, 2025

I definitely am! I pushed a few more commits today with fixes for some of the other items you mentioned, but I'm struggling to get through this proxy one. I might need some pairing time to look at it together, since I'm less familiar with that functionality.

@codyde codyde force-pushed the claude-code-agent-instrumentation branch from 4df75cc to 6e267e0 Compare November 26, 2025 07:42
@codyde codyde force-pushed the claude-code-agent-instrumentation branch from 6e267e0 to 3f69bfd Compare December 28, 2025 06:16
@codyde codyde marked this pull request as ready for review December 28, 2025 06:18
@codyde codyde changed the title feat(javascript): Add Claude Code Agent SDK instrumentation feat(node): Add Claude Code Agent SDK instrumentation Dec 28, 2025
Member

@RulaKhaled RulaKhaled left a comment

This looks a lot better! Thank you :)

I’ve left some comments, and I’ll continue the review next week—mainly to make sure we’re not missing any attributes and that the tests are passing. We’re so close to getting this in.

Contributor Author

@codyde codyde left a comment

Accidental comment

Member

@RulaKhaled RulaKhaled left a comment

OK, can we fix the lint issues and tests and resolve the Cursor comments? Meanwhile, I will test this locally with both ESM and CJS.

codyde and others added 23 commits January 21, 2026 08:40
…umentation

- Add OpenTelemetry-based automatic instrumentation via SentryClaudeCodeAgentSdkInstrumentation
- Extract ClaudeCodeOptions to dedicated types.ts file
- Remove backwards compatibility exports (patchClaudeCodeQuery, createInstrumentedClaudeQuery)
- Rename integration to claudeCodeAgentSdkIntegration
- Register instrumentation in OTEL preload for automatic patching
- Update NextJS re-exports to match simplified API

Users now only need:
```typescript
Sentry.init({ integrations: [Sentry.claudeCodeAgentSdkIntegration()] });
import { query } from '@anthropic-ai/claude-agent-sdk'; // Auto-instrumented
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…ec compliance

- Fix GEN_AI_SYSTEM_ATTRIBUTE to use 'anthropic' per OpenTelemetry semantic conventions
- Add GEN_AI_REQUEST_AVAILABLE_TOOLS_ATTRIBUTE for capturing available tools from system init
- Add GEN_AI_RESPONSE_FINISH_REASONS_ATTRIBUTE for tracking stop_reason
- Use getTruncatedJsonString for proper payload truncation in span attributes
- Expand tool categorization with new tools (KillBash, EnterPlanMode, AskUserQuestion, Skill, MCP tools)
- Add better error metadata with function name in mechanism data
- Export patchClaudeCodeQuery for manual instrumentation use cases
- Add comprehensive integration tests for Claude Code Agent SDK instrumentation
Follow best practice pattern used by other AI integrations (OpenAI, Anthropic)
where options are passed directly to instrumentClaudeCodeAgentSdk() rather
than exposed on the integration object.
- Remove patchClaudeCodeQuery from public exports to match pattern of other
  AI integrations (OpenAI, Anthropic, etc.) which only expose the integration
- Change SENTRY_ORIGIN from 'auto.ai.claude-code' to 'auto.ai.claude_code'
  to follow Sentry naming conventions (underscores instead of hyphens)
- Update integration tests to match new origin naming
- Remove claudeCodeAgentSdkIntegration from nextjs index.types.ts
- Rename otel-instrumentation.ts to instrumentation.ts
- Rename instrumentation.ts to helpers.ts (matches other AI integrations)
- Fix prompt capture: use 'prompt' instead of 'inputMessages' per SDK API
- Add cache token attribute support for tracking cached/cache_write tokens
- Export new cache attributes from @sentry/core
- Fix startSpanManual usage to follow OpenAI/Anthropic pattern:
  - Use regular callback that returns the generator
  - Separate generator function handles span lifecycle in finally block
- Add accumulative token usage on invoke_agent span
- Clean up console.logs in test scenario
…helper

- Add patchClaudeCodeQuery to scenario-tools.mjs and scenario-errors.mjs
- Export patchClaudeCodeQuery from index.ts for test usage
- Replace Math.random() with deterministic session ID in mock-server.mjs
…us propagation

- Each assistant message now creates its own chat span instead of merging multiple turns
- Token usage is recorded on each individual chat span from the assistant message
- Child spans (currentLLMSpan, previousLLMSpan) now inherit error status when parent fails
- Moved token accumulation from result message to assistant message handling
- Export patchClaudeCodeQuery from packages/node/src/index.ts for public API access
- Add handling for 'error' type messages from Claude Code SDK in the generator loop
- Error messages now properly set encounteredError flag, capture exception, and set span status
Don't overwrite span error status with success at end of try block.
The unconditional span.setStatus({ code: 1 }) was overwriting any error
status set when processing 'error' type messages from the SDK.
Prevent calling setStatus() and end() on already-ended spans when
handling multiple assistant messages. This matches the pattern used
in the finally block for consistency.
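
The two fixes above (not overwriting an error status with success, and not touching an already-ended span) can be sketched with a stand-in span class. This is an illustration only: `FakeSpan` and `finalizeSpanOnce` are hypothetical and are not the Sentry span API or the PR's helper.

```typescript
type Status = 'unset' | 'ok' | 'error';

// Minimal stand-in for a span; counts end() calls so double-ending is visible.
class FakeSpan {
  status: Status = 'unset';
  ended = false;
  endCalls = 0;

  setStatus(status: Status): void {
    this.status = status;
  }

  end(): void {
    this.ended = true;
    this.endCalls += 1;
  }
}

// Hypothetical helper: end a span at most once and only mark it successful
// when no status (such as an error) has been set already.
function finalizeSpanOnce(span: FakeSpan): void {
  if (span.ended) return;
  if (span.status === 'unset') span.setStatus('ok');
  span.end();
}

const failed = new FakeSpan();
failed.setStatus('error'); // e.g. set while handling an 'error' message
finalizeSpanOnce(failed);
finalizeSpanOnce(failed); // second call is a no-op

const succeeded = new FakeSpan();
finalizeSpanOnce(succeeded);
```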
The invoke_agent span now captures the prompt parameter directly at span
creation time when recordInputs is true. Previously, gen_ai.request.messages
was only set from conversation_history in system messages, which doesn't
exist in the real Claude Agent SDK (only prompt and options are accepted).

This ensures the user's input is properly recorded on the invoke_agent span
in production, matching how other AI SDK integrations (like LangGraph) handle
input capture.
… Code SDK

- Use original error object when handling SDK error messages to preserve
  stack trace instead of creating new Error with just the message
- Add scenario-with-options.mjs that explicitly passes recordInputs/recordOutputs
  options to patchClaudeCodeQuery for proper test coverage
- Simplify getToolType by removing large functionTools Set
- Extract finalizeLLMSpan helper to reduce code duplication
- Remove redundant cache attribute setting (already handled by setTokenUsageAttributes)
- Trim verbose JSDoc comments
- Update size limit from 162 KB to 163 KB for new instrumentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add copyPaths option to copy mock-server.mjs to temp directory
- Simplify error scenario to test single error case
- Split tool tests into separate scenarios for function and extension tools
- Fix test expectations to check correct span locations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The setTokenUsageAttributes function was accepting cache token parameters
but only using them to calculate total tokens. Now it also sets them as
individual span attributes (gen_ai.usage.input_tokens.cache_write and
gen_ai.usage.input_tokens.cached).

Also updates .size-limit.js to use develop branch value.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@codyde codyde force-pushed the claude-code-agent-instrumentation branch from 338c488 to a2b9371 Compare January 21, 2026 16:42
- Add missing exports to astro, aws-serverless, bun, and google-cloud-serverless packages
- Fix CJS/ESM compatibility in integration tests by adding mock-server.cjs and updating runner to convert .mjs imports to .cjs
- Optimize helpers.ts to reduce bundle size with shared status constants and helper functions
The develop branch is already at 166.01 KB (over the 166 KB limit).
Claude Code instrumentation adds ~1.4 KB, bringing the total to 167.47 KB.
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

The size optimizations (short variable names, helper functions) only saved
~10 bytes after gzip compression but hurt readability.
…s errors

When an error-type message is received from the SDK, we capture the exception
and set encounteredError=true. If the consumer then re-throws that error,
the catch block would capture it again. Now we check encounteredError before
capturing in the catch block to avoid duplicate reports in Sentry.
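
A simplified, synchronous model of that guard, assuming an `encounteredError` flag as described in the commit message. `forwardMessages`, `captureException`, and the `Message` type here are hypothetical stand-ins, and the real code works on an async generator rather than an array.

```typescript
type Message =
  | { type: 'assistant'; text: string }
  | { type: 'error'; error: Error };

// Stand-in for Sentry's capture call: just records what was reported.
const captured: unknown[] = [];
const captureException = (e: unknown): void => {
  captured.push(e);
};

// Hypothetical helper: report error messages once, even if the consumer
// re-throws the same error and it reaches the catch block.
function forwardMessages(messages: Message[], consumer: (m: Message) => void): void {
  let encounteredError = false;
  try {
    for (const message of messages) {
      if (message.type === 'error') {
        encounteredError = true;
        captureException(message.error);
      }
      consumer(message);
    }
  } catch (e) {
    // Skip the capture if an error-type message was already reported.
    if (!encounteredError) captureException(e);
    throw e;
  }
}

const boom = new Error('tool failed');
let rethrown: unknown;
try {
  forwardMessages(
    [{ type: 'assistant', text: 'hi' }, { type: 'error', error: boom }],
    (m) => {
      if (m.type === 'error') throw m.error; // consumer re-throws the SDK error
    },
  );
} catch (e) {
  rethrown = e;
}
```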