
feat: Add Node.js Claude service for official SDK access #203

Draft
marccampbell wants to merge 1 commit into main from chartsmith/claude-node-service

Conversation

@marccampbell (Member)

Summary

Phase 1 implementation of #202 — Extract LLM calls to Node.js Claude service.

This PR adds the foundation for routing Claude API calls through a Node.js service that uses the official Anthropic SDK, enabling access to SDK-exclusive features like prompt caching and extended thinking.

Changes

New: claude-service/ (Node.js)

  • Express server wrapping official @anthropic-ai/sdk
  • Endpoints:
    • POST /v1/messages — non-streaming completion
    • POST /v1/messages/stream — SSE streaming
    • POST /v1/messages/think — extended thinking (Claude 3.7+)
    • GET /health — health check
  • Dockerfile for containerized deployment
  • Full TypeScript with Zod request validation

New: Go Claude client (pkg/llm/claude/)

  • HTTP client to call the Node service
  • Supports streaming via SSE parsing
  • StreamMessage() returns text channel
  • StreamMessageWithResponse() returns text + final response

Updated: pkg/llm/expand.go

  • Checks CLAUDE_SERVICE_URL environment variable
  • If set, routes through Node service
  • Otherwise falls back to direct Anthropic Go SDK

Usage

# Start the Node service
cd claude-service && npm install && npm run dev

# In another terminal, run Go worker with service URL
export CLAUDE_SERVICE_URL=http://localhost:3100
make run-worker

Testing

  • TypeScript compiles (npm run typecheck)
  • Go compiles (go build ./...)
  • Integration test with live API (needs API key)

Next Steps (Future PRs)

  • Migrate remaining streaming functions (conversational.go, execute-action.go, etc.)
  • Add docker-compose integration
  • Enable prompt caching for system prompts
  • Add extended thinking support

Closes phase 1 of #202


@cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

```ts
res.end();
} catch (err) {
  next(err);
}
```


Extended thinking endpoint sends JSON after SSE headers

High Severity

The /v1/messages/think endpoint sets SSE headers (lines 141-144), but errors are passed to next(err) (line 178) which invokes the error middleware that attempts to send JSON via res.status().json(). After SSE headers are sent, this causes a "headers already sent" error or malformed response. The handleStreaming function correctly handles this by catching errors and sending them as SSE events instead.

Additional Locations (1)


```go
defer close(errCh)
defer resp.Body.Close()

scanner := bufio.NewScanner(resp.Body)
```


StreamMessage missing buffer increase causes large response failures

Medium Severity

StreamMessage uses the default bufio.Scanner with a 64KB max line size, while StreamMessageWithResponse explicitly increases the buffer to 1MB. The scanner reads all SSE lines including message_stop events (containing the full response JSON), which can exceed 64KB for longer outputs. This causes StreamMessage to fail with a scanner error on larger responses, even though it only processes small text deltas.

Additional Locations (1)


@scottrigby scottrigby marked this pull request as draft February 4, 2026 15:36
@scottrigby (Contributor)

Converting PR to Draft because this is still in progress.
