feat: Add Node.js Claude service for official SDK access #203
marccampbell wants to merge 1 commit into main
Conversation
Phase 1 of issue #202 - Extract LLM calls to Node.js Claude service. This PR adds:

## New: claude-service (Node.js)
- Express server wrapping the official @anthropic-ai/sdk
- Endpoints:
  - POST /v1/messages (non-streaming)
  - POST /v1/messages/stream (SSE streaming)
  - POST /v1/messages/think (extended thinking)
  - GET /health
- Dockerfile for containerized deployment
- Full TypeScript with Zod validation

## New: Go Claude client (pkg/llm/claude/)
- HTTP client to call the Node service
- Supports streaming via SSE parsing
- StreamMessage() and StreamMessageWithResponse() methods

## Updated: pkg/llm/expand.go
- Now checks the CLAUDE_SERVICE_URL env var
- If set, routes through the Node service
- Otherwise falls back to the direct Anthropic SDK

## Usage
Set `CLAUDE_SERVICE_URL=http://localhost:3100` to enable.

Next steps (future PRs):
- Migrate remaining streaming functions
- Add docker-compose integration
- Enable prompt caching
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```ts
  res.end();
} catch (err) {
  next(err);
}
```
Extended thinking endpoint sends JSON after SSE headers
High Severity
The /v1/messages/think endpoint sets SSE headers (lines 141-144), but errors are passed to next(err) (line 178) which invokes the error middleware that attempts to send JSON via res.status().json(). After SSE headers are sent, this causes a "headers already sent" error or malformed response. The handleStreaming function correctly handles this by catching errors and sending them as SSE events instead.
```go
defer close(errCh)
defer resp.Body.Close()

scanner := bufio.NewScanner(resp.Body)
```
StreamMessage missing buffer increase causes large response failures
Medium Severity
StreamMessage uses the default bufio.Scanner with a 64KB max line size, while StreamMessageWithResponse explicitly increases the buffer to 1MB. The scanner reads all SSE lines including message_stop events (containing the full response JSON), which can exceed 64KB for longer outputs. This causes StreamMessage to fail with a scanner error on larger responses, even though it only processes small text deltas.
Converting PR to Draft because this is still in progress.


Summary
Phase 1 implementation of #202 — Extract LLM calls to Node.js Claude service.
This PR adds the foundation for routing Claude API calls through a Node.js service that uses the official Anthropic SDK, enabling access to SDK-exclusive features like prompt caching and extended thinking.
Changes
New: claude-service/ (Node.js)
- Express server wrapping the official @anthropic-ai/sdk
- POST /v1/messages — non-streaming completion
- POST /v1/messages/stream — SSE streaming
- POST /v1/messages/think — extended thinking (Claude 3.7+)
- GET /health — health check

New: Go Claude client (pkg/llm/claude/)
- StreamMessage() returns a text channel
- StreamMessageWithResponse() returns text + final response

Updated: pkg/llm/expand.go
- Routes through the Node service when the CLAUDE_SERVICE_URL environment variable is set; otherwise falls back to the direct Anthropic SDK

Usage
Set CLAUDE_SERVICE_URL=http://localhost:3100 to enable.

Testing
- TypeScript type check passes (npm run typecheck)
- Go build passes (go build ./...)

Next Steps (Future PRs)
- Migrate remaining streaming functions
- Add docker-compose integration
- Enable prompt caching

Closes phase 1 of #202