
Conversation


@caozhiyuan caozhiyuan commented Jan 12, 2026

This pull request introduces a new configuration system, structured logging, support for the /v1/responses endpoint, and support for the Claude native Messages API, along with improvements to model selection and request handling. The most important changes are grouped below:

Responses API Integration:

  • Added full support for the /v1/responses endpoint, including a new handler (src/routes/responses/handler.ts) that validates model support, streams or returns results, and logs all activity.
  • Enhanced src/routes/messages/handler.ts to route requests to the Responses API when supported by the selected model, including translation logic for payloads and results.
  • Updated the API documentation in README.md to include the new /v1/responses endpoint and clarify its purpose.
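The routing decision described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the `ModelInfo` shape, the capability list, and `shouldUseResponsesApi` are assumptions standing in for the logic in `src/routes/messages/handler.ts`.

```typescript
// Hypothetical sketch: deciding whether a /v1/messages request must be
// translated and routed through the Responses API instead of being
// proxied to /chat/completions. Shapes and names are illustrative.

interface ModelInfo {
  id: string
  supportedEndpoints: Array<"/chat/completions" | "/responses">
}

function shouldUseResponsesApi(model: ModelInfo): boolean {
  // Codex-family models are only reachable via /responses on Copilot,
  // so requests for them must be translated rather than proxied as-is.
  return (
    model.supportedEndpoints.includes("/responses")
    && !model.supportedEndpoints.includes("/chat/completions")
  )
}
```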

Claude Native Message API:

  • Added support for the Claude native Messages API.

Configuration Management:

  • Added a new src/lib/config.ts module to provide persistent application configuration, including support for model-specific prompts, reasoning effort levels, and default model selection. Configuration is stored in a new config.json file in the app data directory, with automatic creation and safe permissions.
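The create-if-missing-with-safe-permissions idea can be sketched like this. The field names (`defaultModel`, `reasoningEffort`, `extraPrompts`) and the app-data path are assumptions for illustration, not the PR's actual schema:

```typescript
// Minimal sketch of persistent config loading: read config.json from the
// app data directory, creating it with owner-only permissions if missing.
import fs from "node:fs"
import os from "node:os"
import path from "node:path"

interface AppConfig {
  defaultModel?: string
  reasoningEffort?: "low" | "medium" | "high"
  extraPrompts?: Record<string, string>
}

const DEFAULT_DIR = path.join(os.homedir(), ".local", "share", "copilot-api")

function loadConfig(dir: string = DEFAULT_DIR): AppConfig {
  const file = path.join(dir, "config.json")
  if (!fs.existsSync(file)) {
    fs.mkdirSync(dir, { recursive: true })
    // mode 0o600: owner read/write only, since the app data directory
    // may also hold cached credentials.
    fs.writeFileSync(file, JSON.stringify({}, null, 2), { mode: 0o600 })
  }
  return JSON.parse(fs.readFileSync(file, "utf8")) as AppConfig
}
```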

Logging Improvements:

  • Implemented a new buffered, file-based logging utility in src/lib/logger.ts for handler-level logging, with log rotation, retention, and structured output. Integrated this logger into key request handlers for better diagnostics.

Token Counting Logic:

  • Refactored token counting in src/lib/tokenizer.ts to more accurately account for tool calls, array parameters, and model-specific behaviors (including GPT and Anthropic/Grok models). Added support for excluding certain schema keys and improved calculation for nested parameters.
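The schema-key exclusion and nested-parameter counting can be illustrated with a sketch. The excluded-key list and the chars-per-token heuristic below are assumptions, not the exact rules in src/lib/tokenizer.ts:

```typescript
// Illustrative tool-schema token accounting: tool parameter schemas are
// serialized into the prompt, so their keys and values must be counted,
// while bookkeeping-only JSON Schema keys are skipped.

const EXCLUDED_SCHEMA_KEYS = new Set(["$schema", "additionalProperties"])

function countSchemaTokens(value: unknown): number {
  if (typeof value === "string") {
    return Math.ceil(value.length / 4) // rough chars-per-token heuristic
  }
  if (Array.isArray(value)) {
    // Array parameters: count every element, e.g. enum value lists.
    return value.reduce(
      (sum: number, item) => sum + countSchemaTokens(item),
      0,
    )
  }
  if (value !== null && typeof value === "object") {
    let sum = 0
    for (const [key, nested] of Object.entries(value)) {
      if (EXCLUDED_SCHEMA_KEYS.has(key)) continue // skip bookkeeping keys
      sum += countSchemaTokens(key) + countSchemaTokens(nested)
    }
    return sum
  }
  return 1 // numbers, booleans, null
}
```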

Fix Credit Consumption Inconsistency:

  • Fixed inconsistent credit consumption in chat; merged tool_result and text blocks into a single tool_result block so the extra text does not consume additional premium requests.
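The merge described above can be sketched as follows. The block shapes follow the Anthropic Messages API content format; the merge strategy itself (folding loose text into the preceding tool_result) is an assumption about the PR's approach:

```typescript
// Sketch: fold trailing text blocks (e.g. from skill invocations or
// to-do reminders) into the preceding tool_result block, so the turn
// is not billed as a fresh premium request.

type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_result"; tool_use_id: string; content: string }

function mergeTextIntoToolResults(
  blocks: Array<ContentBlock>,
): Array<ContentBlock> {
  const merged: Array<ContentBlock> = []
  for (const block of blocks) {
    const prev = merged[merged.length - 1]
    if (block.type === "text" && prev !== undefined && prev.type === "tool_result") {
      prev.content += "\n" + block.text // append text to prior tool_result
    } else {
      merged.push({ ...block })
    }
  }
  return merged
}
```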

caozhiyuan and others added 27 commits September 27, 2025 13:43
…arsing and align with vscode-copilot-chat extractThinkingData; otherwise it occasionally causes cache misses
…ing small model if no tools are used

2. add bun idleTimeout = 0
3. feat: handle Claude Code JSONL file usage error scenarios, delay closeBlockIfOpen, map the Responses API to Anthropic with tool_use support, and fix spelling errors
4. feat: add configuration management with extra prompt handling and ensure config file creation
…just runServer to set verbose level correctly
…adjusting input token calculations and handling tool prompts
Some clients, like RooCode, may send `service_tier` to the `/responses` endpoint, but Copilot does not support this field and returns an error
@caozhiyuan
Contributor Author

@ericc-ch this also fixes inconsistent credit consumption in chat and adapts Claude Code skill tool_result handling; opencode has already fixed it.

@caozhiyuan
Contributor Author

Also supports the VS Code extension, in case you need it: https://github.com/caozhiyuan/copilot-api/tree/feature/vscode-extension. It does not depend on Bun.

…uming premium requests

(caused by skill invocations, edit hooks, or to-do reminders)
@getaaron

@ericc-ch this looks like a great improvement, can you please merge?

cuipengfei and others added 4 commits January 18, 2026 18:33
GitHub Copilot's Responses API returns different IDs for the same item
in 'added' vs 'done' events, which causes @ai-sdk/openai to throw errors:
- 'activeReasoningPart.summaryParts' undefined
- 'text part not found'

This fix:
- Tracks IDs from 'added' events and reuses them in 'done' events
- Removes empty summary arrays from reasoning items that cause AI SDK parsing issues
- Handles output_item, content_part, output_text, and response.completed events
- Synchronizes item_id for message-type outputs across all related events
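The ID-sync fix described above can be condensed into a small stateful transform. Event and field names below are simplified from the Responses API streaming format, and the function is an illustration of the technique, not the PR's code:

```typescript
// Sketch: remember the item ID from each '*.added' stream event and
// overwrite the ID in the matching '*.done' event, so consumers like
// @ai-sdk/openai see one consistent ID per output item. Also drop
// empty summary arrays, which trip up the AI SDK's reasoning parser.

interface StreamEvent {
  type: string
  output_index: number
  item: { id: string; summary?: Array<unknown> }
}

function makeIdSynchronizer() {
  const idsByIndex = new Map<number, string>()
  return (event: StreamEvent): StreamEvent => {
    if (event.type === "response.output_item.added") {
      idsByIndex.set(event.output_index, event.item.id)
    } else if (event.type === "response.output_item.done") {
      const id = idsByIndex.get(event.output_index)
      if (id) event.item.id = id // reuse the ID from the 'added' event
    }
    if (event.item.summary?.length === 0) delete event.item.summary
    return event
  }
}
```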
… API, simpler version

* fix: sync stream IDs for @ai-sdk/openai compatibility with Responses API

* simpler version of #72
@hgcode1130

We need wire_api = "responses". Hope this will be merged soon.

@FlorianBruniaux

✅ Successfully tested with Claude Code CLI

Thanks @caozhiyuan for this excellent work on the /responses endpoint support!

We extensively tested your fork with our project cc-copilot-bridge, a multi-provider wrapper for Claude Code CLI that switches between Anthropic, GitHub Copilot, and Ollama.

Test Results: 6/6 Passed ✅

| Model | Test | Result |
| --- | --- | --- |
| gpt-5.2-codex | Simple prompt | ✅ Pass (Extended Thinking works!) |
| gpt-5.1-codex | Simple prompt | ✅ Pass |
| gpt-5.1-codex-mini | Simple prompt | ✅ Pass |
| gpt-5.1-codex-max | Simple prompt | ✅ Pass |
| gpt-5 (regression) | Simple prompt | ✅ Pass |
| claude-sonnet-4.5 (regression) | Simple prompt | ✅ Pass |

What we tested

  • All 5 Codex models work without the `400: not accessible via /chat/completions` error
  • No regressions on existing models (Claude, GPT-5, GPT-4.1, Gemini)
  • Extended Thinking feature works on gpt-5.2-codex (premium feature)
  • Response times: 1-5 seconds (comparable to non-Codex models)

Our setup

We created a fork launcher script that:

  1. Clones your branch automatically
  2. Builds with `bun install && bun run build`
  3. Runs the proxy on port 4141
  4. Auto-detects when this PR is merged to switch back to official release

Script: launch-responses-fork.sh

Recommendation

Strongly recommend merging this PR. It unlocks all Codex models for Claude Code users via Copilot, which is a significant improvement.

We've documented our findings in detail here:

Thanks again for the great work! 🚀

FlorianBruniaux added a commit to FlorianBruniaux/cc-copilot-bridge that referenced this pull request Jan 23, 2026
- CHANGELOG.md: Add v1.5.0 section documenting Codex models via fork
- README.md: Add "GPT Codex Models" section with setup instructions
- CLAUDE.md: Update Model Compatibility Matrix (Codex now supported)
- scripts/VERSION: Bump to 1.5.0

PR tracking: ericc-ch/copilot-api#170

Co-Authored-By: Claude <noreply@anthropic.com>
@caozhiyuan caozhiyuan changed the title support responses api (openAI's new generation API supports model thinking) , fix inconsistent credit consumption in chat and adapter claude code skill tool_result support responses api (openAI's new generation API supports model thinking) , support native message-api, fix inconsistent credit consumption in chat Jan 25, 2026
@caozhiyuan caozhiyuan changed the title support responses api (openAI's new generation API supports model thinking) , support native message-api, fix inconsistent credit consumption in chat support responses api , support native message-api, fix inconsistent credit consumption in chat Jan 25, 2026
@zhujian0805

Nice feature, I have been testing it and it works as expected.
I have also dockerized it; this is my repo: https://github.com/Chat2AnyLLM/copilot-api-nginx-proxy.git

