2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,8 @@

## Unreleased

- (OpenAI Chat) - Configurable reasoning history via `keepHistoryReasoning` (model-level, default: prune)

Member:

I wonder if this default for all openai-chat is a good idea, or if we should have it enabled only for the google provider in config.clj. Also, shouldn't this be a flag per provider and not per model?

Member Author (@zikajk, Jan 15, 2026):

Based on what I've found in OpenCode and the OpenAI SDK, this is the default for every model.

  1. OpenCode transforms `<think>` blocks into messages of type `reasoning`, and that type is ignored by the OpenAI SDK and never reaches the LLM.
  2. Delta reasoning messages are usually persisted for one turn and then thrown away (as you can see in the GLM docs below, first vs. second diagram).

It shouldn't be per provider because you might want to configure it only for, e.g., GLM-4.7 (second diagram here -> https://docs.z.ai/guides/capabilities/thinking-mode).
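
For illustration, a minimal Clojure sketch of the turn-local behavior described in point 2, using the message shapes from this PR's tests (the `:delta-reasoning?` flag marks DeepSeek-style `reasoning_content`; the contents are illustrative):

```clojure
;; Turn 1: the delta reasoning is kept alongside the answer it produced.
[{:role "user"      :content "Q1"}
 {:role "reason"    :content {:text "r1" :delta-reasoning? true}}
 {:role "assistant" :content "A1"}]

;; Turn 2: a new user message arrives; the old reasoning now sits before the
;; last user message, so it is pruned before the next request is sent.
[{:role "user"      :content "Q1"}
 {:role "assistant" :content "A1"}
 {:role "user"      :content "Q2"}]
```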

Member:

Ok, let's keep an eye on whether it affects any openai-chat model. I still think it's a bit of a dangerous move, but if the OpenAI SDK is really doing that, it may be ok.


## 0.92.1

- Add `x-llm-application-name: eca` to prompt requests, useful to track and get metrics when using LLM gateways.
3 changes: 2 additions & 1 deletion docs/configuration.md
@@ -673,7 +673,8 @@ To configure, add your OTLP collector config via `:otlp` map following [otlp aut
thinkTagEnd?: string;
models: {[key: string]: {
modelName?: string;
extraPayload?: {[key: string]: any}
extraPayload?: {[key: string]: any};
keepHistoryReasoning?: boolean;
}};
}};
defaultModel?: string;
56 changes: 42 additions & 14 deletions docs/models.md
@@ -61,19 +61,20 @@ You just need to add your provider to `providers` and make sure to add the required

Schema:

| Option | Type | Description | Required |
|-------------------------------|--------|--------------------------------------------------------------------------------------------------------------|----------|
| `api` | string | The API schema to use (`"openai-responses"`, `"openai-chat"`, or `"anthropic"`) | Yes |
| `url` | string | API URL (with support for env like `${env:MY_URL}`) | No* |
| `key` | string | API key (with support for `${env:MY_KEY}` or `{netrc:api.my-provider.com}` | No* |
| `completionUrlRelativePath` | string | Optional override for the completion endpoint path (see defaults below and examples like Azure) | No |
| `thinkTagStart` | string | Optional override the think start tag tag for openai-chat (Default: "<think>") api | No |
| `thinkTagEnd` | string | Optional override the think end tag for openai-chat (Default: "</think>") api | No |
| `httpClient` | map | Allow customize the http-client for this provider requests, like changing http version | No |
| `models` | map | Key: model name, value: its config | Yes |
| `models <model> extraPayload` | map | Extra payload sent in body to LLM | No |
| `models <model> modelName` | string | Override model name, useful to have multiple models with different configs and names that use same LLM model | No |
| `fetchModels` | boolean | Enable automatic model discovery from `/models` endpoint (OpenAI-compatible providers) | No |
| Option                                 | Type    | Description                                                                                                            | Required |
|----------------------------------------|---------|------------------------------------------------------------------------------------------------------------------------|----------|
| `api`                                  | string  | The API schema to use (`"openai-responses"`, `"openai-chat"`, or `"anthropic"`)                                          | Yes      |
| `url`                                  | string  | API URL (with support for env like `${env:MY_URL}`)                                                                      | No*      |
| `key`                                  | string  | API key (with support for `${env:MY_KEY}` or `{netrc:api.my-provider.com}`)                                              | No*      |
| `completionUrlRelativePath`            | string  | Optional override for the completion endpoint path (see defaults below and examples like Azure)                          | No       |
| `thinkTagStart`                        | string  | Optional override for the think start tag of the openai-chat api (Default: `"<think>"`)                                  | No       |
| `thinkTagEnd`                          | string  | Optional override for the think end tag of the openai-chat api (Default: `"</think>"`)                                   | No       |
| `httpClient`                           | map     | Allows customizing the http-client for this provider's requests, like changing the HTTP version                          | No       |
| `models`                               | map     | Key: model name, value: its config                                                                                       | Yes      |
| `models <model> extraPayload`          | map     | Extra payload sent in the body to the LLM                                                                                | No       |
| `models <model> modelName`             | string  | Overrides the model name, useful to have multiple models with different configs and names that use the same LLM model    | No       |
| `models <model> keepHistoryReasoning`  | boolean | Keep `reason` messages in the conversation history. Default: `false`                                                     | No       |
| `fetchModels`                          | boolean | Enable automatic model discovery from the `/models` endpoint (OpenAI-compatible providers)                               | No       |

_* `url` and `key` are looked up as the envs `<provider>_API_URL` and `<provider>_API_KEY`; either the env or the config value must be present for the provider to work._

Expand Down Expand Up @@ -120,6 +121,33 @@ Examples:

This way both will use the gpt-5 model, but one will override the reasoning effort to high instead of the default.

=== "History reasoning"
`keepHistoryReasoning` - Determines whether the model's internal reasoning chain is persisted in the conversation history for subsequent turns.

- **Standard Behavior**: Most models expect reasoning blocks (e.g., `<think>` tags or `reasoning_content`) to be removed in subsequent requests to save tokens and avoid bias.
- **Usage**: Enable this for models that explicitly support "preserved thinking," or if you want to experiment with letting the model see its previous thought process (with XML-based reasoning).
- **Example**: See [GLM-4.7 with Preserved thinking](https://docs.z.ai/guides/capabilities/thinking-mode#preserved-thinking).

```javascript title="~/.config/eca/config.json"
{
"providers": {
"z-ai": {
"api": "openai-chat",
"url": "https://api.z.ai/api/paas/v4/",
"key": "your-api-key",
"models": {
"GLM-4.7": {
"keepHistoryReasoning": true, // Preserves reasoning
"extraPayload": {"clear_thinking": false} // Preserved thinking (see https://docs.z.ai/guides/capabilities/thinking-mode#preserved-thinking)
}
}
}
}
}
```

Default: `false`.
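
To make the effect concrete, here is a minimal sketch of what reaches the LLM in each mode (message shapes taken from this PR's tests; the contents are illustrative):

```clojure
;; History right before the request for the second turn:
(def history
  [{:role "user"      :content "Q1"}
   {:role "reason"    :content {:text "thinking..."}}
   {:role "assistant" :content "A1"}
   {:role "user"      :content "Q2"}])

;; keepHistoryReasoning = false (default): reason messages before the last
;; user message are pruned, so the request history becomes:
;;   [{:role "user" :content "Q1"}
;;    {:role "assistant" :content "A1"}
;;    {:role "user" :content "Q2"}]

;; keepHistoryReasoning = true: the history is sent as-is, letting the model
;; see its previous thought process.
```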

=== "Dynamic model discovery"

For OpenAI-compatible providers, set `fetchModels: true` to automatically discover available models:
@@ -211,7 +239,7 @@ Notes:
3. Type the chosen method
4. Authenticate in your browser, copy the code.
5. Paste and send the code and done!

=== "Codex / Openai"

1. Log in to OpenAI via the chat command `/login`.
2 changes: 1 addition & 1 deletion integration-test/integration/chat/github_copilot_test.clj
@@ -168,7 +168,7 @@
(match-content chat-id "system" {:type "progress" :state "finished"})
(is (match?
{:input [{:role "user" :content [{:type "input_text" :text "hello!"}]}
{:role "assistant" :content [{:type "output_text" :text "<think>I should say hello</think>\nhello there!"}]}
{:role "assistant" :content [{:type "output_text" :text "hello there!"}]}
{:role "user" :content [{:type "input_text" :text "how are you?"}]}]
:instructions (m/pred string?)}
(llm.mocks/get-req-body :reasoning-1)))))))
2 changes: 1 addition & 1 deletion integration-test/integration/chat/google_test.clj
@@ -167,7 +167,7 @@
(match-content chat-id "system" {:type "progress" :state "finished"})
(is (match?
{:input [{:role "user" :content [{:type "input_text" :text "hello!"}]}
{:role "assistant" :content [{:type "output_text" :text "<thought>I should say hello</thought>\nhello there!"}]}
{:role "assistant" :content [{:type "output_text" :text "hello there!"}]}
{:role "user" :content [{:type "input_text" :text "how are you?"}]}]
:instructions (m/pred string?)}
(llm.mocks/get-req-body :reasoning-1)))))))
2 changes: 2 additions & 0 deletions src/eca/llm_api.clj
@@ -206,6 +206,7 @@
(let [url-relative-path (:completionUrlRelativePath provider-config)
think-tag-start (:thinkTagStart provider-config)
think-tag-end (:thinkTagEnd provider-config)
keep-history-reasoning (:keepHistoryReasoning model-config)
http-client (:httpClient provider-config)]
(handler
{:model real-model
@@ -221,6 +222,7 @@
:url-relative-path url-relative-path
:think-tag-start think-tag-start
:think-tag-end think-tag-end
:keep-history-reasoning keep-history-reasoning
:http-client http-client
:api-url api-url
:api-key api-key}
37 changes: 21 additions & 16 deletions src/eca/llm_providers/openai_chat.clj
@@ -384,19 +384,24 @@
(reset! reasoning-state* {:id nil :type nil :content "" :buffer ""})))

(defn ^:private prune-history
"Ensure DeepSeek-style reasoning_content is discarded from history but kept for the active turn.
Only drops 'reason' messages WITH :delta-reasoning? before the last user message.
Think-tag based reasoning (without :delta-reasoning?) is preserved and transformed to assistant messages."
[messages]
(if-let [last-user-idx (llm-util/find-last-user-msg-idx messages)]
(->> messages
(keep-indexed (fn [i m]
(when-not (and (= "reason" (:role m))
(get-in m [:content :delta-reasoning?])
(< i last-user-idx))
m)))
vec)
messages))
"Discard reasoning messages from history.
Reasoning with :delta-reasoning? is preserved within the same turn (as required by DeepSeek).
This matches the standard implementation behavior; however, it can be changed via the model-level configuration.
Parameters:
- messages: the conversation history
- keep-history-reasoning: if true, preserve all reasoning in history"
[messages keep-history-reasoning]
(if keep-history-reasoning
messages
(if-let [last-user-idx (llm-util/find-last-user-msg-idx messages)]
(->> messages
(keep-indexed (fn [i m]
(when-not (and (= "reason" (:role m))
(or (< i last-user-idx)
(not (get-in m [:content :delta-reasoning?]))))
m)))
vec)
messages)))

(defn chat-completion!
"Primary entry point for OpenAI chat completions with streaming support.
@@ -406,14 +411,14 @@
Compatible with OpenRouter and other OpenAI-compatible providers."
[{:keys [model user-messages instructions temperature api-key api-url url-relative-path
past-messages tools extra-payload extra-headers supports-image?
think-tag-start think-tag-end http-client]}
think-tag-start think-tag-end keep-history-reasoning http-client]}
{:keys [on-message-received on-error on-prepare-tool-call on-tools-called on-reason on-usage-updated] :as callbacks}]
(let [think-tag-start (or think-tag-start "<think>")
think-tag-end (or think-tag-end "</think>")
stream? (boolean callbacks)
system-messages (when instructions [{:role "system" :content instructions}])
;; Pipeline: prune history -> normalize -> merge adjacent assistants -> filter
all-messages (prune-history (vec (concat past-messages user-messages)))
all-messages (prune-history (vec (concat past-messages user-messages)) keep-history-reasoning)
messages (vec (concat
system-messages
(normalize-messages all-messages supports-image? think-tag-start think-tag-end)))
Expand Down Expand Up @@ -473,7 +478,7 @@
tool-calls))
on-tools-called-wrapper (fn on-tools-called-wrapper [tools-to-call on-tools-called handle-response]
(when-let [{:keys [new-messages]} (on-tools-called tools-to-call)]
(let [pruned-messages (prune-history new-messages)
(let [pruned-messages (prune-history new-messages keep-history-reasoning)
new-messages-list (vec (concat
system-messages
(normalize-messages pruned-messages supports-image? think-tag-start think-tag-end)))
31 changes: 24 additions & 7 deletions test/eca/llm_providers/openai_chat_test.clj
@@ -259,7 +259,7 @@
{:role "assistant" :reasoning_content "Thinking..."}])))))

(deftest prune-history-test
(testing "Drops reason messages WITH :delta-reasoning? before the last user message (DeepSeek)"
(testing "Drops all reason messages before the last user message by default"
(is (match?
[{:role "user" :content "Q1"}
{:role "assistant" :content "A1"}
@@ -272,28 +272,45 @@
{:role "assistant" :content "A1"}
{:role "user" :content "Q2"}
{:role "reason" :content {:text "r2" :delta-reasoning? true}}
{:role "assistant" :content "A2"}]))))
{:role "assistant" :content "A2"}]
false))))

(testing "Preserves reason messages WITHOUT :delta-reasoning? (think-tag based)"
(testing "Preserves reason messages (without :delta-reasoning?) before last user message"
(is (match?
[{:role "user" :content "Q1"}
{:role "reason" :content {:text "thinking..."}}
{:role "assistant" :content "A1"}
{:role "user" :content "Q2"}
{:role "reason" :content {:text "more thinking..."}}
{:role "assistant" :content "A2"}]
(#'llm-providers.openai-chat/prune-history
[{:role "user" :content "Q1"}
{:role "reason" :content {:text "thinking..."}}
{:role "assistant" :content "A1"}
{:role "user" :content "Q2"}
{:role "reason" :content {:text "more thinking..."}}
{:role "assistant" :content "A2"}]))))
{:role "assistant" :content "A2"}]
false))))

(testing "Preserves all reasoning when keep-history-reasoning is true (Bedrock)"
(is (match?
[{:role "user" :content "Q1"}
{:role "reason" :content {:text "r1"}}
{:role "assistant" :content "A1"}
{:role "user" :content "Q2"}
{:role "reason" :content {:text "r2"}}
{:role "assistant" :content "A2"}]
(#'llm-providers.openai-chat/prune-history
[{:role "user" :content "Q1"}
{:role "reason" :content {:text "r1"}}
{:role "assistant" :content "A1"}
{:role "user" :content "Q2"}
{:role "reason" :content {:text "r2"}}
{:role "assistant" :content "A2"}]
true))))

(testing "No user message leaves list unchanged"
(let [msgs [{:role "assistant" :content "A"}
{:role "reason" :content {:text "r"}}]]
(is (= msgs (#'llm-providers.openai-chat/prune-history msgs))))))
(is (= msgs (#'llm-providers.openai-chat/prune-history msgs false))))))

(deftest valid-message-test
(testing "Tool messages are always kept"