
Conversation

@zikajk (Member) commented Jan 15, 2026

Discard reasoning before the last user message (based on the OpenAI SDK behavior).
Model-level config enables preservation (e.g. GLM-4.7 with preserved thinking).

Changes:

  • prune-history: add keep-history-reasoning parameter

  • Tests: cover both pruning and preservation

  • Docs: add keepHistoryReasoning to model schema

  • Changelog: add an entry under the unreleased section

@ericdallo Subjectively, the behavior seems better now. Since it's difficult to evaluate, I went through the OpenAI SDK and Opencode (using Opus), which found that they discard it as well, and I couldn't find any documentation suggesting we should do otherwise by default.
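
For illustration, here is a minimal Clojure sketch of the pruning rule described above. It is not the actual prune-history implementation; the message shape and role keywords are assumptions.

```clojure
;; Illustrative sketch only: the real prune-history implementation, message
;; shape, and role keywords in this repo may differ.
(defn prune-history-reasoning
  "Drops reasoning messages that appear before the last user message,
   unless the model opts in via :keep-history-reasoning."
  [messages {:keys [keep-history-reasoning]}]
  (if keep-history-reasoning
    messages
    (let [last-user-idx (or (last (keep-indexed
                                    (fn [i m] (when (= :user (:role m)) i))
                                    messages))
                            -1)]
      (->> messages
           (keep-indexed (fn [i m]
                           (when-not (and (< i last-user-idx)
                                          (= :reasoning (:role m)))
                             m)))
           vec))))
```

With keep-history-reasoning unset (the default), a history like [user, reasoning, assistant, user, reasoning, assistant] becomes [user, assistant, user, reasoning, assistant]: only the reasoning after the last user message survives.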

@zikajk force-pushed the configurable-reasoning-history branch from a94e839 to 1be0ff3 on January 15, 2026 at 15:30

## Unreleased

- (OpenAI Chat) - Configurable reasoning history via `keepHistoryReasoning` (model-level, default: prune)
Member:

I wonder if this default for all openai-chat models is a good idea, or if we should have it enabled only for the google provider in config.clj. Also, shouldn't this be a flag per provider rather than per model?

@zikajk (Member Author) commented Jan 15, 2026:

Based on what I've found in Opencode and the OpenAI SDK, it is the default for every model.

  1. OpenCode transforms <think> blocks into messages of type reasoning, and this type is ignored by the OpenAI SDK and never reaches the LLM.
  2. Delta reasoning messages are usually persisted for one turn and then thrown away (as you can see in the GLM docs below, first vs second diagram).

It shouldn't be per provider, because you might want to configure it only for e.g. GLM-4.7 (second diagram here: https://docs.z.ai/guides/capabilities/thinking-mode).
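
As a rough illustration of that point (the config shape, provider name, and key names below are assumptions, not the actual config.clj contents), a per-model opt-in could look roughly like this:

```clojure
;; Hypothetical shape only: provider/model nesting and names are assumed.
;; The flag lives on the model, so GLM-4.7 can preserve its reasoning while
;; other models on the same provider keep the default pruning behavior.
{:providers
 {"zai" {:models {"glm-4.7"          {:keep-history-reasoning true}
                  "some-other-model" {}}}}}   ;; no flag => reasoning pruned
```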

Member:

OK, let's keep an eye on whether this affects any openai-chat model. I still think it's a little bit of a dangerous move, but if the OpenAI SDK is really doing that, it may be OK.

@zikajk (Member Author) commented Jan 15, 2026

@ericdallo If you approve, I will merge it tomorrow after fixing the integration tests.

docs/models.md Outdated

=== "History reasoning"

`keepHistoryReasoning` preserves reasoning in conversation history. Set for specific models:
Member:

Can we elaborate that some models require that? Most people won't understand what this means or why one may (or may not) need to use it.

Member Author:

I am not sure if there is any model that requires it by default.
I mean:

  1. No model requires reasoning to be returned in <think> tags (but users can experiment and see whether any model benefits from it).
  2. It looks like GLM-4.7 with "clear_thinking": False might benefit from it, while some other models just ignore it. For example, Deepseek:

> When the next user question begins (Turn 2.1), the previous reasoning_content should be removed, while keeping other elements to send to the API. If reasoning_content is retained and sent to the API, the API will ignore it.

In Opencode you can't even configure this, so I would say people might care about this only in very special scenarios where they already understand what it is about.

TL;DR: I wouldn't say that any model requires it, but maybe there is room to improve the description.

Member:

I mean, require this to be false. Imagine a user checking the docs and seeing that flag; it's not clear enough why one would want to set it to true.

@zikajk (Member Author) commented Jan 16, 2026:

I see, it is not a hard requirement; you can experiment with it.
<think> blocks returned as assistant messages will influence model behavior, and I can imagine scenarios where it is worth experimenting with that, while reasoning_content may or may not be dropped.

I am trying to come up with a nice short sentence that explains it better than the current one.
What about this?

Persists the model's internal reasoning chain within the chat context for future turns. Set for specific model configuration if needed.

Member:

Looks good. I think it's important to mention something like: "Some models don't work when reasoning is sent back."

Member Author:

What are those models though?

Member Author:

I will add a link to the GLM config there and explain the motivation behind false.

Member:

Well, Gemini, right? They won't work if it's true?

Member Author:

No, in the case of Gemini it was specifically about XML thinking, so there is probably no model that would not support it. But maybe this change will help it break less (or not at all). It's hard to say; it's relatively rare, and evaluating it is not free, hehe.

@zikajk changed the title from "add configurable reasoning preservation for openai completions" to "Add configurable reasoning preservation for openai completions" on Jan 17, 2026
@zikajk changed the title from "Add configurable reasoning preservation for openai completions" to "[DRAFT] Add configurable reasoning preservation for openai completions" on Jan 17, 2026