Open
Description
Running Qwen3 with the Ollama backend, I get the model's reasoning inline in the response content, wrapped in think tags:
curl http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "hf.co/unsloth/Qwen3-14B-GGUF:Q6_K_XL",
"messages": [{"role": "user", "content": "Hello!"}]
}'
{"id":"chatcmpl-184","object":"chat.completion","created":1747546347,"model":"hf.co/unsloth/Qwen3-14B-GGUF:Q6_K_XL","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"\u003cthink\u003e\nOkay, the user said \"Hello!\" so I need to respond appropriately. Let me check the guidelines. I should be friendly and offer help. Maybe ask how I can assist them today. Keep it simple and welcoming. Alright, let's go with that.\n\u003c/think\u003e\n\nHello! How can I assist you today? 😊"},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":68,"total_tokens":78}}
I fixed this by making the think tags configurable via environment variables:
THINK_TAG=\u003cthink\u003e
THINK_END_TAG=\u003c/think\u003e\n\n
Would you consider such a change? Maybe even "per model"?
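To make the proposal concrete, here is a rough sketch of the idea in Python. It is not the project's actual code: the function names (`decode_tag`, `load_think_tags`, `split_reasoning`) and the fallback defaults are hypothetical, and I am assuming the env values should be decoded as JSON-style escapes so that `\u003cthink\u003e` becomes `<think>` and `\n\n` becomes two newlines.

```python
import json
import os

# Assumed fallbacks when the env vars are unset (not taken from the project).
DEFAULT_OPEN = "<think>"
DEFAULT_CLOSE = "</think>"


def decode_tag(raw):
    # Interpret JSON-style escapes in an env value; if the value is not a
    # valid JSON string body, use it verbatim.
    try:
        return json.loads('"' + raw.replace('"', '\\"') + '"', strict=False)
    except ValueError:
        return raw


def load_think_tags(env=None):
    # Read THINK_TAG / THINK_END_TAG from the environment (or a dict).
    env = os.environ if env is None else env
    open_tag = decode_tag(env.get("THINK_TAG", DEFAULT_OPEN))
    close_tag = decode_tag(env.get("THINK_END_TAG", DEFAULT_CLOSE))
    return open_tag, close_tag


def split_reasoning(content, open_tag, close_tag):
    # Return (reasoning, answer). reasoning is None when the tags are absent.
    start = content.find(open_tag)
    if start == -1:
        return None, content
    end = content.find(close_tag, start + len(open_tag))
    if end == -1:
        return None, content
    reasoning = content[start + len(open_tag):end].strip()
    answer = content[:start] + content[end + len(close_tag):]
    return reasoning, answer
```

A per-model variant could keep a mapping from model name to tag pair and fall back to these env vars when a model has no entry.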
bold84