Description
OS
Linux
GPU Library
CUDA 12.x
Python version
3.12
Describe the bug
Sending tool calls down to GLM4.5 or Qwen3-Coder based models causes a render failure: their default chat templates expect tool call arguments to be Mappings, but they're currently being passed through as strings. That's understandable, since the OAI standard specifies that arguments are sent and represented as a JSON-encoded string.
Here's the "problem" line in the Qwen3-Coder template, for example.
Transformers, and by extension SGLang, recommend converting this arguments field to a Mapping before kicking off the render. I expect the correct place to do this would be around endpoints/OAI/utils/chat_completion.py:243. Happy to champion the feature, but I wanted to make sure we're in alignment on the proposed solution before I get started.
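For concreteness, here's a minimal sketch of what that conversion could look like. The helper name is hypothetical and the exact hook point in chat_completion.py may differ (the real message objects may be Pydantic models rather than dicts); the idea is just to json.loads any string-typed arguments before the Jinja render:

import json
from typing import Any

def unpack_tool_call_arguments(messages: list[dict[str, Any]]) -> None:
    # OAI clients send function.arguments as a JSON-encoded string, but
    # templates like Qwen3-Coder's iterate over it with |items, which
    # requires a mapping. Decode in place before rendering.
    for message in messages:
        for tool_call in message.get("tool_calls") or []:
            function = tool_call.get("function", {})
            arguments = function.get("arguments")
            if isinstance(arguments, str):
                try:
                    function["arguments"] = json.loads(arguments)
                except json.JSONDecodeError:
                    pass  # leave malformed arguments untouched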
Reproduction steps
This can be most easily reproduced with the following curl command, specifying host, port and model name to match your configuration:
#!/bin/bash
curl http://localhost:19999/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $YOUR_API_KEY" \
-d '{
"model": "Qwen3-Coder-30B-A3B-Instruct",
"tools": [
{
"type": "function",
"function": {
"name": "tool_calculate_post",
"description": "Calculates/evaluates the given expression.",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"title": "Expression",
"description": ""
}
},
"required": ["expression"]
}
}
}
],
"messages": [
{
"role": "user",
"content": "Use the calculator tool to compute `4 x 90` and provide a nice little answer block with the result."
},
{
"role": "assistant",
"content": "",
"tool_calls": [
{
"index": 0,
"id": "632619312",
"type": "function",
"function": {
"name": "tool_calculate_post",
"arguments": "{\"expression\": \"4 x 90\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "632619312",
"content": "360"
}
]
}'
Expected behavior
It shouldn't just barf and abort the completion with a 500. Ideally the template renders and the tool call result is used in the response as expected.
Logs
2025-10-24 12:03:22.568 INFO: Config file override detected in args.
2025-10-24 12:03:22.589 INFO: The 'config.yml' file cannot be found
2025-10-24 12:03:22.596 INFO: Using backend exllamav3
2025-10-24 12:03:22.597 INFO: exllamav3 version: 0.0.11
2025-10-24 12:03:23.560 WARNING: The provided model does not have vision capabilities that are supported by ExllamaV3. Vision input is disabled.
2025-10-24 12:03:23.561 WARNING: Draft model is disabled because a model name wasn't provided. Please check your config.yml!
2025-10-24 12:03:23.562 INFO: Attempting to load a prompt template if present.
2025-10-24 12:03:23.589 INFO: Using template "chat_template" for chat completions.
2025-10-24 12:03:23.591 INFO: Loading model: /mnt/blaststorage/models/textgen/Kwaipilot_KAT-Dev-EXL3
2025-10-24 12:03:23.592 INFO: Loading with autosplit
2025-10-24 12:03:27.613 INFO: Model successfully loaded.
Loading model modules ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 67/67 0:00:00
2025-10-24 12:03:27.616 WARNING: Disabling authentication makes your instance vulnerable. Set the `disable_auth` flag to False in config.yml if you want to share
this instance with others.
2025-10-24 12:03:27.617 INFO: Generation logging is enabled for: prompts, generation params
2025-10-24 12:03:27.617 INFO: Developer documentation: http://127.0.0.1:19999/redoc
2025-10-24 12:03:27.622 INFO: Starting OAI API
2025-10-24 12:03:27.622 INFO: Completions: http://127.0.0.1:19999/v1/completions
2025-10-24 12:03:27.622 INFO: Chat completions: http://127.0.0.1:19999/v1/chat/completions
2025-10-24 12:03:27.653 INFO: Started server process [3420055]
2025-10-24 12:03:27.654 INFO: Waiting for application startup.
2025-10-24 12:03:27.655 INFO: Application startup complete.
2025-10-24 12:03:27.656 INFO: Uvicorn running on http://127.0.0.1:19999 (Press CTRL+C to quit)
2025-10-24 12:03:53.991 INFO: Information for POST request 60208493d8594feb9facd94529f08b4e:
2025-10-24 12:03:53.991 INFO: URL: http://localhost:19999/v1/chat/completions
2025-10-24 12:03:53.991 INFO: Headers: {'host': 'localhost:19999', 'user-agent': 'curl/8.5.0', 'accept': '*/*', 'content-type': 'application/json',
'authorization': 'Bearer ', 'content-length': '1116'}
2025-10-24 12:03:53.991 INFO: Body: {'model': 'Kwaipilot-KAT-Dev', 'tools': [{'type': 'function', 'function': {'name': 'tool_calculate_post', 'description':
'Calculates/evaluates the given expression.', 'parameters': {'type': 'object', 'properties': {'expression': {'type': 'string', 'title': 'Expression', 'description':
''}}, 'required': ['expression']}}}], 'messages': [{'role': 'user', 'content': 'Use the calculator tool to compute `4 x 90` and provide a nice little answer block
with the result.'}, {'role': 'assistant', 'content': '', 'tool_calls': [{'index': 0, 'id': '632619312', 'type': 'function', 'function': {'name':
'tool_calculate_post', 'arguments': '{"expression": "4 x 90"}'}}]}, {'role': 'tool', 'tool_call_id': '632619312', 'content': 360}]}
2025-10-24 12:03:53.995 WARNING: Unable to switch model to Kwaipilot-KAT-Dev because "inline_model_loading" is not True in config.yml.
2025-10-24 12:03:53.998 INFO: 127.0.0.1:45888 - "POST /v1/chat/completions HTTP/1.1" 500
2025-10-24 12:03:54.003 ERROR: Exception in ASGI application
2025-10-24 12:03:54.003 ERROR: Traceback (most recent call last):
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in
run_asgi
2025-10-24 12:03:54.003 ERROR: result = await app( # type: ignore[func-returns-value]
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: self.scope, self.receive, self.send
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: )
2025-10-24 12:03:54.003 ERROR: ^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2025-10-24 12:03:54.003 ERROR: return await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/applications.py", line 1133, in __call__
2025-10-24 12:03:54.003 ERROR: await super().__call__(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/applications.py", line 113, in __call__
2025-10-24 12:03:54.003 ERROR: await self.middleware_stack(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 186, in __call__
2025-10-24 12:03:54.003 ERROR: raise exc
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
2025-10-24 12:03:54.003 ERROR: await self.app(scope, receive, _send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/cors.py", line 85, in __call__
2025-10-24 12:03:54.003 ERROR: await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
2025-10-24 12:03:54.003 ERROR: await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-10-24 12:03:54.003 ERROR: raise exc
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2025-10-24 12:03:54.003 ERROR: await app(scope, receive, sender)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
2025-10-24 12:03:54.003 ERROR: await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 716, in __call__
2025-10-24 12:03:54.003 ERROR: await self.middleware_stack(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 736, in app
2025-10-24 12:03:54.003 ERROR: await route.handle(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 290, in handle
2025-10-24 12:03:54.003 ERROR: await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 123, in app
2025-10-24 12:03:54.003 ERROR: await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-10-24 12:03:54.003 ERROR: raise exc
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2025-10-24 12:03:54.003 ERROR: await app(scope, receive, sender)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 109, in app
2025-10-24 12:03:54.003 ERROR: response = await f(request)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 389, in app
2025-10-24 12:03:54.003 ERROR: raw_response = await run_endpoint_function(
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: ...\<3 lines>...
2025-10-24 12:03:54.003 ERROR: )
2025-10-24 12:03:54.003 ERROR: ^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 288, in run_endpoint_function
2025-10-24 12:03:54.003 ERROR: return await dependant.call(**values)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/router.py", line 126, in chat_completion_request
2025-10-24 12:03:54.003 ERROR: prompt, embeddings = await apply_chat_template(data)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/utils/chat_completion.py", line 267, in apply_chat_template
2025-10-24 12:03:54.003 ERROR: prompt, mm_embeddings, template_vars = await format_messages_with_template(
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: data.messages, data.template_vars
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: )
2025-10-24 12:03:54.003 ERROR: ^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/utils/chat_completion.py", line 245, in format_messages_with_template
2025-10-24 12:03:54.003 ERROR: prompt = await model.container.prompt_template.render(template_vars)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/common/templating.py", line 92, in render
2025-10-24 12:03:54.003 ERROR: rendered_template = await self.template.render_async(**template_vars)
2025-10-24 12:03:54.003 ERROR: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/environment.py", line 1318, in render_async
2025-10-24 12:03:54.003 ERROR: return self.environment.handle_exception()
2025-10-24 12:03:54.003 ERROR: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/environment.py", line 942, in handle_exception
2025-10-24 12:03:54.003 ERROR: raise rewrite_traceback_stack(source=source)
2025-10-24 12:03:54.003 ERROR: File "<template>", line 88, in top-level template code
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/async_utils.py", line 82, in __anext__
2025-10-24 12:03:54.003 ERROR: return next(self._iterator)
2025-10-24 12:03:54.003 ERROR: File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/filters.py", line 249, in do_items
2025-10-24 12:03:54.003 ERROR: raise TypeError("Can only get item pairs from a mapping.")
2025-10-24 12:03:54.003 ERROR: TypeError: Can only get item pairs from a mapping.
Additional context
An easy end-run around this would be to add a fromjson Jinja filter, à la this rejected Transformers PR. TabbyAPI already maintains a collection of its own Jinja prompts, so shipping custom GLM4.5 and Qwen3-Coder templates that use fromjson isn't impossible, but keeping TabbyAPI in line with the larger Transformers ecosystem also sounds like a win.
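If the filter route were taken instead, the wiring could be as small as the sketch below. This assumes the filter gets registered wherever TabbyAPI builds its Jinja environment (common/templating.py, going by the traceback); the environment construction shown here is illustrative, not the project's actual one:

import json
from jinja2 import Environment

# Hypothetical: expose a fromjson filter so templates can write
# {{ tool_call.function.arguments | fromjson }} and get a mapping back.
env = Environment(enable_async=True)
env.filters["fromjson"] = json.loads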
Acknowledgements
- I have looked for similar issues before submitting this one.
- I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
- I understand that the developers have lives and my issue will be answered when possible.
- I understand the developers of this program are human, and I will ask my questions politely.