[BUG] GLM4.5 / Qwen3-Coder Tool Call Render Failure #392

@dinerburger

Description

OS

Linux

GPU Library

CUDA 12.x

Python version

3.12

Describe the bug

Sending tool calls down to a GLM4.5- or Qwen3-Coder-based model causes a render failure, because their default chat templates expect tool call arguments to be Mappings, but they're currently being sent down as strings. This makes sense: the OAI standard specifies that arguments are sent and represented as a string.

Here's the "problem" line in the Qwen3-Coder template, for example.

Transformers, and by extension SGLang, recommend converting this `arguments` field to a Mapping before kicking off the render. I expect the correct place to put this would be in endpoints/OAI/utils/chat_completion.py:243 or so. Happy to champion the feature, but I wanted to make sure we're in alignment on the proposed solution before I start working.
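
For illustration, here's a minimal sketch of the kind of conversion I have in mind; the function name and exact placement are hypothetical, not TabbyAPI's actual internals:

import json

def normalize_tool_call_arguments(messages: list[dict]) -> list[dict]:
    # Parse each tool call's `arguments` JSON string into a dict so that
    # templates like Qwen3-Coder's can iterate it with the `items` filter.
    for message in messages:
        for tool_call in message.get("tool_calls") or []:
            arguments = tool_call.get("function", {}).get("arguments")
            if isinstance(arguments, str):
                try:
                    tool_call["function"]["arguments"] = json.loads(arguments)
                except json.JSONDecodeError:
                    # Leave malformed argument strings untouched rather than
                    # failing the whole request.
                    pass
    return messages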

Reproduction steps

This can be most easily reproduced with the following curl command, specifying host, port and model name to match your configuration:

#!/bin/bash

curl http://localhost:19999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
  "model": "Qwen3-Coder-30B-A3B-Instruct",
  "tools": [
    {
      "type": "function", 
      "function": {
        "name": "tool_calculate_post", 
        "description": "Calculates/evaluates the given expression.", 
        "parameters": {
          "type": "object", 
          "properties": {
            "expression": {
              "type": "string", 
              "title": "Expression", 
              "description": ""
            }
          },
          "required": ["expression"]
        }
      }
    }
  ], 
  "messages": [
    {
      "role": "user", 
      "content": "Use the calculator tool to compute `4 x 90` and provide a nice little answer block with the result."
    }, 
    {
      "role": "assistant",
      "content": "", 
      "tool_calls": [
        {
          "index": 0, 
          "id": "632619312", 
          "type": "function", 
          "function": {
            "name": "tool_calculate_post", 
            "arguments": "{\"expression\": \"4 x 90\"}"
          }
        }
      ]
    }, 
    {
      "role": "tool", 
      "tool_call_id": "632619312", 
      "content": "360"
    }
  ]
}'

Expected behavior

It shouldn't just barf and abort the completion with a 500; ideally it would render the tool call result properly.

Logs

2025-10-24 12:03:22.568 INFO:     Config file override detected in args.
2025-10-24 12:03:22.589 INFO:     The 'config.yml' file cannot be found
2025-10-24 12:03:22.596 INFO:     Using backend exllamav3
2025-10-24 12:03:22.597 INFO:     exllamav3 version: 0.0.11
2025-10-24 12:03:23.560 WARNING:  The provided model does not have vision capabilities that are supported by ExllamaV3. Vision input is disabled.
2025-10-24 12:03:23.561 WARNING:  Draft model is disabled because a model name wasn't provided. Please check your config.yml!
2025-10-24 12:03:23.562 INFO:     Attempting to load a prompt template if present.
2025-10-24 12:03:23.589 INFO:     Using template "chat_template" for chat completions.
2025-10-24 12:03:23.591 INFO:     Loading model: /mnt/blaststorage/models/textgen/Kwaipilot_KAT-Dev-EXL3
2025-10-24 12:03:23.592 INFO:     Loading with autosplit
2025-10-24 12:03:27.613 INFO:     Model successfully loaded.
Loading model modules ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 67/67 0:00:00
2025-10-24 12:03:27.616 WARNING:  Disabling authentication makes your instance vulnerable. Set the `disable_auth` flag to False in config.yml if you want to share 
this instance with others.
2025-10-24 12:03:27.617 INFO:     Generation logging is enabled for: prompts, generation params
2025-10-24 12:03:27.617 INFO:     Developer documentation: http://127.0.0.1:19999/redoc
2025-10-24 12:03:27.622 INFO:     Starting OAI API
2025-10-24 12:03:27.622 INFO:     Completions: http://127.0.0.1:19999/v1/completions
2025-10-24 12:03:27.622 INFO:     Chat completions: http://127.0.0.1:19999/v1/chat/completions
2025-10-24 12:03:27.653 INFO:     Started server process [3420055]
2025-10-24 12:03:27.654 INFO:     Waiting for application startup.
2025-10-24 12:03:27.655 INFO:     Application startup complete.
2025-10-24 12:03:27.656 INFO:     Uvicorn running on http://127.0.0.1:19999 (Press CTRL+C to quit)
2025-10-24 12:03:53.991 INFO:     Information for POST request 60208493d8594feb9facd94529f08b4e:
2025-10-24 12:03:53.991 INFO:     URL: http://localhost:19999/v1/chat/completions
2025-10-24 12:03:53.991 INFO:     Headers: {'host': 'localhost:19999', 'user-agent': 'curl/8.5.0', 'accept': '*/*', 'content-type': 'application/json', 
'authorization': 'Bearer ', 'content-length': '1116'}
2025-10-24 12:03:53.991 INFO:     Body: {'model': 'Kwaipilot-KAT-Dev', 'tools': [{'type': 'function', 'function': {'name': 'tool_calculate_post', 'description': 
'Calculates/evaluates the given expression.', 'parameters': {'type': 'object', 'properties': {'expression': {'type': 'string', 'title': 'Expression', 'description': 
''}}, 'required': ['expression']}}}], 'messages': [{'role': 'user', 'content': 'Use the calculator tool to compute `4 x 90` and provide a nice little answer block 
with the result.'}, {'role': 'assistant', 'content': '', 'tool_calls': [{'index': 0, 'id': '632619312', 'type': 'function', 'function': {'name': 
'tool_calculate_post', 'arguments': '{"expression": "4 x 90"}'}}]}, {'role': 'tool', 'tool_call_id': '632619312', 'content': 360}]}
2025-10-24 12:03:53.995 WARNING:  Unable to switch model to Kwaipilot-KAT-Dev because "inline_model_loading" is not True in config.yml.
2025-10-24 12:03:53.998 INFO:     127.0.0.1:45888 - "POST /v1/chat/completions HTTP/1.1" 500
2025-10-24 12:03:54.003 ERROR:    Exception in ASGI application
2025-10-24 12:03:54.003 ERROR:    Traceback (most recent call last):
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in 
run_asgi
2025-10-24 12:03:54.003 ERROR:        result = await app(  # type: ignore[func-returns-value]
2025-10-24 12:03:54.003 ERROR:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:            self.scope, self.receive, self.send
2025-10-24 12:03:54.003 ERROR:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:        )
2025-10-24 12:03:54.003 ERROR:        ^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2025-10-24 12:03:54.003 ERROR:        return await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/applications.py", line 1133, in __call__
2025-10-24 12:03:54.003 ERROR:        await super().__call__(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/applications.py", line 113, in __call__
2025-10-24 12:03:54.003 ERROR:        await self.middleware_stack(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 186, in __call__
2025-10-24 12:03:54.003 ERROR:        raise exc
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/errors.py", line 164, in __call__
2025-10-24 12:03:54.003 ERROR:        await self.app(scope, receive, _send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/cors.py", line 85, in __call__
2025-10-24 12:03:54.003 ERROR:        await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/middleware/exceptions.py", line 63, in __call__
2025-10-24 12:03:54.003 ERROR:        await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-10-24 12:03:54.003 ERROR:        raise exc
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2025-10-24 12:03:54.003 ERROR:        await app(scope, receive, sender)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
2025-10-24 12:03:54.003 ERROR:        await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 716, in __call__
2025-10-24 12:03:54.003 ERROR:        await self.middleware_stack(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 736, in app
2025-10-24 12:03:54.003 ERROR:        await route.handle(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/routing.py", line 290, in handle
2025-10-24 12:03:54.003 ERROR:        await self.app(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 123, in app
2025-10-24 12:03:54.003 ERROR:        await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
2025-10-24 12:03:54.003 ERROR:        raise exc
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
2025-10-24 12:03:54.003 ERROR:        await app(scope, receive, sender)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 109, in app
2025-10-24 12:03:54.003 ERROR:        response = await f(request)
2025-10-24 12:03:54.003 ERROR:                   ^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 389, in app
2025-10-24 12:03:54.003 ERROR:        raw_response = await run_endpoint_function(
2025-10-24 12:03:54.003 ERROR:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:        ...<3 lines>...
2025-10-24 12:03:54.003 ERROR:        )
2025-10-24 12:03:54.003 ERROR:        ^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/fastapi/routing.py", line 288, in run_endpoint_function
2025-10-24 12:03:54.003 ERROR:        return await dependant.call(**values)
2025-10-24 12:03:54.003 ERROR:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/router.py", line 126, in chat_completion_request
2025-10-24 12:03:54.003 ERROR:        prompt, embeddings = await apply_chat_template(data)
2025-10-24 12:03:54.003 ERROR:                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/utils/chat_completion.py", line 267, in apply_chat_template
2025-10-24 12:03:54.003 ERROR:        prompt, mm_embeddings, template_vars = await format_messages_with_template(
2025-10-24 12:03:54.003 ERROR:                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:            data.messages, data.template_vars
2025-10-24 12:03:54.003 ERROR:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:        )
2025-10-24 12:03:54.003 ERROR:        ^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/endpoints/OAI/utils/chat_completion.py", line 245, in format_messages_with_template
2025-10-24 12:03:54.003 ERROR:        prompt = await model.container.prompt_template.render(template_vars)
2025-10-24 12:03:54.003 ERROR:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/common/templating.py", line 92, in render
2025-10-24 12:03:54.003 ERROR:        rendered_template = await self.template.render_async(**template_vars)
2025-10-24 12:03:54.003 ERROR:                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/environment.py", line 1318, in render_async
2025-10-24 12:03:54.003 ERROR:        return self.environment.handle_exception()
2025-10-24 12:03:54.003 ERROR:               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/environment.py", line 942, in handle_exception
2025-10-24 12:03:54.003 ERROR:        raise rewrite_traceback_stack(source=source)
2025-10-24 12:03:54.003 ERROR:      File "<template>", line 88, in top-level template code
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/async_utils.py", line 82, in __anext__
2025-10-24 12:03:54.003 ERROR:        return next(self._iterator)
2025-10-24 12:03:54.003 ERROR:      File "/home/dinerburger/tabbyAPI-exl3/.venv/lib/python3.13/site-packages/jinja2/filters.py", line 249, in do_items
2025-10-24 12:03:54.003 ERROR:        raise TypeError("Can only get item pairs from a mapping.")
2025-10-24 12:03:54.003 ERROR:    TypeError: Can only get item pairs from a mapping.
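
For reference, the failing do_items call can be reproduced outside TabbyAPI with a few lines of Jinja; the template string below is an illustrative stand-in for the construct the Qwen3-Coder template uses, not a verbatim copy of it:

from jinja2 import Environment

env = Environment()
# Stand-in for the template construct that iterates tool call arguments.
template = env.from_string(
    "{% for name, value in arguments | items %}{{ name }}={{ value }}{% endfor %}"
)

# Renders fine when arguments is a mapping...
print(template.render(arguments={"expression": "4 x 90"}))

# ...but raises TypeError("Can only get item pairs from a mapping.")
# when arguments is the JSON string the OAI schema uses.
print(template.render(arguments='{"expression": "4 x 90"}'))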

Additional context

An easy end-run on this would be to add a fromjson Jinja filter, à la this rejected Transformers PR. TabbyAPI already maintains a collection of its own Jinja prompts, so having custom glm4.5 and qwen3-coder templates that make use of fromjson isn't impossible, but keeping TabbyAPI in line with the larger Transformers ecosystem also sounds like a win.
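
If that route were taken, the filter itself is tiny; here's a sketch of what defining and registering it might look like (the environment construction below is illustrative, not copied from common/templating.py):

import json
from jinja2 import Environment

def fromjson(value: str):
    # Parse a JSON string into Python objects so a template can write
    # `tool_call.function.arguments | fromjson | items`.
    return json.loads(value)

# Hypothetical registration against the Jinja environment used for rendering.
env = Environment(enable_async=True)
env.filters["fromjson"] = fromjson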

Acknowledgements

  • I have looked for similar issues before submitting this one.
  • I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will ask my questions politely.
