VLM chat application via python jinja #4235

Open
dkalinowski wants to merge 5 commits into main from vlm-python-jinja

Conversation

@dkalinowski
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings May 22, 2026 11:52
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Visual Language Model (VLM) servables to support applying chat templates via a Python/Jinja processor (when Python support is enabled), including injecting <ov_genai_image_*> tags into the request JSON before template rendering, and adds exception handling around the C++ tokenizer chat-template path.

Changes:

  • Added RapidJSON-based rewriting of the request JSON to prepend image tags into messages[*].content before calling PyJinjaTemplateProcessor::applyChatTemplate (Python-enabled builds).
  • Wrapped tokenizer.apply_chat_template(...) in try/catch and improved error handling for invalid/missing chat templates (Python-disabled builds).
  • Added validation that the final prompt after template application is non-empty.
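
The error-handling flow described in the last two items (catch exceptions from template application, then reject an empty prompt) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `PromptResult`, `renderPromptSafely`, and the error strings are hypothetical names, and the template call is passed in as a callable so the sketch does not depend on the tokenizer API.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical result type; the PR's actual status type is not shown here.
struct PromptResult {
    bool ok;
    std::string value;  // the prompt on success, an error message on failure
};

// Runs a template-rendering callable and converts both exceptions and an
// empty result into an error value instead of letting them propagate.
PromptResult renderPromptSafely(const std::function<std::string()>& applyTemplate) {
    assert(applyTemplate && "applyTemplate must hold a callable");
    std::string prompt;
    try {
        prompt = applyTemplate();
    } catch (const std::exception& e) {
        return {false, std::string("Failed to apply chat template: ") + e.what()};
    } catch (...) {
        return {false, "Failed to apply chat template: unknown error"};
    }
    // Validate that the final prompt is non-empty after template application.
    if (prompt.empty()) {
        return {false, "Final prompt after applying chat template is empty"};
    }
    return {true, prompt};
}
```

In this sketch a caller would wrap the real template call in a lambda and map a failed `PromptResult` to whatever error status the servable returns.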

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
--- | ---
src/llm/visual_language_model/legacy/servable.cpp | Adds Python/Jinja chat-template path with image-tag injection and improves error handling for chat-template application.
src/llm/visual_language_model/continuous_batching/servable.cpp | Mirrors the Python/Jinja chat-template path and image-tag injection for continuous batching, plus exception handling around tokenizer template application.
Comments suppressed due to low confidence (2)

src/llm/visual_language_model/legacy/servable.cpp:322

  • msg.HasMember("content") also asserts if the messages[chatTurnIndex] element is not an object. Guard with msg.IsObject() (or use msg.GetObject().FindMember) before HasMember to avoid crashes on unexpected request shapes.
                    if (chatTurnIndex < messages.Size()) {
                        auto& msg = messages[chatTurnIndex];
                        if (msg.HasMember("content") && msg["content"].IsString()) {
                            std::string newContent = imageTagString + msg["content"].GetString();
                            msg["content"].SetString(newContent.c_str(), newContent.length(), jsonDoc.GetAllocator());
                        }

src/llm/visual_language_model/continuous_batching/servable.cpp:126

  • msg.HasMember("content") asserts if messages[chatTurnIndex] is not an object. Guard with msg.IsObject() (or use msg.GetObject().FindMember) before HasMember to avoid crashes on unexpected request shapes.
                    if (chatTurnIndex < messages.Size()) {
                        auto& msg = messages[chatTurnIndex];
                        if (msg.HasMember("content") && msg["content"].IsString()) {
                            std::string newContent = imageTagString + msg["content"].GetString();
                            msg["content"].SetString(newContent.c_str(), newContent.length(), jsonDoc.GetAllocator());
                        }

Comment on lines +314 to +319
if (!jsonDoc.HasParseError() && jsonDoc.HasMember("messages") && jsonDoc["messages"].IsArray()) {
    auto& messages = jsonDoc["messages"];
    for (const auto& [chatTurnIndex, imageTagString] : imageTags) {
        if (chatTurnIndex < messages.Size()) {
            auto& msg = messages[chatTurnIndex];
            if (msg.HasMember("content") && msg["content"].IsString()) {
Comment on lines +118 to +123
if (!jsonDoc.HasParseError() && jsonDoc.HasMember("messages") && jsonDoc["messages"].IsArray()) {
    auto& messages = jsonDoc["messages"];
    for (const auto& [chatTurnIndex, imageTagString] : imageTags) {
        if (chatTurnIndex < messages.Size()) {
            auto& msg = messages[chatTurnIndex];
            if (msg.HasMember("content") && msg["content"].IsString()) {
Comment on lines +25 to +34
#pragma warning(push)
#pragma warning(disable : 4005 4309 6001 6385 6386 6326 6011 4005 4456 6246)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#include <rapidjson/document.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
#pragma GCC diagnostic pop
#pragma warning(pop)
