Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions demos/c_api_minimal_app/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -25,13 +25,13 @@ BASE_OS ?= ubuntu24

ifeq ($(BASE_OS),ubuntu24)
BASE_OS_TAG_UBUNTU ?= 24.04
PACKAGE_URL ?="https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_ubuntu24_2026.1.0_python_off.tar.gz"
PACKAGE_URL ?="https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_ubuntu24_2026.2.0_python_off.tar.gz"
BASE_IMAGE ?= ubuntu:$(BASE_OS_TAG_UBUNTU)
DIST_OS=ubuntu
endif
ifeq ($(BASE_OS),redhat)
BASE_OS_TAG_REDHAT ?= 9.6
PACKAGE_URL ="https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_redhat_2026.1.0_python_off.tar.gz"
PACKAGE_URL ="https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_redhat_2026.2.0_python_off.tar.gz"
BASE_IMAGE ?= registry.access.redhat.com/ubi9/ubi:$(BASE_OS_TAG_REDHAT)
DIST_OS=redhat
endif
Expand Down
1 change: 1 addition & 0 deletions demos/continuous_batching/agentic_ai/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
mcp_weather_server
70 changes: 70 additions & 0 deletions demos/continuous_batching/agentic_ai/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,28 @@ Exemplary output:
The current weather in Tokyo is Overcast with a temperature of 9.4°C (feels like 6.4°C), relative humidity at 42%, and dew point at -2.9°C. The wind is blowing from the northeast at 3.6 km/h with gusts up to 24.8 km/h. The atmospheric pressure is 1018.9 hPa with 84% cloud cover. Visibility is 24.1 km.
```
:::
:::{tab-item} Qwen3.6-35B-A3B
:sync: Qwen3.6-35B-A3B
Vision Language MoE model (35B total / 3B active parameters). Requires OpenVINO 2026.2 or newer and a GPU with sufficient memory to fit the INT4 weights. Tested on PantherLake iGPU with 32GB RAM with iGPU allocation increase and B70 dGPU.

Pull and start OVMS:
```bat
ovms.exe --rest_port 8000 --source_model OpenVINO/Qwen3.6-35B-A3B-int4-ov --model_repository_path c:\models --reasoning_parser qwen3 --tool_parser qwen3coder --target_device GPU --task text_generation --cache_dir .cache --allowed_media_domains raw.githubusercontent.com
```

Use MCP server, with additional image of Gdańsk old town. VLM model deduces location and calls `get_weather` tool to summarize the weather conditions in the city.

```{image} https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg
:alt: poland
:width: 360px
```

> **Note**: Image source: [Link](https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg)

```bat
python openai_agent.py --query "What is the current weather in location depicted in the image?" --image https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2026/1/demos/continuous_batching/agentic_ai/photo.jpeg --model OpenVINO/Qwen3.6-35B-A3B-int4-ov --base-url http://localhost:8000/v3 --mcp-server-url http://localhost:8080/sse --mcp-server weather
```
:::
:::{tab-item} gpt-oss-20b
:sync: gpt-oss-20b
Pull and start OVMS:
Expand Down Expand Up @@ -283,6 +305,30 @@ Exemplary output:
The current weather in Tokyo is overcast with a temperature of 9.4°C (feels like 6.4°C). The relative humidity is 42%, and the dew point is -2.9°C. Wind is blowing from the northeast at 3.6 km/h, with gusts up to 24.8 km/h. The atmospheric pressure is 1018.9 hPa, and there is 84% cloud cover. Visibility is 24.1 km.
```
:::
:::{tab-item} Qwen3.6-35B-A3B
:sync: Qwen3.6-35B-A3B
Vision Language MoE model (35B total / 3B active parameters). Requires OpenVINO 2026.2 or newer and enough host memory to fit the INT4 weights. Tested on PantherLake iGPU with 32GB RAM with iGPU allocation increase and B70 dGPU.

Pull and start OVMS:
```bash
mkdir -p ${HOME}/models
docker run -d --user $(id -u):$(id -g) --rm -p 8000:8000 -v ${HOME}/models:/models openvino/model_server:weekly \
--rest_port 8000 --source_model OpenVINO/Qwen3.6-35B-A3B-int4-ov --model_repository_path /models --reasoning_parser qwen3 --tool_parser qwen3coder --task text_generation --allowed_media_domains raw.githubusercontent.com
```

Use MCP server, with additional image of Gdańsk old town. VLM model deduces location and calls `get_weather` tool to summarize the weather conditions in the city.

```{image} https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg
:alt: poland
:width: 360px
```

> **Note**: Image source: [Link](https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg)

```bash
python openai_agent.py --query "What is the current weather in location depicted in the image?" --image https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2026/1/demos/continuous_batching/agentic_ai/photo.jpeg --model OpenVINO/Qwen3.6-35B-A3B-int4-ov --base-url http://localhost:8000/v3 --mcp-server-url http://localhost:8080/sse --mcp-server weather
```
:::
:::{tab-item} gpt-oss-20b
:sync: gpt-oss-20b
Pull and start OVMS:
Expand Down Expand Up @@ -408,6 +454,30 @@ Exemplary output:
The current weather in Tokyo is overcast with a temperature of 9.4°C (feels like 6.4°C). The relative humidity is 42%, and the dew point is -2.9°C. Wind is blowing from the northeast at 3.6 km/h, with gusts up to 24.8 km/h. The atmospheric pressure is 1018.9 hPa, and there is 84% cloud cover. Visibility is 24.1 km.
```
:::
:::{tab-item} Qwen3.6-35B-A3B
:sync: Qwen3.6-35B-A3B
Vision Language MoE model (35B total / 3B active parameters). Requires OpenVINO 2026.2 or newer and a GPU with sufficient memory to fit the INT4 weights. Tested on PantherLake iGPU with 32GB RAM with iGPU allocation increase and B70 dGPU.

Pull and start OVMS:
```bash
mkdir -p ${HOME}/models
docker run -d --user $(id -u):$(id -g) --rm -p 8000:8000 -v ${HOME}/models:/models --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) openvino/model_server:weekly \
--rest_port 8000 --source_model OpenVINO/Qwen3.6-35B-A3B-int4-ov --model_repository_path /models --reasoning_parser qwen3 --tool_parser qwen3coder --target_device GPU --task text_generation --allowed_media_domains raw.githubusercontent.com
```

Use MCP server, with additional image of Gdańsk old town. VLM model deduces location and calls `get_weather` tool to summarize the weather conditions in the city.

```{image} https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg
:alt: poland
:width: 360px
```

> **Note**: Image source: [Link](https://images.pexels.com/photos/20015887/pexels-photo-20015887.jpeg)

```bash
python openai_agent.py --query "What is the current weather in location depicted in the image?" --image https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2026/1/demos/continuous_batching/agentic_ai/photo.jpeg --model OpenVINO/Qwen3.6-35B-A3B-int4-ov --base-url http://localhost:8000/v3 --mcp-server-url http://localhost:8080/sse --mcp-server weather
```
:::
:::{tab-item} gpt-oss-20b
:sync: gpt-oss-20b
Pull and start OVMS:
Expand Down
28 changes: 14 additions & 14 deletions docs/deploying_server_baremetal.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,13 +15,13 @@ You can download model server package in two configurations. One with Python sup
:sync: ubuntu-22-04
Download precompiled package (without python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_ubuntu22_2026.1.0_python_off.tar.gz
tar -xzvf ovms_ubuntu22_2026.1.0_python_off.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_ubuntu22_2026.2.0_python_off.tar.gz
tar -xzvf ovms_ubuntu22_2026.2.0_python_off.tar.gz
```
or precompiled package (with python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_ubuntu22_2026.1.0_python_on.tar.gz
tar -xzvf ovms_ubuntu22_2026.1.0_python_on.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_ubuntu22_2026.2.0_python_on.tar.gz
tar -xzvf ovms_ubuntu22_2026.2.0_python_on.tar.gz
```
Install required libraries:
```{code} sh
Expand Down Expand Up @@ -50,13 +50,13 @@ Model server version with Python is shipped with those packages and new installa
:sync: ubuntu-24-04
Download precompiled package (without python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_ubuntu24_2026.1.0_python_off.tar.gz
tar -xzvf ovms_ubuntu24_2026.1.0_python_off.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_ubuntu24_2026.2.0_python_off.tar.gz
tar -xzvf ovms_ubuntu24_2026.2.0_python_off.tar.gz
```
or precompiled package (with python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_ubuntu24_2026.1.0_python_on.tar.gz
tar -xzvf ovms_ubuntu24_2026.1.0_python_on.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_ubuntu24_2026.2.0_python_on.tar.gz
tar -xzvf ovms_ubuntu24_2026.2.0_python_on.tar.gz
```
Install required libraries:
```{code} sh
Expand Down Expand Up @@ -85,13 +85,13 @@ Model server version with Python is shipped with those packages and new installa
:sync: rhel-9.6
Download precompiled package (without python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_redhat_2026.1.0_python_off.tar.gz
tar -xzvf ovms_redhat_2026.1.0_python_off.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_redhat_2026.2.0_python_off.tar.gz
tar -xzvf ovms_redhat_2026.2.0_python_off.tar.gz
```
or precompiled package (with python):
```{code} sh
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_redhat_2026.1.0_python_on.tar.gz
tar -xzvf ovms_redhat_2026.1.0_python_on.tar.gz
wget https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_redhat_2026.2.0_python_on.tar.gz
tar -xzvf ovms_redhat_2026.2.0_python_on.tar.gz
```
Install required libraries:
```{code} sh
Expand Down Expand Up @@ -124,14 +124,14 @@ Make sure you have [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/r
Download and unpack model server archive for Windows(with python):

```bat
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_windows_2026.1.0_python_on.zip -o ovms.zip
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_windows_2026.2.0_python_on.zip -o ovms.zip
tar -xf ovms.zip
```

or archive without python:

```bat
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_windows_2026.1.0_python_off.zip -o ovms.zip
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_windows_2026.2.0_python_off.zip -o ovms.zip
tar -xf ovms.zip
```

Expand Down
2 changes: 1 addition & 1 deletion docs/pull_optimum_cli.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ mkdir models
## Add optimum-cli to OVMS installation on windows

```bat
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.1/ovms_windows_2026.1.0_python_on.zip -o ovms.zip
curl -L https://github.com/openvinotoolkit/model_server/releases/download/v2026.2/ovms_windows_2026.2.0_python_on.zip -o ovms.zip
tar -xf ovms.zip
ovms\setupvars.bat
ovms\python\python -m pip install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/main/demos/common/export_models/requirements.txt
Expand Down