OVMS GPU Memory Allocation Issue - Crashes on model load #4035

@wronglebowsk

Describe the bug
When loading various models with the GPU as the target device, OVMS shows what looks like runaway memory growth. For example, loading OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov with the CPU as the target device works fine: CPU RAM usage is as expected and the model functions correctly. With the GPU as the target device, GPU memory is not utilized. Instead, CPU RAM fills until usage far exceeds the model size, the system runs out of RAM, and OVMS crashes.

To Reproduce
Steps to reproduce the behavior:

  1. Use https://huggingface.co/OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov
  2. .\ovms.exe --source_model OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov --model_repository_path models --rest_port 8000 --task text_generation --target_device GPU --metrics_enable --log_level DEBUG
  3. Error: Exception from src\inference\src\dev\plugin.cpp:53:
    Check 'false' failed at src\plugins\intel_gpu\src\plugin\program_builder.cpp:163:
    [GPU] ProgramBuilder build failed!
    [CL ext] Can not allocate 402653184 bytes for USM Device. ptr: 0000000000000000, error: 0
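
For scale, the size in the error message can be converted to MiB. The sketch below is only an illustration: the helper name `failed_allocation_bytes` is hypothetical, and the regular expression assumes the message wording shown in step 3.

```python
import re

# Error line copied verbatim from step 3 of the report.
ERROR_LINE = (
    "[CL ext] Can not allocate 402653184 bytes for USM Device. "
    "ptr: 0000000000000000, error: 0"
)


def failed_allocation_bytes(line):
    """Return the byte count from a 'Can not allocate N bytes' message, or None."""
    match = re.search(r"Can not allocate (\d+) bytes", line)
    return int(match.group(1)) if match else None


size = failed_allocation_bytes(ERROR_LINE)
print(f"{size} bytes = {size / 2**20:.0f} MiB")  # prints "402653184 bytes = 384 MiB"
```

The failing request is 384 MiB, far smaller than the model itself. That would be consistent with memory already being exhausted by earlier allocations when this call was made, although the log alone does not prove it.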

Expected behavior
I would expect the model to be loaded into GPU memory and to consume roughly the same amount of memory as it does when running on the CPU.

Logs
12900HK.txt

Configuration

  1. OVMS version: 2026
  2. OVMS config.json file: Default
  3. CPU, accelerator's versions if applicable: Attempting to run on the 12900HK iGPU
Image
  4. Model repository directory structure: Default from HF
  5. Model or publicly available similar model that reproduces the issue: https://huggingface.co/OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov
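
To complement the configuration above, it may help to record what OpenVINO itself reports for the GPU device. This is a minimal sketch, assuming the `openvino` Python package is installed alongside OVMS. The property names `FULL_DEVICE_NAME` and `GPU_DEVICE_TOTAL_MEM_SIZE` are assumptions that should be checked against the installed OpenVINO version, and `gpu_memory_report` is a hypothetical helper name.

```python
def gpu_memory_report(device="GPU"):
    """Return basic GPU properties, or None if OpenVINO or the device is unavailable."""
    try:
        import openvino as ov  # assumption: the openvino Python package is installed

        core = ov.Core()
        return {
            "name": core.get_property(device, "FULL_DEVICE_NAME"),
            # Assumed property name; verify against the installed OpenVINO version.
            "total_mem_bytes": core.get_property(device, "GPU_DEVICE_TOTAL_MEM_SIZE"),
        }
    except Exception:
        # Covers a missing package, a missing GPU device, or an unknown property.
        return None


report = gpu_memory_report()
if report is None:
    print("OpenVINO or the GPU device is not available")
else:
    print(f"{report['name']}: {report['total_mem_bytes'] / 2**30:.1f} GiB reported")
```

On an integrated GPU the device memory is carved out of system RAM, so the reported total is worth comparing with the system RAM figure shown in Task Manager.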

Additional context
This is what it looks like in Task Manager

Image

Whereas loading the model on the CPU uses a normal amount of RAM and performs as expected:

Image

This also occurs with:
OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov
OpenVINO/gpt-oss-20b-int4-ov
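
To compare both models on both devices, the command from the reproduction steps can be generated for each combination. This is a sketch only: it reuses the flags from step 2 unchanged and assumes the CPU run differs only in `--target_device CPU`, which the report does not spell out. `ovms_command` is a hypothetical helper name.

```python
from itertools import product

# Models named in the report.
MODELS = [
    "OpenVINO/Qwen3-Coder-30B-A3B-Instruct-int4-ov",
    "OpenVINO/gpt-oss-20b-int4-ov",
]
DEVICES = ["CPU", "GPU"]  # assumption: the CPU run only changes --target_device


def ovms_command(model, device):
    """Build the argument list from step 2 with the given model and device."""
    return [
        ".\\ovms.exe",
        "--source_model", model,
        "--model_repository_path", "models",
        "--rest_port", "8000",
        "--task", "text_generation",
        "--target_device", device,
        "--metrics_enable",
        "--log_level", "DEBUG",
    ]


for model, device in product(MODELS, DEVICES):
    print(" ".join(ovms_command(model, device)))
```

Running each command in turn while watching RAM usage would show whether the growth is specific to the GPU target for both models.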

Labels: bug
