Fix model in embeddings demo#4045

Open
michalkulakowski wants to merge 1 commit into main from mkulakow/fix_embedding_demo

@michalkulakowski
Collaborator

🛠 Summary

JIRA/Issue if applicable.
Describe the changes.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.

Copilot AI review requested due to automatic review settings March 6, 2026 08:41
Contributor

Copilot AI left a comment

Pull request overview

Updates the embeddings demo documentation to use the correct Qwen3 embeddings model variant for NPU usage.

Changes:

  • Switches the Qwen3 NPU tab sync/model reference from fp16 to int8.
  • Adds a docker run example for serving the Qwen3 int8 OpenVINO model via OpenVINO Model Server.

:sync: Qwen3-Embedding-0.6B-fp16
:sync: Qwen3-Embedding-0.6B-int8
```console
docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:latest --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
```

Copilot AI Mar 6, 2026

docker run --pull requires an explicit value (e.g., --pull=always|missing|never) in supported Docker versions; using --pull without a value will fail. Update the command to provide a value (or remove the flag if not needed).

Suggested change
- docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:latest --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
+ docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:latest --pull=always --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
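
Whether the Docker CLI itself parses `--pull` depends on where the flag sits: options placed before the image name belong to `docker run`, while everything after the image name is handed to the container's entrypoint. The sketch below, in POSIX shell, prints the arguments that fall after the image. `container_args` is a hypothetical helper written for illustration, not part of Docker or OpenVINO Model Server, and the argument list is shortened from the command under review.

```shell
# Hypothetical helper: print the arguments that follow the image name,
# i.e. the ones docker run passes to the container instead of parsing itself.
# Usage: container_args IMAGE ARG...
container_args() {
  image="$1"; shift
  seen=0
  out=""
  for arg in "$@"; do
    if [ "$seen" -eq 1 ]; then
      # Already past the image name: collect the argument.
      out="${out:+$out }$arg"
    elif [ "$arg" = "$image" ]; then
      # Found the image name: everything after it goes to the container.
      seen=1
    fi
  done
  printf '%s\n' "$out"
}

# Shortened form of the command under review.
container_args openvino/model_server:latest \
  docker run --rm -v /tmp/models:/models:rw openvino/model_server:latest \
  --pull --model_repository_path /models
```

In this shortened command `--pull` lands after `openvino/model_server:latest`, so it reaches the container rather than the Docker CLI. How the model server interprets the flag is for its own documentation to confirm.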

Copilot uses AI. Check for mistakes.

Copilot AI Mar 6, 2026

Using the :latest tag makes the documentation non-reproducible (behavior can change over time). Prefer pinning to a specific, known-good image version/tag to keep the demo stable.

Suggested change
- docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:latest --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
+ docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:2024.0 --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings


Copilot AI Mar 6, 2026

The bind mount uses -v $(pwd)/models:... without quoting; if the working directory path contains spaces, the command will break. Quote the host path (or use an absolute path variable) to make the example more robust.

Suggested change
- docker run --user $(id -u):$(id -g) --rm -v $(pwd)/models:/models:rw openvino/model_server:latest --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
+ docker run --user $(id -u):$(id -g) --rm -v "$(pwd)/models":/models:rw openvino/model_server:latest --pull --model_repository_path /models --source_model OpenVINO/Qwen3-Embedding-0.6B-int8-ov --pooling LAST --task embeddings
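
The word-splitting problem can be reproduced without Docker. This is a minimal sketch, assuming bash or another POSIX-style shell (zsh does not split unquoted expansions by default). `count_args` is a hypothetical helper that reports how many arguments it received, and `dir` stands in for `$(pwd)` with a made-up path containing a space.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# Made-up working directory with a space in its name (stands in for $(pwd)).
dir="/tmp/my project"

# Unquoted: the shell splits the expansion at the space, so the value
# intended for -v arrives as two separate words (3 arguments in total).
count_args -v $dir/models:/models:rw

# Quoted: the host path stays a single word (2 arguments in total).
count_args -v "$dir/models":/models:rw
```

With the unquoted form, `docker run` would receive a truncated volume specification followed by a stray word, which is why quoting the host path is the safer default in documentation.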

3 participants