
[ET-VK][CI] Add test-vulkan-genai job for Parakeet on NVIDIA GPU runner#18335

Open
SS-JIA wants to merge 6 commits into gh/SS-JIA/497/base from gh/SS-JIA/497/head

Conversation

@SS-JIA
Contributor

@SS-JIA SS-JIA commented Mar 19, 2026

Stack from ghstack (oldest at bottom):

Add a new GitHub CI job that exports and runs the Parakeet TDT model with
the Vulkan backend on an NVIDIA GPU runner. The Vulkan export and runner
code already exists but had no CI coverage.

  • Add --gpu flag to setup-vulkan-linux-deps.sh to skip SwiftShader
    installation when running on machines with a real GPU driver
  • Add vulkan as a supported device in export_model_artifact.sh and
    test_model_e2e.sh
  • Add test-vulkan-genai job to pull.yml on linux.g5.4xlarge.nvidia.gpu
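
The flag described in the first bullet can be sketched as a small gate (hypothetical helper; the real setup-vulkan-linux-deps.sh may parse its arguments differently):

```shell
# Hypothetical sketch: decide whether to install SwiftShader based on a
# --gpu flag, as described above. Not the actual script contents.
should_install_swiftshader() {
  for arg in "$@"; do
    if [ "$arg" = "--gpu" ]; then
      return 1  # real GPU driver present: skip SwiftShader
    fi
  done
  return 0  # no --gpu flag: fall back to SwiftShader
}

if should_install_swiftshader "$@"; then
  echo "installing SwiftShader"
else
  echo "skipping SwiftShader (using the host GPU driver)"
fi
```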

Differential Revision: D97344728

cc @manuelcandales @digantdesai @cbilgin

@pytorch-bot

pytorch-bot bot commented Mar 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18335

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 208 Pending

As of commit affcc89 with merge base fb1618e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

SS-JIA pushed a commit that referenced this pull request Mar 19, 2026
ghstack-source-id: 354758467
Pull Request resolved: #18335
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 19, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@SS-JIA SS-JIA temporarily deployed to upload-benchmark-results March 19, 2026 20:09 — with GitHub Actions Inactive
SS-JIA pushed a commit that referenced this pull request Mar 19, 2026
@SS-JIA SS-JIA temporarily deployed to upload-benchmark-results March 19, 2026 23:53 — with GitHub Actions Inactive
SS-JIA pushed a commit that referenced this pull request Mar 20, 2026
- Fix Parakeet CMakeLists.txt to guard `quantized_ops_lib` and `custom_ops`
  with `if(TARGET ...)` and IMPORTED checks. When the Parakeet runner is
  built as a standalone CMake project (the second step of `make
  parakeet-vulkan`), these targets are found via `find_package(executorch)`
  as imported targets. The existing code called
  `executorch_target_link_options_shared_lib()` on them unconditionally,
  which internally calls `target_link_options()` — and CMake does not allow
  `target_link_options()` on imported targets. Other targets in the same
  file (e.g. `optimized_native_cpu_ops_lib`, `xnnpack_backend`) already had
  the correct guards; `quantized_ops_lib` and `custom_ops` were simply
  missing them.
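
A sketch of the guard described in the last bullet, assuming the target name from this PR (illustrative, not the exact CMakeLists.txt change):

```cmake
# Only apply link options to quantized_ops_lib when it exists as a
# regular (non-imported) target; find_package(executorch) produces
# imported targets, which reject target_link_options().
if(TARGET quantized_ops_lib)
  get_target_property(_qol_imported quantized_ops_lib IMPORTED)
  if(NOT _qol_imported)
    executorch_target_link_options_shared_lib(quantized_ops_lib)
  endif()
endif()
```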
ghstack-source-id: 355090034
@exported-using-ghexport

Differential Revision: [D97344728](https://our.internmc.facebook.com/intern/diff/D97344728/)
@SS-JIA SS-JIA temporarily deployed to upload-benchmark-results March 20, 2026 07:22 — with GitHub Actions Inactive
SS-JIA pushed a commit that referenced this pull request Mar 27, 2026
@SS-JIA SS-JIA temporarily deployed to upload-benchmark-results March 27, 2026 19:23 — with GitHub Actions Inactive
@SS-JIA
Contributor Author

SS-JIA commented Mar 27, 2026

@claude investigate the test-vulkan-genai job failure

@claude

claude bot commented Mar 27, 2026

Claude finished @SS-JIA's task in 3m 58s


Investigating test-vulkan-genai Job Failure

  • Get CI status and identify the failing workflow run
  • Download and analyze job logs
  • Review relevant code changes
  • Report findings

Root Cause

The build fails at examples/models/parakeet/timestamp_utils.h:10 with:

fatal error: 'pytorch/tokenizers/tokenizer.h' file not found

This happens during the make parakeet-vulkan step inside test_model_e2e.sh.

Analysis

The CI job script (.github/workflows/pull.yml:1356-1375) runs two build stages:

  1. setup-linux.sh — Configures and builds ExecuTorch into cmake-out/ using raw cmake with only -DEXECUTORCH_BUILD_VULKAN=ON. This creates a CMake cache where most extensions are OFF, including:

    • EXECUTORCH_BUILD_EXTENSION_LLM=OFF
    • EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=OFF
    • EXECUTORCH_BUILD_EXTENSION_MODULE=OFF
    • EXECUTORCH_BUILD_EXTENSION_TENSOR=OFF
    • etc.
  2. make parakeet-vulkan (called from test_model_e2e.sh) — Runs cmake --workflow --preset llm-debug-vulkan, which uses the llm.cmake preset file. This preset attempts to enable EXECUTORCH_BUILD_EXTENSION_LLM=ON (among other things) via set_overridable_option. However, set_overridable_option (tools/cmake/common/preset.cmake:98-106) checks if(NOT DEFINED ${NAME}) — since the CMake cache in cmake-out/ already has these variables defined (set to OFF from step 1), the preset cannot override them.

As a result, the tokenizers library (built under EXECUTORCH_BUILD_EXTENSION_LLM) is never compiled or installed. When the parakeet runner tries to #include <pytorch/tokenizers/tokenizer.h>, the header doesn't exist.

The CI log confirms this at the configure output (log line ~23618):

EXECUTORCH_BUILD_EXTENSION_LLM              : OFF
EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER       : OFF
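
The if(NOT DEFINED ...) behavior described above can be illustrated with a minimal sketch of such a macro (an assumed shape, not the real definition in tools/cmake/common/preset.cmake):

```cmake
# A cached variable set by an earlier configure wins: the macro only
# assigns when the name is not already defined, so a stale cmake-out/
# cache silently overrides the preset's intended value.
macro(set_overridable_option NAME VALUE)
  if(NOT DEFINED ${NAME})
    set(${NAME} ${VALUE} CACHE STRING "Overridable build option" FORCE)
  endif()
endmacro()
```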

The Fix

The test-vulkan-genai CI job needs the setup-linux.sh step to also build with the LLM extensions enabled, or the cmake-out directory should be cleaned between the two build stages so the llm-debug-vulkan preset can configure from scratch.

Option A (Recommended): Skip the setup-linux.sh step entirely and let make parakeet-vulkan handle the full build. The parakeet-vulkan Makefile target already calls cmake --workflow --preset llm-debug-vulkan (which builds everything needed) before building the parakeet runner. The setup-linux.sh step is redundant (and harmful due to cache conflicts) when using the Makefile workflow.

The CI script should look like:

# Setup Vulkan SDK (no SwiftShader — use NVIDIA driver's Vulkan ICD)
source .ci/scripts/setup-vulkan-linux-deps.sh --gpu

# Install ExecuTorch Python package (for export step)
# ... (just the pip install, not the cmake build)

# Export parakeet with Vulkan backend
bash .ci/scripts/export_model_artifact.sh vulkan "nvidia/parakeet-tdt" "quantized-8da4w" "${RUNNER_ARTIFACT_DIR}"

# Build runner and test e2e (make parakeet-vulkan handles the full C++ build)
bash .ci/scripts/test_model_e2e.sh vulkan "nvidia/parakeet-tdt" "quantized-8da4w" "${RUNNER_ARTIFACT_DIR}"

Option B: Add additional CMAKE_ARGS to the setup-linux.sh call:

CMAKE_ARGS="-DEXECUTORCH_BUILD_VULKAN=ON -DEXECUTORCH_BUILD_EXTENSION_LLM=ON -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON"

Option C: Clean cmake-out between the two build stages so the preset can configure cleanly:

rm -rf cmake-out
make parakeet-vulkan

Note: The existing test-parakeet-xnnpack-linux job likely works because the XNNPACK preset path doesn't have this same cache conflict, or the required extensions happen to be enabled by default.

@SS-JIA
Contributor Author

SS-JIA commented Apr 6, 2026

@claude investigate the test-vulkan-genai job failure

@claude

claude bot commented Apr 6, 2026

Claude finished @SS-JIA's task in 3m 14s


Investigating test-vulkan-genai Job Failure

  • Get CI status and identify failing runs
  • Download and analyze job logs
  • Review relevant code and CI configuration
  • Report findings with root cause and fix recommendations

Root Cause

The build fails at examples/models/parakeet/timestamp_utils.h:10 with:

fatal error: 'pytorch/tokenizers/tokenizer.h' file not found

This happens during the make parakeet-vulkan step called from test_model_e2e.sh.

Detailed Analysis

The CI job script (.github/workflows/pull.yml:1356-1375) runs these build stages in sequence:

  1. install_executorch.sh (called inside setup-linux.sh) — Runs pip install which internally uses the pybind preset to configure cmake-out/. This step actually has EXECUTORCH_BUILD_EXTENSION_LLM=ON (log line ~4866). It then cleans cmake-out/ and moves to step 2.

  2. build_executorch_runner_cmake (also inside setup-linux.sh) — Cleans cmake-out/, then runs:

    cmake -DEXECUTORCH_BUILD_VULKAN=ON -Bcmake-out .
    

    This creates a CMake cache where most extensions are OFF by default (log line ~19751):

    EXECUTORCH_BUILD_EXTENSION_LLM              : OFF
    EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER       : OFF
    EXECUTORCH_BUILD_EXTENSION_MODULE           : OFF
    EXECUTORCH_BUILD_EXTENSION_TENSOR           : OFF
    
  3. make parakeet-vulkan → first calls cmake --workflow --preset llm-debug-vulkan. This preset uses tools/cmake/preset/llm.cmake which calls set_overridable_option(EXECUTORCH_BUILD_EXTENSION_LLM ON). However, set_overridable_option (tools/cmake/common/preset.cmake:98-106) checks if(NOT DEFINED ${NAME}) — since the CMake cache in cmake-out/ already has these variables defined (set to OFF from step 2), the preset cannot override them. The log confirms at line ~23618:

    EXECUTORCH_BUILD_EXTENSION_LLM              : OFF
    EXECUTORCH_BUILD_EXTENSION_LLM_RUNNER       : OFF
    
  4. Parakeet runner build — runs cd examples/models/parakeet && cmake --workflow --preset parakeet-vulkan. This tries to build the parakeet runner, but the tokenizers library was never built (it requires EXECUTORCH_BUILD_EXTENSION_LLM=ON), so the header pytorch/tokenizers/tokenizer.h doesn't exist in the install prefix.

Why test-parakeet-xnnpack-linux works

The XNNPACK CI job (.github/workflows/pull.yml:167-198) does not call setup-linux.sh. Instead it only runs ./install_executorch.sh (Python-only install), then goes straight to test_model_e2e.sh → make parakeet-cpu → cmake --workflow --preset llm-release, which configures cmake-out/ from scratch with no conflicting cache — so set_overridable_option can set EXECUTORCH_BUILD_EXTENSION_LLM=ON.

Recommended Fix

Option A (Recommended — matches the working XNNPACK pattern): Replace the setup-linux.sh call with just ./install_executorch.sh:

script: |
  set -eux
  CONDA_ENV=$(conda env list --json | jq -r ".envs | .[-1]")
  conda activate "${CONDA_ENV}"

  # Setup Vulkan SDK (no SwiftShader — use NVIDIA driver's Vulkan ICD)
  source .ci/scripts/setup-vulkan-linux-deps.sh --gpu

  # Install ExecuTorch Python package only (no C++ build — the Makefile
  # workflow preset handles that)
  ./install_executorch.sh

  # Export parakeet with Vulkan backend
  bash .ci/scripts/export_model_artifact.sh vulkan "nvidia/parakeet-tdt" "quantized-8da4w" "${RUNNER_ARTIFACT_DIR}"

  # Build runner and test e2e (make parakeet-vulkan does the full C++ build)
  bash .ci/scripts/test_model_e2e.sh vulkan "nvidia/parakeet-tdt" "quantized-8da4w" "${RUNNER_ARTIFACT_DIR}"

The setup-linux.sh step is redundant (and harmful due to cache conflicts) because make parakeet-vulkan already does a full CMake workflow configure+build via the llm-debug-vulkan preset.

Option B: Clean cmake-out/ between the two build stages:

rm -rf cmake-out
make parakeet-vulkan

Option C: Pass all the required flags to setup-linux.sh:

CMAKE_ARGS="-DEXECUTORCH_BUILD_VULKAN=ON -DEXECUTORCH_BUILD_EXTENSION_LLM=ON -DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=ON -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON -DEXECUTORCH_BUILD_EXTENSION_FLAT_TENSOR=ON" \
.ci/scripts/setup-linux.sh --build-tool "cmake"

Option A is cleanest since it matches the proven pattern from test-parakeet-xnnpack-linux.


SS-JIA pushed a commit that referenced this pull request Apr 6, 2026
@SS-JIA
Contributor Author

SS-JIA commented Apr 6, 2026

@claude investigate the test-vulkan-genai job failure

@claude

claude bot commented Apr 6, 2026

Claude finished @SS-JIA's task in 6m 3s


Investigating test-vulkan-genai Job Failure

  • Get CI status and identify the failing workflow run
  • Review relevant code changes since last investigation
  • Analyze whether the previous fix addresses the failure
  • Report findings

Status

The current CI run (pull #24047249156) was triggered at 19:27 UTC today and is still in progress. The test-vulkan-genai job requires a linux.g5.4xlarge.nvidia.gpu runner and hasn't been provisioned yet (the job is beyond the API's 30-job pagination window). There are no completed failures in the current run.

Code Changes Since Last Investigation

The latest commit (b83c0a2) has applied a fix for the root cause identified in my previous investigation. Specifically, .github/workflows/pull.yml:1372-1377 now includes:

# Clean cmake-out/ — install_executorch.sh creates a CMake cache with
# most extensions OFF, and set_overridable_option in the Makefile
# presets cannot override cached variables. Without this, the
# llm-debug-vulkan preset fails to enable EXECUTORCH_BUILD_EXTENSION_LLM
# and the tokenizers header is never installed.
rm -rf cmake-out

This directly addresses the root cause: install_executorch.sh runs pip install . --no-build-isolation which internally uses CMake with the pybind preset, creating a cmake-out/ directory with EXECUTORCH_BUILD_EXTENSION_LLM=OFF cached. Subsequently, make parakeet-vulkan calls cmake --workflow --preset llm-debug-vulkan, but set_overridable_option in tools/cmake/common/preset.cmake:98-106 refuses to override already-defined cache variables. The rm -rf cmake-out step clears the stale cache so the preset can configure cleanly.

The CI job also no longer calls setup-linux.sh (which ran an unnecessary C++ build), and instead only runs ./install_executorch.sh (Python package install), matching the working pattern from test-parakeet-xnnpack-linux.

Assessment

The fix looks correct. The rm -rf cmake-out between install_executorch.sh and test_model_e2e.sh should resolve the 'pytorch/tokenizers/tokenizer.h' file not found build error. This is essentially Option C from the previous analysis, combined with Option A (dropping setup-linux.sh in favor of install_executorch.sh).

The job needs to run on the GPU runner to confirm the fix works end-to-end. Monitor the current CI run at pull #24047249156 — the test-vulkan-genai job should appear once a GPU runner is provisioned.


@SS-JIA
Contributor Author

SS-JIA commented Apr 6, 2026

@claude investigate the test-vulkan-genai job failure. You should see that the job failed due to being unable to initialize the global Vulkan runtime

@claude

claude bot commented Apr 6, 2026

Claude finished @SS-JIA's task in 5m 33s


Investigating test-vulkan-genai Job Failure

  • Get CI status and identify the failing workflow run
  • Download and analyze job logs
  • Review relevant code and CI configuration
  • Report findings with root cause and fix recommendations

Root Cause

The parakeet_runner crashes at startup with:

terminate called after throwing an instance of 'vkcompute::vkapi::Error'
  what():  Exception raised from runtime at backends/vulkan/runtime/vk_api/Runtime.cpp:407:
    (p_runtime) is false! Pytorch Vulkan Runtime: The global runtime could not be retrieved
    because it failed to initialize.

The Vulkan ICD loader library (libvulkan.so.1) is missing from the Docker container.

Detailed Analysis

The Vulkan backend uses volk to dynamically load Vulkan at runtime (see backends/vulkan/CMakeLists.txt:54, where USE_VULKAN_VOLK is defined). When volkInitialize() is called, it does dlopen("libvulkan.so.1"). If this library isn't found, it returns failure, and init_global_vulkan_runtime() at Runtime.cpp:265 returns nullptr.

Why the library is missing:

  1. The Docker image (executorch-ubuntu-22.04-clang12) does not include libvulkan1. Checking the Dockerfile and install_base.sh confirms no Vulkan packages are installed (libvulkan1 only appears in install_arm.sh for the ARM docker image).

  2. The setup-vulkan-linux-deps.sh --gpu script (.ci/scripts/setup-vulkan-linux-deps.sh:48-58) skips SwiftShader when --gpu is passed. SwiftShader would normally provide libvulkan.so via its LD_LIBRARY_PATH export (line 25). Without SwiftShader, only the Vulkan SDK is installed — which adds bin/ tools to PATH but does not provide the loader library.

  3. The NVIDIA driver installer on the host even warns about this at log line 1282:

    WARNING: This NVIDIA driver package includes Vulkan components, but no Vulkan ICD
    loader was detected on this system. The NVIDIA Vulkan ICD will not function without
    the loader. Most distributions package the Vulkan loader; try installing the
    "vulkan-loader", "vulkan-icd-loader", or "libvulkan1" package.
    

    The NVIDIA driver installs the Vulkan ICD driver (nvidia_icd.json + libGLX_nvidia.so), but libvulkan.so.1 (the ICD loader that dispatches to drivers) must be installed separately.

  4. Although the Docker container is launched with --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all (which makes CUDA and the NVIDIA GPU visible), libvulkan.so.1 is not part of the NVIDIA Container Toolkit's automatic library injection — it must be explicitly installed in the container.

Why other Vulkan CI jobs work: The test-vulkan-operators-linux and test-vulkan-models-linux jobs run on linux.2xlarge (no GPU) and use SwiftShader, which bundles its own libvulkan.so and sets LD_LIBRARY_PATH.

Recommended Fix

Option A (Recommended — minimal change): Install libvulkan1 in the --gpu path of setup-vulkan-linux-deps.sh:

if [ "$USE_GPU" = false ]; then
  install_swiftshader
else
  # The Vulkan ICD loader (libvulkan.so.1) is needed to dispatch to the
  # NVIDIA ICD driver. SwiftShader bundles its own, but on GPU runners
  # we need to install it separately.
  apt-get update -q && apt-get install -y --no-install-recommends libvulkan1
fi

This installs the ~100KB Vulkan loader package which provides the libvulkan.so.1 that volk's volkInitialize() needs.

Option B (Docker image change): Add libvulkan1 to the base Docker image in .ci/docker/common/install_base.sh or the Dockerfile. This is cleaner but requires a docker image rebuild.

Option C (Use Vulkan SDK's loader): The LunarG Vulkan SDK includes libvulkan.so in its x86_64/lib/ directory. Add the SDK lib path to LD_LIBRARY_PATH:

install_vulkan_sdk() {
  # ... existing code ...
  export PATH="${PATH}:${_vulkan_sdk_dir}/${VULKAN_SDK_VERSION}/x86_64/bin/"
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:-}:${_vulkan_sdk_dir}/${VULKAN_SDK_VERSION}/x86_64/lib/"
}

This avoids needing apt-get (which may require sudo inside the container) and is self-contained.
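
One caveat with the snippet above: when LD_LIBRARY_PATH starts out unset, appending with a bare colon leaves an empty leading entry, which the dynamic loader treats as the current directory. A sketch of a safer append, with an assumed SDK path:

```shell
# Append the Vulkan SDK lib dir (path assumed for illustration) without
# creating an empty entry -- an empty LD_LIBRARY_PATH component means
# "current directory" to the dynamic loader.
sdk_lib="/opt/vulkansdk/1.3.296.0/x86_64/lib"
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${sdk_lib}"
echo "${LD_LIBRARY_PATH}"
```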

Option A or C would fix the immediate issue. Option C is cleanest since it uses what's already downloaded.


SS-JIA pushed a commit that referenced this pull request Apr 6, 2026

Labels

- CLA Signed: this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed
- fb-exported
- meta-exported
- module: vulkan (issues related to the Vulkan delegate and code under backends/vulkan/)
