
[AI] GPU ORT install: manifest-driven UI button, scripts, and EP support #20676

Draft

andriiryzhkov wants to merge 13 commits into darktable-org:master from andriiryzhkov:ort_scripts

Conversation

@andriiryzhkov
Contributor

@andriiryzhkov andriiryzhkov commented Mar 26, 2026

Follow-up to #20647 (ORT library path preference and custom loading). Addresses #20532.

Summary

Manifest-driven system for installing GPU-accelerated ONNX Runtime directly from darktable or via scripts.

Package manifest (data/ort_gpu.json)

All download URLs, SHA-256 checksums, archive formats, and ROCm version mappings live in a single JSON file; update it to change URLs or versions without rebuilding darktable. Supports NVIDIA (CUDA), AMD (MIGraphX), and Intel (OpenVINO) on Linux and Windows.
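As an illustration only (the field names below are hypothetical, not the PR's actual schema), a manifest entry could look roughly like this:

```json
{
  "packages": [
    {
      "vendor": "nvidia",
      "platform": "linux",
      "ort_version": "1.23.2",
      "url": "https://example.com/onnxruntime-gpu-linux.tgz",
      "sha256": "<expected hex digest>",
      "format": "tgz"
    }
  ]
}
```

The point of the design is that both the UI installer and the scripts consume the same file, so a URL or version bump is a data change, not a code change.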

Install button in AI preferences (preferences_ai.c)

  • Detects GPUs (NVIDIA via nvidia-smi, AMD via rocminfo, Intel via lspci)
  • Selection dialog if multiple GPU vendors found
  • Shows requirements and missing dependencies with distro-specific install hints
  • Downloads with progress bar, verifies SHA-256 checksum
  • Extracts and validates the library, auto-fills the ORT path preference
  • Guarded by HAVE_AI_DOWNLOAD (requires curl + libarchive)
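The detection order described above can be sketched in shell; this is a minimal illustration of the probe sequence (function name and exact matching logic are mine, not the PR's code):

```shell
# Probe for a GPU vendor in the order the installer uses:
# NVIDIA via nvidia-smi, AMD via rocminfo, Intel via lspci.
detect_gpu_vendor() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo nvidia
  elif command -v rocminfo >/dev/null 2>&1 && rocminfo 2>/dev/null | grep -qi 'gfx'; then
    echo amd
  elif command -v lspci >/dev/null 2>&1 && lspci 2>/dev/null | grep -qiE '(vga|display).*intel'; then
    echo intel
  else
    echo none
  fi
}
```

In the actual UI code this lives in C, and a dialog is shown when more than one vendor is detected.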

Backend (ort_install.c/h)

  • GPU detection for NVIDIA, AMD, Intel with dependency checks
  • Manifest loading via json-glib, package matching by vendor + platform + ROCm version
  • Download via curl, SHA-256 verification, extraction via libarchive (tgz/zip/whl)
  • Library validation via dt_ai_ort_probe_library
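The verify step in the pipeline above amounts to comparing a computed digest against the manifest value. A minimal sketch (helper name is illustrative; the backend does this in C, not shell):

```shell
# Return success only if the file's SHA-256 digest matches the
# expected value from the manifest.
verify_sha256() {
  file=$1 expected=$2
  actual=$(sha256sum "$file" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}
```

A failed check should abort the install before extraction, since a mismatched archive is either corrupt or not the file the manifest promised.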

Unified install scripts

Script                    Platform  Requires
install-ort-gpu.sh        Linux     jq
install-ort-gpu.ps1       Windows   PowerShell
install-ort-amd-build.sh  Linux     cmake, gcc, python3

The scripts read the same ort_gpu.json manifest, detect the GPU, and download and verify the matching package. They support the --vendor, --force, and --manifest flags.
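The flag handling can be sketched as a plain POSIX argument loop; this is an assumption about the shape of the parsing, with illustrative variable names, not an excerpt from the scripts:

```shell
# Defaults; overridden by command-line flags.
VENDOR=""
FORCE=0
MANIFEST="data/ort_gpu.json"

parse_args() {
  while [ $# -gt 0 ]; do
    case "$1" in
      --vendor)   VENDOR=$2; shift 2 ;;   # skip detection, use this vendor
      --force)    FORCE=1; shift ;;       # reinstall even if present
      --manifest) MANIFEST=$2; shift 2 ;; # alternate manifest path
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
}
```

`--vendor` matches the "Vendor override: amd (skipping GPU detection)" behavior visible in the test output later in this thread.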

EP availability expanded (backend_common.c)

CUDA and OpenVINO are now selectable on Windows (previously Linux-only).

EP                Linux  Windows        macOS
CPU               yes    yes            yes
CoreML            -      -              yes (bundled)
CUDA (NVIDIA)     yes    yes            -
MIGraphX (AMD)    yes    -              -
OpenVINO (Intel)  yes    yes            -
DirectML          -      yes (bundled)  -

Documentation

tools/ai/README.md – install methods, EP table, manifest format, manual install, verification.

Test plan

  • Linux + NVIDIA CUDA
  • Linux + AMD MIGraphX
  • Linux + Intel OpenVINO
  • Windows + NVIDIA CUDA
  • Windows + Intel OpenVINO

@andriiryzhkov force-pushed the ort_scripts branch 2 times, most recently from 2fc7d22 to bf749a7, April 2, 2026 08:26
@andriiryzhkov andriiryzhkov changed the title [AI] GPU-accelerated ORT install scripts and documentation [AI] GPU ORT install: manifest-driven UI button, scripts, and EP support Apr 2, 2026
@andriiryzhkov
Contributor Author

@TurboGit : I need help testing these scripts and the UI installer. Can you test with an NVIDIA card on Linux?

And maybe you can advise who could help with the other tests?

@TurboGit
Member

TurboGit commented Apr 2, 2026

@andriiryzhkov : Just to be sure, you mean install-ort-gpu.sh, right?

I'll try to do that tomorrow.

@andriiryzhkov
Contributor Author

@TurboGit :

install-ort-gpu.sh is the first part.
The second part (with basically the same result) is the "install" button added to the AI preferences tab.

@TurboGit
Member

TurboGit commented Apr 3, 2026

@andriiryzhkov : Using the script was OK. I was also able to install with the in-UI [install] button. I saw one difference: when using the script no symlinks were created, whereas with the in-UI installation I got:

libonnxruntime.so -> libonnxruntime.so.1
libonnxruntime.so.1 -> libonnxruntime.so.1.24.4

@andriiryzhkov
Contributor Author

@TurboGit : Thank you for testing.

The symlinks are created by the UI install for extra safety, but not strictly required — darktable loads the library by full path from the preferences config, not via LD_LIBRARY_PATH or soname resolution. The script works fine without them since it provides the full path directly. I can add symlink creation to the script too for consistency if you prefer.
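For reference, the soname chain the UI installer sets up can be sketched like this (helper name and paths are illustrative; since darktable loads by full path, the links are a convenience, not a requirement):

```shell
# Given a versioned library file, create the conventional soname
# symlink chain: .so -> .so.1 -> .so.1.24.4
make_ort_symlinks() {
  dir=$1 full=$2              # e.g. libonnxruntime.so.1.24.4
  major=${full%.*.*}          # strip minor.patch -> libonnxruntime.so.1
  base=${major%.*}            # strip major      -> libonnxruntime.so
  ln -sf "$full" "$dir/$major"
  ln -sf "$major" "$dir/$base"
}
```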

Looks like NVIDIA and AMD are tested.

@TurboGit
Member

TurboGit commented Apr 3, 2026

I can add symlink creation to the script too for consistency if you prefer.

No, not needed; I just wanted to be sure this was expected and not the cause of another issue.

@andriiryzhkov force-pushed the ort_scripts branch 2 times, most recently from 3d4266e to 4544aee, April 8, 2026 11:44
@andriiryzhkov
Contributor Author

Fixed cuDNN detection in the UI installation on Ubuntu and Debian.

@da-phil
Contributor

da-phil commented Apr 14, 2026

Some feedback after I tried to install the ONNX Runtime for ROCm 7.2 / MIGraphX:

❯ ./install-ort-gpu.sh --vendor amd --manifest ort_gpu.json
Vendor override: amd (skipping GPU detection)

ONNX Runtime 1.23.2 - GPU acceleration installer
============================================================

GPU: AMD GPU
ORT version: 1.23.2
Download size: ~300 MB
Install to: /home/phil/.local/lib/onnxruntime-migraphx
Requirements: ROCm 7.2, MIGraphX

Continue? [y/N] y

Downloading...
/tmp/tmp.V9P1P4uMDj/ort-package              100%[===========================================================================================>]  19,55M  10,7MB/s    in 1,8s    
Verifying checksum...
Checksum OK.
Extracting...

Done. Installed to: /home/phil/.local/lib/onnxruntime-migraphx
-rwxr-xr-x 1 phil phil 560K Apr 14 12:35 /home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime_providers_migraphx.so
-rwxr-xr-x 1 phil phil  16K Apr 14 12:35 /home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime_providers_shared.so
-rwxr-xr-x 1 phil phil  27M Apr 14 12:35 /home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2

To enable in darktable:

  1. Open darktable preferences (Ctrl+,)
  2. Go to the AI tab
  3. Click 'detect' to find the installed library automatically,
     or set 'ONNX Runtime library' to:
     /home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2
  4. Restart darktable

Or via command line:

  DT_ORT_LIBRARY=/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2 darktable

Now this is what I get when I start darktable with AI debug messages enabled and click on the restore module:

❯ DT_ORT_LIBRARY=/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2 darktable -d ai
darktable 5.5.0+965~g44a4e93598-dirty
Copyright (C) 2012-2026 Johannes Hanika and other contributors.

Compile options:
  Bit depth              -> 64 bit
  Exiv2                  -> 0.27.6
  Lensfun                -> 0.3.4
  Debug                  -> DISABLED
  SSE2 optimizations     -> ENABLED
  OpenMP                 -> ENABLED
  OpenCL                 -> ENABLED
  Lua                    -> ENABLED  - API version 9.6.0
  Colord                 -> DISABLED
  gPhoto2                -> ENABLED  - Camera tethering is available
  OSMGpsMap              -> DISABLED - Map view is NOT available
  GMIC                   -> ENABLED  - Compressed LUTs are supported
  GraphicsMagick         -> ENABLED
  ImageMagick            -> DISABLED
  libavif                -> ENABLED
  libheif                -> ENABLED
  libjxl                 -> ENABLED
  LibRaw                 -> ENABLED  - Version 0.22.0-Release
  OpenJPEG               -> ENABLED
  OpenEXR                -> ENABLED
  WebP                   -> ENABLED
  AI                     -> ENABLED

See https://www.darktable.org/resources/ for detailed documentation.
See https://github.com/darktable-org/darktable/issues/new/choose to report bugs.

[dt starting] as : darktable -d ai
     0,3188 [ai_models] initialized: models_dir=/home/phil/.local/share/darktable/models, cache_dir=/home/phil/.cache/darktable/ai_downloads
     0,3191 [ai_models] using repository: darktable-org/darktable-ai
     0,3191 [ai_models] registered model: mask sam2.1 hiera small (mask-object-sam21-small)
     0,3191 [ai_models] registered model: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq)
     0,3191 [ai_models] registered model: denoise nind (denoise-nind)
     0,3191 [ai_models] registered model: upscale bsrgan (upscale-bsrgan)
     0,3191 [ai_models] registry loaded: 4 models from /opt/darktable/share/darktable/ai_models.json
     2.8004 [darktable_ai] dt_ai_env_init start.
     2.8005 [darktable_ai] discovered: upscale bsrgan (upscale-bsrgan, backend=onnx)
     2.8005 [darktable_ai] discovered: mask sam2.1 hiera small (mask-object-sam21-small, backend=onnx)
     2.8005 [darktable_ai] discovered: denoise nind (denoise-nind, backend=onnx)
     2.8006 [darktable_ai] discovered: mask segnext vitb-sax2 hq (mask-object-segnext-b2hq, backend=onnx)
    20.7755 [neural_restore] preview: exported 9568x6376, scale=1, export_size=0
    20.7986 [darktable_ai] loaded ORT 1.23.2 from '/home/phil/.local/lib/onnxruntime-migraphx/libonnxruntime.so.1.23.2'
The requested API version [24] is not available, only API versions [1, 23] are supported in this build. Current ORT Version is: 1.23.2
    20.7987 [darktable_ai] ORT 1.23.2: using API version 23 (compiled for 24)
    20.7987 [darktable_ai] execution provider: MIGraphX
    20.7993 [darktable_ai] MIOpen cache: /home/phil/.cache/darktable/ai/amd/miopen
    20.7993 [darktable_ai] MIGraphX cache: /home/phil/.cache/darktable/ai/amd/migraphx
    20.8158 [darktable_ai] loading: /home/phil/.local/share/darktable/models/denoise-nind/model.onnx
    20.8160 [darktable_ai] attempting to enable AMD MIGraphX...
    20.9144 [darktable_ai] AMD MIGraphX enabled successfully.
2026-04-14 12:50:36.136523057 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:167 MIGraphXExecutionProvider] [MIGraphX EP] MIGraphX ENV Override Variables Set:
2026-04-14 12:50:36.431876117 [W:onnxruntime:DarktableAI, migraphx_execution_provider.cc:1309 compile_program] Model Compile: Begin

It just gets stuck with Model Compile: Begin, while my GPU gets very busy and hot.

I also attached the output of rocminfo: rocminfo.txt.
And here are all installed rocm/migraph pkgs: rocm_migraph_pkgs.txt

I assigned 10 GB of pageable RAM to the iGPU, which should be sufficient, shouldn't it?

❯ amd-ttm
💻 Current TTM pages limit: 2621440 pages (10.00 GB)
💻 Total system memory: 30.64 GB

It is certainly not hitting any memory bottleneck as per the amd-smi overview:

❯ amd-smi
+------------------------------------------------------------------------------+
| AMD-SMI          26.4.0+478a7a43c6                                           |
| OS kernel Version:  6.18.20-061820-generic                                      |
| ROCm Version:    7.13.0                                                      |
| VBIOS Version:   00077464                                                    |
| Platform:        Linux Baremetal                                             |
|-------------------------------------+----------------------------------------|
| BDF                        GPU-Name | Mem-Uti   Temp   UEC       Power-Usage |
| GPU  HIP-ID  OAM-ID  Partition-Mode | GFX-Uti    Fan               Mem-Usage |
|=====================================+========================================|
| 0000:65:00.0 ...adeon 780M Graphics | N/A        N/A   0                 N/A |
|   0       0     N/A             N/A | N/A        N/A           3128/10240 MB |
+-------------------------------------+----------------------------------------+
+------------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU      PID  Process Name       GTT_MEM  VRAM_MEM  MEM_USAGE  CU %  SDMA   |
|==============================================================================|
|    0   646843  darktable           1.3 GB    5.5 MB     1.3 GB    N/A   0 us |
+------------------------------------------------------------------------------+

Maybe this is just a very specific AMD / iGPU issue, as my laptop runs an AMD Ryzen 7 8845HS with a Radeon 780M iGPU.

@andriiryzhkov
Contributor Author

@da-phil :

It just gets stuck with Model Compile: Begin, while my GPU gets very busy and hot.

You are not stuck, let it finish compiling. It may take long, veeerrrryyyyy long. But it happens only once per model and per tile size. I tested on my Radeon 780M iGPU and it took 30 minutes. The compilation is cached on disk, so subsequent runs with the same tile size were very fast, around 10 seconds or so. Much faster than the CPU.

iGPUs generally are not supported by ROCm and don't have pre-built kernels. That's why I think it takes so long. But compilation happens even on properly supported GPUs.

@da-phil
Contributor

da-phil commented Apr 14, 2026


Okay, this time I was patient enough and voilà, it worked!
Processing a 9568x6376 image on the iGPU was significantly faster than on the CPU, by a factor of 3.2x (55.29 s vs 178.33 s)!

On CPU:

   224.1143 [neural_restore] job started: task=denoise, scale=1, images=1
   224.1163 [neural_restore] processing imgid 63525 -> /home/phil/Pictures/some_image.tif
   229.3276 [neural_restore] processing 9568x6376 -> 9568x6376 (scale=1)
   229.3280 [restore] tiling 9568x6376 (scale=1) -> 9568x6376, 7x5 grid (35 tiles, T=1536)
   402.4445 [neural_restore] imported imgid=63529: /home/phil/Pictures/some_image.tif

On GPU:

  1980.9891 [neural_restore] job started: task=denoise, scale=1, images=1
  1980.9913 [neural_restore] processing imgid 63525 -> /home/phil/Pictures/some_image.tif
  1984.5201 [neural_restore] processing 9568x6376 -> 9568x6376 (scale=1)
  1984.5211 [restore] tiling 9568x6376 (scale=1) -> 9568x6376, 7x5 grid (35 tiles, T=1536)
  2036.2812 [neural_restore] imported imgid=63530: /home/phil/Pictures/some_image.tif

But now the problem: the GPU path leads to significant vertical stripe artifacts:

[image: denoised output showing vertical stripe artifacts]

Is this a known issue or is there anything I can do to debug it? The CPU path leads to artifact-free denoised images.

@andriiryzhkov
Contributor Author

But now the problem: the GPU path leads to significant vertical stripe artifacts

That is interesting. I will check, but I assume it will be hard to trace.
