@dhiltgen dhiltgen commented Jan 7, 2026

Proposed changes

This change partially addresses Windows build issues in MLX. The remaining errors are related to add_kernel_node, which may require a deeper change to keep MSVC+NVCC happy on Windows.

To repro, on a Windows system, install:

I've been building with:

$env:CUDNN_INCLUDE_PATH="C:\Program Files\NVIDIA\CUDNN\v9.15\include\13.0"
$env:CUDNN_LIBRARY_PATH="C:\Program Files\NVIDIA\CUDNN\v9.15\lib\13.0\x64"

cmake -B build -DMLX_BUILD_CUDA=on -DMLX_BUILD_GGUF=OFF -DBUILD_SHARED_LIBS=ON -DMLX_BUILD_PYTHON_STUBS=OFF -DMLX_BUILD_METAL=OFF  -DMLX_BUILD_TESTS=OFF .
cmake --build build --parallel 2>&1 | % ToString | Tee-Object build.log

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@awni awni requested a review from zcbenz January 10, 2026 14:52
@awni
Member

awni commented Jan 10, 2026

@zcbenz just FYI, @dhiltgen is part of the Ollama team; they are hoping to add support for Windows with the MLX CUDA backend. If I understand correctly, there is still one function that doesn't compile with MSVC / nvcc on Windows (add_kernel_node). It would be great if you could help take this over the finish line when you are back.

@zcbenz
Collaborator

zcbenz commented Jan 18, 2026

@dhiltgen Do you have recommendations for Windows laptops for CUDA development? I usually just do Windows dev in a virtual machine, but I guess I have to get Apple to buy me a gaming laptop now.

This change partially addresses Windows build issues in MLX. The remaining errors are
related to `add_kernel_node`, which may require a deeper change to keep MSVC+NVCC
happy on Windows.

This commit completes the Windows port by resolving all MSVC+NVCC
compatibility issues with CUDA kernels. Key changes include:

- Add MLX_EXPORT macros for DLL symbol visibility on Windows
- Refactor kernel instantiation to work around NVCC template limitations
- Add explicit kernel instantiations for all type combinations
- Fix cuFFT integration for Windows builds
- Update CMake configuration for Windows-specific build requirements
- Add GPU test infrastructure for Windows validation
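
For readers unfamiliar with DLL symbol visibility, the first bullet can be illustrated with a minimal sketch. Only the name MLX_EXPORT comes from the commit message; the definition below, the MLX_BUILDING_DLL flag, and the add_one function are assumptions made for illustration and may not match what the PR actually does.

```cpp
#include <cassert>

// Assumed flag: a build system would normally define this only while
// compiling the library itself, not for code that consumes the DLL.
#define MLX_BUILDING_DLL

// Hypothetical definition of an export macro. On Windows, symbols are
// hidden by default and must be marked dllexport when building the DLL
// and dllimport when consuming it. Elsewhere, default visibility is used.
#if defined(_WIN32)
#  if defined(MLX_BUILDING_DLL)
#    define MLX_EXPORT __declspec(dllexport)
#  else
#    define MLX_EXPORT __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
#  define MLX_EXPORT __attribute__((visibility("default")))
#else
#  define MLX_EXPORT
#endif

// Illustrative function that would be visible to users of the shared library.
MLX_EXPORT int add_one(int x) {
  return x + 1;
}
```

Without such an annotation, a function compiled into a Windows DLL is generally not visible to code linking against it unless it is listed in a module-definition file, which is why shared-library builds (BUILD_SHARED_LIBS=ON) tend to need it.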
@dhiltgen
Author

@zcbenz a PC might be a better choice so you can get a GPU with a bit more VRAM.

I've also added a second commit that does a deeper pass at getting things working. With that commit, I'm now able to get the tests passing on CUDA, and mlx-lm successfully ran a model on the GPU.

@zcbenz
Collaborator

zcbenz commented Jan 20, 2026

Thanks for the update, it is really exciting that you have got the CUDA backend running on Windows!

Can you separate some of the changes into independent PRs, especially the MLX_EXPORT-related changes and the backend/cpu changes? Those are something we can merge soon, and they probably need some specialized discussion.

For the changes in backend/cuda, we want to keep things as they are if possible (i.e. have add_kernel_node deduce parameters automatically, and keep our nested dispatch_xxx utils). I understand that is probably impossible with MSVC, but I do need to give it a try myself, and it might take some time. If we have to change the way we dispatch kernels, we would need to discuss it first. But anyway, having a working version makes things much easier for us!
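
As a rough illustration of what "nested dispatch" means here, the sketch below maps two runtime type tags to compile-time types through nested generic lambdas. It is plain C++17 with no CUDA, and every name in it (Dtype, type_tag, dispatch_dtype, bytes_per_pair) is hypothetical; it is not MLX's actual dispatch_xxx code, only the general shape of the pattern that compilers can disagree on.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical runtime type tag, standing in for a real dtype enum.
enum class Dtype { Float32, Int32 };

// Hypothetical tag that carries a C++ type into a generic lambda.
template <typename T>
struct type_tag {
  using type = T;
};

// Hypothetical stand-in for a dispatch utility: picks a compile-time
// type from a runtime value and calls f with the matching tag.
template <typename F>
auto dispatch_dtype(Dtype d, F&& f) {
  switch (d) {
    case Dtype::Float32:
      return f(type_tag<float>{});
    default:
      return f(type_tag<int>{});
  }
}

// Nested dispatch: both the input and output types are resolved at
// runtime, and the innermost lambda sees both as compile-time types.
inline std::size_t bytes_per_pair(Dtype in, Dtype out) {
  return dispatch_dtype(in, [&](auto in_tag) {
    using In = typename decltype(in_tag)::type;
    return dispatch_dtype(out, [&](auto out_tag) {
      using Out = typename decltype(out_tag)::type;
      return sizeof(In) + sizeof(Out);
    });
  });
}
```

In a real backend the innermost lambda would launch a kernel templated on In and Out rather than return a size. The alternative mentioned in the second commit, explicit instantiations for every type combination, trades this compactness for code that is easier on the compiler.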

@zcbenz zcbenz mentioned this pull request Jan 20, 2026
@dhiltgen
Author

dhiltgen commented Jan 20, 2026

I'll tease out the non-CUDA-related changes and rebase on your CI PR changes.

I was trying to find a simpler approach to get the kernels registering properly, but was struggling to keep MSVC happy. Hopefully you can find a more elegant solution.

@dhiltgen
Author

I'll rebase this PR on #3024 for consistency; however, if you come up with a cleaner solution to the CUDA kernel registration, we can ultimately close this PR.

@zcbenz
Collaborator

zcbenz commented Jan 21, 2026

Many thanks for maintaining a working branch 🙏! I'm in the process of getting access to Windows hardware, so it will take some time before I can look into this.

@dhiltgen dhiltgen changed the title from "WIP Revive Windows support" to "WIP Windows CUDA support" Jan 21, 2026