Guard float64 atomic add for CUDA with GOOGLE_CUDA macro#3179

Merged
i-chaochen merged 1 commit into r2.18-rocm-enhanced from r2.18-gaurd_atomics_with_cuda_macro on Mar 6, 2026
Conversation


@hsharsha hsharsha commented Mar 3, 2026

Motivation

Put CUDA-specific code under the GOOGLE_CUDA macro.
Solves the slowness reported in https://amd-hub.atlassian.net/browse/ROCM-3072

Submission Checklist

@hsharsha hsharsha force-pushed the r2.18-gaurd_atomics_with_cuda_macro branch from 9c6d8f0 to 34351a8 on March 4, 2026 11:00
@i-chaochen i-chaochen merged commit e52cdda into r2.18-rocm-enhanced Mar 6, 2026
5 of 7 checks passed