Fix CuPy uint8 overflow and CUDA cubic NaN fallback #1064

Merged
brendancol merged 1 commit into master from fix/cupy-cubic-overflow-and-nan-fallback
Mar 24, 2026

@brendancol
Contributor

Summary

Closes the two remaining CuPy/CUDA gaps from #1054:

  • CuPy resampling paths (_resample_cupy_native and _resample_cupy) now clip integer results before casting, preventing silent uint8 overflow wrapping from cubic ringing
  • CUDA cubic kernel (_resample_cubic_cuda) falls back to bilinear with weight renormalization when NaN neighbors are present, matching the CPU Numba JIT behavior
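The clip-before-cast step in the first bullet can be sketched as follows. This is a minimal illustration and not the code from the PR: `clip_and_cast` is a hypothetical helper name, and NumPy stands in for CuPy here (the same `clip`, `iinfo`, and `astype` calls exist in CuPy) so the snippet runs without a GPU.

```python
import numpy as np


def clip_and_cast(result, dtype):
    """Clip to the target integer range, then cast (hypothetical helper).

    Assumption: `result` holds floating-point resampling output and
    `dtype` is the dtype of the original raster.
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        # Without this clip, cubic overshoot such as 260 would not fit
        # in uint8 and could wrap around on the cast.
        result = np.clip(result, info.min, info.max)
    # Floating-point targets are cast without clipping.
    return result.astype(dtype)


# Cubic ringing can push values slightly outside the valid range.
resampled = np.array([-12.0, 100.0, 260.0])
clipped = clip_and_cast(resampled, np.uint8)
```

With the clip in place, the overshoot saturates at the dtype limits (0 and 255 for uint8) instead of wrapping.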

Test plan

  • 5 new tests verify uint8 clipping across nearest/bilinear/cubic for both CuPy native and map_coordinates paths
  • 2 new tests verify CUDA cubic NaN fallback produces valid values and matches CPU output
  • Full test_reproject.py suite passes (79/79)

Two remaining gaps from issue #1054:

1. CuPy resampling paths lacked integer clipping, so cubic ringing on
   uint8 data could silently wrap (e.g. 260 -> 4). Both _resample_cupy_native
   and _resample_cupy now clip results before casting, matching the NumPy path.

2. The CUDA cubic kernel wrote nodata when any of the 16 neighbors was NaN.
   It now falls back to bilinear with weight renormalization, matching the
   CPU Numba JIT behavior.
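
For readers unfamiliar with weight renormalization, the fallback in item 2 can be sketched per pixel as below. This is an illustrative pure-Python version under stated assumptions, not the CUDA kernel from the PR: the function name `bilinear_renormalized`, its argument layout, and the choice to return NaN when all four neighbors are invalid are all hypothetical.

```python
import math


def bilinear_renormalized(v00, v01, v10, v11, fx, fy):
    """Bilinear blend of four neighbors that skips NaNs (hypothetical sketch).

    v00..v11 are the 2x2 neighbors (row, column order); fx and fy are the
    fractional offsets in [0, 1] from v00 toward the next column and row.
    """
    weights = (
        (1.0 - fx) * (1.0 - fy),  # weight of v00
        fx * (1.0 - fy),          # weight of v01
        (1.0 - fx) * fy,          # weight of v10
        fx * fy,                  # weight of v11
    )
    total = 0.0
    weight_sum = 0.0
    for value, weight in zip((v00, v01, v10, v11), weights):
        if not math.isnan(value):
            total += weight * value
            weight_sum += weight
    # Dividing by the sum of the weights actually used is the
    # renormalization step: valid neighbors share the full weight.
    if weight_sum > 0.0:
        return total / weight_sum
    return math.nan  # assumption: no valid neighbor means no valid output


full = bilinear_renormalized(1.0, 2.0, 3.0, 4.0, 0.5, 0.5)
partial = bilinear_renormalized(1.0, 2.0, 3.0, math.nan, 0.5, 0.5)
```

Here `full` is the ordinary bilinear average of the four values, while `partial` averages only the three valid neighbors rather than propagating the NaN.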

github-actions bot added the performance label (PR touches performance-sensitive code) on Mar 23, 2026
brendancol merged commit a4b7507 into master on Mar 24, 2026
11 checks passed