Enable correct RoiAlign max mode with env var #27554
Draft
Comment on lines
+10
to
+18
| #define ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | ||
| ONNX_OPERATOR_VERSIONED_TYPED_KERNEL_EX( \ | ||
| RoiAlign, \ | ||
| kOnnxDomain, \ | ||
| 10, \ | ||
| 15, \ | ||
| T, \ | ||
| kCudaExecutionProvider, \ | ||
| (*KernelDefBuilder::Create()) \ |
Contributor
Suggested change
| #define ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | |
| ONNX_OPERATOR_VERSIONED_TYPED_KERNEL_EX( \ | |
| RoiAlign, \ | |
| kOnnxDomain, \ | |
| 10, \ | |
| 15, \ | |
| T, \ | |
| kCudaExecutionProvider, \ | |
| (*KernelDefBuilder::Create()) \ | |
| #define ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | |
| ONNX_OPERATOR_VERSIONED_TYPED_KERNEL_EX( \ | |
| RoiAlign, \ | |
| kOnnxDomain, \ | |
| 10, \ | |
| 15, \ | |
| T, \ | |
| kCudaExecutionProvider, \ | |
| (*KernelDefBuilder::Create()) \ |
Comment on lines
+23
to
+30
| #define ADD_TYPED_ROIALIGN_OP(T) \ | ||
| ONNX_OPERATOR_TYPED_KERNEL_EX( \ | ||
| RoiAlign, \ | ||
| kOnnxDomain, \ | ||
| 16, \ | ||
| T, \ | ||
| kCudaExecutionProvider, \ | ||
| (*KernelDefBuilder::Create()) \ |
Contributor
Suggested change
| #define ADD_TYPED_ROIALIGN_OP(T) \ | |
| ONNX_OPERATOR_TYPED_KERNEL_EX( \ | |
| RoiAlign, \ | |
| kOnnxDomain, \ | |
| 16, \ | |
| T, \ | |
| kCudaExecutionProvider, \ | |
| (*KernelDefBuilder::Create()) \ | |
| #define ADD_TYPED_ROIALIGN_OP(T) \ | |
| ONNX_OPERATOR_TYPED_KERNEL_EX( \ | |
| RoiAlign, \ | |
| kOnnxDomain, \ | |
| 16, \ | |
| T, \ | |
| kCudaExecutionProvider, \ | |
| (*KernelDefBuilder::Create()) \ |
Comment on lines
83
to
+85
| #define SPECIALIZED_COMPUTE(T) \ | ||
| REGISTER_KERNEL_TYPED(T) \ | ||
| ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | ||
| ADD_TYPED_ROIALIGN_OP(T) \ |
Contributor
Suggested change
| #define SPECIALIZED_COMPUTE(T) \ | |
| REGISTER_KERNEL_TYPED(T) \ | |
| ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | |
| ADD_TYPED_ROIALIGN_OP(T) \ | |
| #define SPECIALIZED_COMPUTE(T) \ | |
| ADD_VERSIONED_TYPED_ROIALIGN_OP(T) \ | |
| ADD_TYPED_ROIALIGN_OP(T) \ |
| T* top_data, \ | ||
| const bool is_mode_avg, \ | ||
| const bool use_max_bilinear_interp, \ | ||
| const bool half_pixel, \ |
Contributor
Suggested change
| const bool half_pixel, \ | |
| const bool half_pixel, \ |
Description
Re-implements the changes from PR #7354 with additional improvements. Fixes #6921 and #6146.
Problem
The RoiAlign operator's original max mode implementation computed max(w1*d1, w2*d2, w3*d3, w4*d4), that is, the maximum of the four individually weighted pixel values at each sample point. In principle this matches the ONNX reference implementation, but an alternative approach is arguably more correct for applications: perform bilinear interpolation first (w1*d1 + w2*d2 + w3*d3 + w4*d4), then take the max across sample points.
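To make the difference between the two max modes concrete, here is a minimal, standalone C++ sketch. It is not the ONNX Runtime kernel code: `SamplePoint`, `MaxOfWeightedPixels`, `BilinearValue`, and `PoolMax` are hypothetical names introduced only for illustration, and the weights and pixel values are assumed to be precomputed for each sample point in a bin.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Hypothetical helper type (not from ONNX Runtime): the four bilinear
// weights and the four neighboring pixel values for one sample point.
struct SamplePoint {
  std::array<float, 4> w;  // bilinear weights w1..w4
  std::array<float, 4> d;  // neighboring pixel values d1..d4
};

// Original max mode: largest individually weighted pixel value,
// max(w1*d1, w2*d2, w3*d3, w4*d4).
float MaxOfWeightedPixels(const SamplePoint& p) {
  float best = p.w[0] * p.d[0];
  for (int i = 1; i < 4; ++i) best = std::max(best, p.w[i] * p.d[i]);
  return best;
}

// Bilinear interpolation at the sample point,
// w1*d1 + w2*d2 + w3*d3 + w4*d4.
float BilinearValue(const SamplePoint& p) {
  float sum = 0.0f;
  for (int i = 0; i < 4; ++i) sum += p.w[i] * p.d[i];
  return sum;
}

// Max-pools one output bin over its sample points. The flag selects
// which per-sample value is compared across sample points.
float PoolMax(const std::vector<SamplePoint>& samples,
              bool use_max_bilinear_interp) {
  assert(!samples.empty());  // this sketch assumes at least one sample
  auto value = [&](const SamplePoint& p) {
    return use_max_bilinear_interp ? BilinearValue(p)
                                   : MaxOfWeightedPixels(p);
  };
  float best = value(samples[0]);
  for (std::size_t i = 1; i < samples.size(); ++i) {
    best = std::max(best, value(samples[i]));
  }
  return best;
}
```

With equal weights of 0.25 and pixel values 4, 8, 12, 16, the weighted-pixel mode yields 4 for that sample point while the interpolated value is 10, so the two modes can disagree substantially even on a single sample.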
Additionally:
max(roi_width, 1)) for non-half_pixel mode was present in both CPU and CUDA providers and incorrectly rejected zero-size ROIs that should produce valid point sampling.
roi_bin_grid_h/w values were not clamped to a minimum of 1, which could cause issues with certain ROI configurations.
Solution
ORT_ROIALIGN_MAX_USE_BILINEAR_INTERPOLATION=1: Enables bilinear interpolation max mode on the CPU Execution Provider, matching PyTorch/Detectron2 behavior.
Changes
use_max_bilinear_interp_ member, env var parsing via ParseEnvironmentVariableWithDefault, removed outdated warning
roi_bin_grid min-1 clamping, added conditional max mode (ONNX spec vs bilinear interp)
roi_bin_grid min-1 clamping, and fully implemented the ORT_ROIALIGN_MAX_USE_BILINEAR_INTERPOLATION toggle down to the interpolation device function.
MaxModePositive expected values for ONNX spec, added test_roialign_mode_max (from ONNX test suite), added scoped env var tests for both modes, all operational on CPU and CUDA.
Test Results
All 13 RoiAlign tests pass: