Commit 07c1e73
Fix cuDNN convolution precision on Ampere+ GPUs (#3127)
On Ampere and later GPUs (SM 8.0+), cuDNN's default math mode permits
TF32 Tensor Core operations, which keep only 10 mantissa bits instead of
FP32's 23. This causes numerical differences between CUDA and CPU
convolution results, particularly in cudnnConvolutionBackwardFilter().
Explicitly set CUDNN_FMA_MATH to force true FP32 computation for
consistent numerical results across all GPU architectures.

1 parent 60adc65 commit 07c1e73
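
To illustrate why TF32 inputs can make GPU results drift from a CPU reference, here is a minimal CPU-only sketch that emulates TF32 input rounding and compares a dot product with and without it. The helpers `round_to_tf32` and `dot` are hypothetical names introduced for this illustration and are not part of the commit; the rounding rule (round to nearest, ties to even, on the 13 dropped mantissa bits) is an assumption about hardware behaviour and only handles finite inputs. In cuDNN itself, the math mode of a convolution descriptor is normally selected with `cudnnSetConvolutionMathType()`, which is presumably where `CUDNN_FMA_MATH` is applied.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Emulate TF32 input rounding (illustrative assumption, finite inputs only):
// keep the sign, the 8 exponent bits, and the top 10 of FP32's 23 mantissa
// bits, rounding to nearest with ties to even.
float round_to_tf32(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const std::uint32_t lsb = (bits >> 13) & 1u;  // lowest bit that survives
    bits += 0x0FFFu + lsb;                        // round on the 13 dropped bits
    bits &= 0xFFFFE000u;                          // clear the dropped bits
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// Dot product with FP32 accumulation. When tf32 is true, both inputs are
// rounded to TF32 precision before each multiply, loosely mimicking what a
// TF32 Tensor Core path does to its operands.
float dot(const std::vector<float>& a, const std::vector<float>& b, bool tf32) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float x = tf32 ? round_to_tf32(a[i]) : a[i];
        const float y = tf32 ? round_to_tf32(b[i]) : b[i];
        acc += x * y;
    }
    return acc;
}
```

For example, with single-element inputs `{1.0001f}` and `{1.0001f}`, the TF32 path rounds both operands to exactly `1.0f` and returns `1.0f`, while the plain FP32 path returns a value slightly above `1.0f`. Differences of this size are enough to fail tight CPU-versus-GPU tolerance checks, which is the situation the commit message describes.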
1 file changed
Lines changed: 9 additions & 0 deletions
Diff hunk: original lines 1044-1049 become new lines 1044-1058; new lines 1047-1055 are added.