Commit 3f0bfd3
Revert skip-softmax threshold formula change: restore * sm_scale
The * sm_scale factor is intentional: it scales the tile-skip threshold
relative to head dimension, so larger head_dim (smaller sm_scale) produces
more aggressive sparsity for the same lambda value. The previous 'fix' was
incorrect.
Signed-off-by: Ye Yu <yeyu@nvidia.com>
1 parent 3ed4ba8 commit 3f0bfd3
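The kernel code itself is not reproduced on this page, so the following is only a minimal sketch of the scaling argument in the commit message. It assumes a skip rule of the form "skip a tile when its maximum scaled score falls below the running row maximum by more than `log(lambda) * sm_scale`", with `sm_scale = 1 / sqrt(head_dim)`. The function names `skip_threshold` and `tile_is_skipped`, and the exact form of the rule, are hypothetical and not taken from the changed file.

```python
import math


def skip_threshold(lam: float, head_dim: int) -> float:
    """Hypothetical log-space skip threshold: log(lam) * sm_scale.

    Assumes sm_scale = 1 / sqrt(head_dim), the usual attention scaling.
    For 0 < lam < 1 the result is negative; a smaller sm_scale moves it
    closer to zero.
    """
    sm_scale = 1.0 / math.sqrt(head_dim)
    return math.log(lam) * sm_scale


def tile_is_skipped(tile_max: float, running_max: float,
                    lam: float, head_dim: int) -> bool:
    """Assumed rule: skip the tile when its max score trails the running
    row max by more than the (negative) threshold."""
    return (tile_max - running_max) < skip_threshold(lam, head_dim)


if __name__ == "__main__":
    lam = 1e-3
    for head_dim in (64, 256):
        thr = skip_threshold(lam, head_dim)
        skipped = tile_is_skipped(-0.6, 0.0, lam, head_dim)
        print(f"head_dim={head_dim}: threshold={thr:.4f}, skipped={skipped}")
```

Under these assumptions, the same lambda gives a threshold nearer to zero for the larger head dimension, so a tile whose maximum trails the row maximum by 0.6 is kept at `head_dim=64` but skipped at `head_dim=256`. This matches the "more aggressive sparsity" behavior the message describes.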
1 file changed
Lines changed: 4 additions & 16 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1003-1005 | 1003-1005 | unchanged |
| 1006-1020 | | - (15 lines removed) |
| | 1006-1008 | + (3 lines added) |
| 1021-1027 | 1009-1015 | unchanged |
| 1028 | | - (1 line removed) |
| | 1016 | + (1 line added) |
| 1029-1031 | 1017-1019 | unchanged |