Pull requests: NVIDIA/TransformerEngine

Pull requests list

docs: fix comm GEMM overlap README typos
#3010 opened May 18, 2026 by LeSingh1
Bitmap topk
#3009 opened May 18, 2026 by tdophung Collaborator Draft
13 tasks
Add GitHub actions to automatically mark community contributions
#3007 opened May 18, 2026 by ptrendx Member
1 of 6 tasks
Generalized Tensor Parallelism (GTP) org-contribution
#3005 opened May 18, 2026 by fanshiqing Member
6 of 13 tasks
Add optional core lib features to wheel build
#3004 opened May 17, 2026 by ksivaman Member Draft
6 of 13 tasks
Update cudnn-frontend to 1.23.0
#3003 opened May 17, 2026 by ksivaman Member
7 of 13 tasks
Add license to framework sdist builds 2.16.0
#3002 opened May 17, 2026 by ksivaman Member
6 of 13 tasks
Optimize function that loads pointers on GPU cpu_overhead refactor
#3001 opened May 16, 2026 by timmoon10 Collaborator
8 of 14 tasks
TritonKernelCall: CUDA graph compatibility
#3000 opened May 15, 2026 by tdophung Collaborator
6 of 13 tasks
Plumb FP8+THD
#2994 opened May 14, 2026 by sudhakarsingh27 Collaborator
13 tasks
CP Tests batching using subprocess worker pool
#2993 opened May 14, 2026 by sudhakarsingh27 Collaborator
8 of 9 tasks
refactor(distributed): deduplicate TE module class lookups with caching
#2992 opened May 14, 2026 by muutot Contributor
3 of 13 tasks
Improve TE Group MLP CPU Overhead cpu_overhead
#2991 opened May 14, 2026 by zhongbozhu Collaborator
13 tasks
Add codex/agents to .gitignore
#2990 opened May 14, 2026 by yaox12 Member
13 tasks
[JAX] Support for cuDNN-backed flex attention 2.16.0
#2985 opened May 13, 2026 by vcherepanov-nv Collaborator
4 of 13 tasks
[PyTorch] Support for cuDNN-backed flex attention 2.16.0
#2984 opened May 13, 2026 by vcherepanov-nv Collaborator
4 of 13 tasks
GGEMM+srelu kernels for MxFP8 Nemotron
#2981 opened May 12, 2026 by sraman-rgb
8 of 13 tasks
[Common, PyTorch] Improve mHC to match DeepSeek's implementation
#2978 opened May 12, 2026 by kainzhong Collaborator
9 of 13 tasks
[JAX] Improve JAX tutorial documentation 2.16.0
#2976 opened May 11, 2026 by jberchtold-nvidia Collaborator
8 of 13 tasks
[Pytorch][Bug] DCP Checkpoint Loading Fixes for FSDP2 with QuantizedModelInit 2.16.0 bug
#2974 opened May 11, 2026 by vthumbe1503 Collaborator
13 tasks
Implement 4over6 NVFP4 recipe community-contribution fp4
#2972 opened May 9, 2026 by zianglih Contributor
8 of 13 tasks
[common] Grouped gemm update - nvfp4 for blackwell and fp8 blockwise hopper 2.16.0
#2971 opened May 8, 2026 by pggPL Collaborator
9 of 13 tasks