Add explicit CUDA graph construction API (GraphDef, GraphNode)#1772
Andy-Jost wants to merge 1 commit into NVIDIA:main
/ok to test
cpcloud left a comment
The explicit graph model looks promising, but I found a few correctness/API issues that make me uncomfortable approving this as-is. The biggest ones are entry-node identity, GraphDef() failing before CUDA init in a fresh process, and the ctypes host-callback lifetime. I left inline comments with concrete fixes.
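The ctypes host-callback lifetime concern can be illustrated with a minimal, CUDA-free sketch. It relies only on a documented ctypes rule: a `CFUNCTYPE` object must stay referenced for as long as native code may call it. The `CallbackKeeper` class, the `HOST_FN` signature, and the `on_host` function are hypothetical names for illustration, not the PR's implementation.

```python
import ctypes
import gc

# Assumed C signature for illustration: void host_fn(void *user_data)
HOST_FN = ctypes.CFUNCTYPE(None, ctypes.c_void_p)


class CallbackKeeper:
    """Hypothetical owner that keeps ctypes thunks alive (not the PR's code)."""

    def __init__(self):
        self._alive = []  # strong references to every registered thunk

    def register(self, py_func):
        thunk = HOST_FN(py_func)
        # Without this reference the thunk could be garbage collected while
        # native code still holds its address, which leads to a crash.
        self._alive.append(thunk)
        # Raw function address, as it would be handed to a C API.
        return ctypes.cast(thunk, ctypes.c_void_p).value


calls = []


def on_host(user_data):
    calls.append(user_data)


keeper = CallbackKeeper()
address = keeper.register(on_host)
gc.collect()  # the thunk survives because the keeper still references it

# Rebuild a callable from the raw address, standing in for the native caller.
HOST_FN(address)(7)
```

In a graph setting, the object that owns the host node would play the role of the keeper, so the callback lives at least as long as anything that can still launch it.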
Introduces GraphDef and GraphNode types for explicit CUDA graph construction, with a full node hierarchy, shared instantiation helper with GraphCompleteOptions support, and comprehensive tests. Made-with: Cursor
@cpcloud The latest upload fixes the issues you pointed out, except the CUDA init problem, which needs clarification.
@Andy-Jost Since your PR is not a draft you shouldn't need ok-to-test anymore.
cpcloud left a comment
Re-reviewed the latest head. The earlier GraphNode identity and callback lifetime issues are fixed, and I do not have any remaining blocking review items.
Summary
- GraphDef and GraphNode types for explicit CUDA graph construction, complementing the existing stream-capture path via GraphBuilder.
- GraphBuilder.complete() and GraphDef.instantiate() support GraphCompleteOptions, and the pre-CUDA 12.0 cuGraphInstantiateWithFlags fallback is removed.

Depends on #1762 (merged).
Closes #1317.
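To make the contrast with stream capture concrete, here is a toy, pure-Python model of explicit graph construction: nodes are created with their dependencies stated up front, and a valid execution order is derived from the resulting DAG. `ToyGraphDef`, `ToyNode`, `add`, `then`, and `launch_order` are hypothetical names chosen for this sketch; they do not reproduce the GraphDef/GraphNode API added by this PR and do not touch CUDA.

```python
from graphlib import TopologicalSorter


class ToyNode:
    """Hypothetical node: a name plus the nodes it depends on."""

    def __init__(self, graph, name, deps):
        self.graph = graph
        self.name = name
        self.deps = tuple(deps)

    def then(self, name):
        # Convenience: add a successor that depends only on this node.
        return self.graph.add(name, deps=[self])


class ToyGraphDef:
    """Hypothetical explicit graph definition (illustration only)."""

    def __init__(self):
        self.nodes = []

    def add(self, name, deps=()):
        node = ToyNode(self, name, deps)
        self.nodes.append(node)
        return node

    def launch_order(self):
        # Map each node to its predecessors and sort topologically.
        sorter = TopologicalSorter(
            {n.name: [d.name for d in n.deps] for n in self.nodes}
        )
        return list(sorter.static_order())


# Diamond-shaped graph: one copy feeds two kernels, which join in a final copy.
g = ToyGraphDef()
copy_in = g.add("copy_in")
kernel_a = copy_in.then("kernel_a")
kernel_b = copy_in.then("kernel_b")
copy_out = g.add("copy_out", deps=[kernel_a, kernel_b])
order = g.launch_order()
```

The two kernels have no edge between them, so either may come first in `order`; only the stated dependencies are guaranteed.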
Changes
- _graphdef.pxd / _graphdef.pyx containing GraphDef, GraphNode, and all node subclasses
- _graph/__init__.py: Extracted _instantiate_graph() helper; updated exports for new types
- resource_handles.cpp/.hpp: Added graph and graph-node RAII handles, HandleRegistry for reverse handle lookup
- _resource_handles.pxd/.pyx: Cython declarations for graph handle types
- _memory/_buffer.pyx: Graph memory allocation support
- _utils/cuda_utils.pxd/.pyx: Utility additions for graph construction
- GraphCompleteOptions variants
Test plan
- test_explicit.py, test_explicit_errors.py, test_explicit_integration.py, test_explicit_lifetime.py
- test_object_protocols.py
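The "reverse handle lookup" idea can be sketched in pure Python. The PR's HandleRegistry lives in resource_handles.cpp/.hpp and its actual design is not shown here, so the following is only an analogy: a weak-valued map from a raw handle to its wrapper, so that wrapping the same handle twice yields the same object. `ToyHandleRegistry`, `ToyNodeWrapper`, and the integer handle values are assumptions made for illustration.

```python
import weakref


class ToyNodeWrapper:
    """Hypothetical Python-side wrapper around a raw native handle."""

    def __init__(self, handle):
        self.handle = handle


class ToyHandleRegistry:
    """Hypothetical reverse lookup: raw handle -> live wrapper object."""

    def __init__(self):
        # Weak values: the registry alone never keeps a wrapper alive.
        self._by_handle = weakref.WeakValueDictionary()

    def wrap(self, handle):
        wrapper = self._by_handle.get(handle)
        if wrapper is None:
            wrapper = ToyNodeWrapper(handle)
            self._by_handle[handle] = wrapper
        return wrapper


registry = ToyHandleRegistry()
first = registry.wrap(0x1000)   # made-up handle value
second = registry.wrap(0x1000)  # same handle, so the same wrapper is returned
other = registry.wrap(0x2000)   # different handle, so a different wrapper
```

Returning the existing wrapper is one way to keep identity checks and equality stable for a node reached through different paths, such as a graph's entry node.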