-
Notifications
You must be signed in to change notification settings - Fork 243
Cythonize _program.py #1565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Cythonize _program.py #1565
Conversation
|
/ok to test 85dbbb5 |
|
| mod_obj = ObjectCode.from_ptx(ptx, symbol_mapping=sym_map) | ||
| assert mod.code == ptx | ||
| if not Program._can_load_generated_ptx(): | ||
| if not Program.driver_can_load_nvrtc_ptx_output(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is worthy of discussion. I changed this private API into a public one after refactoring because it seems somewhat useful to an end user. An alternative would be to continue using the private API if the value here is too low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd definitely not make the API change in this PR, in part so that such a subtle change doesn't get drowned out by the many other changes here, but also to avoid entangled unintended side-effects from the disjoint changes (those tend to be hard to sort out, especially later, after this PR is merged already).
- Rename _program.py to _program.pyx - Convert Program to cdef class with _program.pxd declarations - Extract _MembersNeededForFinalize to module-level _ProgramMNFF (nested classes not allowed in cdef class) - Add __repr__ method to Program - Keep ProgramOptions as @DataClass (unchanged) - Keep weakref.finalize pattern for handle cleanup
- Move _translate_program_options to Program_translate_options (cdef) - Move _can_load_generated_ptx to Program_can_load_generated_ptx (cdef) - Remove unused TYPE_CHECKING import block - Follow _memory/_buffer.pyx helper function patterns
- Reorganize file structure per developer guide (principal class first) - Add module docstring, __all__, type alias section - Factor long methods into cdef inline helpers - Add proper exception specs to cdef functions - Fix docstrings (use :class: refs, public paths) - Add type annotations to public methods - Inline _nvvm_exception_manager (single use) - Remove Union import, use | syntax - Add public Program.driver_can_load_nvrtc_ptx_output() API - Update tests to use new public API Closes NVIDIA#1082
Add fixtures for different Program backends (NVRTC, PTX, NVVM) and ObjectCode code types (cubin, PTX, LTOIR). Split API_TYPES into more precise HASH_TYPES, EQ_TYPES, and WEAKREF_TYPES lists. Derive DICT_KEY_TYPES and WEAK_KEY_TYPES for collection tests.
- Add NvrtcProgramHandle and NvvmProgramHandle to resource handles module - Add function pointer initialization for nvrtcDestroyProgram and nvvmDestroyProgram - Forward-declare nvvmProgram to avoid nvvm.h dependency - Refactor detail::make_py to accept module name parameter - Remove _ProgramMNFF class from _program.pyx - Program now uses typed handles directly with RAII cleanup - Update handle property to return None when handle is null
- Add NVVMError exception class - Add HANDLE_RETURN_NVRTC for nogil NVRTC error handling with program log - Add HANDLE_RETURN_NVVM for nogil NVVM error handling with program log - Remove vestigial supported_error_type fused type - Simplify HANDLE_RETURN to directly take cydriver.CUresult
85dbbb5 to
44fd98e
Compare
|
/ok to test 44fd98e |
| // Forward declaration for NVVM - avoids nvvm.h dependency | ||
| // Use void* to match cuda.bindings.cynvvm's typedef for compatibility | ||
| #ifndef CYTHON_EXTERN_C | ||
| typedef void *nvvmProgram; | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since nvvm is an optional dependency, I opted to define nvvmProgram this way rather than include the header.
- Change cdef function return types from ObjectCode to object (Cython limitation) - Remove unused imports: intptr_t, NvrtcProgramHandle, NvvmProgramHandle, as_intptr - Update as_py(NvvmProgramHandle) to return Python int via PyLong_FromSsize_t - Update test assertions: remove handle checks after close(), test idempotency instead - Update NVVM error message regex to match new unified format
44fd98e to
0f06313
Compare
|
/ok to test 0f06313 |
Summary
Converts
_program.pyto Cython (_program.pyx) with full cythonization of NVRTC and NVVM paths, RAII-based resource management, andnogilerror handling.Changes
Cythonization
_program.py→_program.pyxwithcdef class Program_program.pxdwith typed attribute declarationsProgram_initcythonized with directcynvrtc/cynvvmcalls innogilblocksProgram_compile_nvrtcandProgram_compile_nvvmusenogilfor C API callscdef inlinehelpersResource Handles
NvrtcProgramHandleandNvvmProgramHandle(std::shared_ptr-based RAII)create_nvrtc_program_handle()andcreate_nvvm_program_handle()_MembersNeededForFinalize/ weakref.finalize patternas_py(NvvmProgramHandle)returns Pythonint(matches cuda_bindings design)Error Handling
HANDLE_RETURN_NVRTCfornogilNVRTC error handling with program logHANDLE_RETURN_NVVMfornogilNVVM error handling with program logNVVMErrorexception classHANDLE_RETURNto directly takecydriver.CUresultValidation & API
SUPPORTED_TARGETSmap for unified backend/target validationProgram.driver_can_load_nvrtc_ptx_output()APICode Quality
__all__, type alias section added:class:refs and public pathsUnionimport removed in favor of|syntaxstd::vector<const char*>instead ofmalloc/freeTests
test_object_protocols.pywithProgramandObjectCodevariationstest_program.py: testclose()idempotency instead of handle stateTest plan
test_program.pytests passtest_module.pytests passtest_object_protocols.pypasses with new variationspre-commitpassesCloses #1082