Conversation
Force-pushed from 878f4b4 to 27703d5.
Signed-off-by: Torch-TensorRT Github Bot <torch-tensorrt.github.bot@nvidia.com>
Force-pushed from 27703d5 to ef0662c.
narendasan left a comment:
Just reviewed the core stuff for now. I don't think this really solves the issue. The core idea is that we want a Python implementation of the Torchbind endpoints (execute_engine / TRTEngine) that lets us run the same programs with either standard Torch-TensorRT or a Python-only build, rather than two implementations that are kind of mixed together.
)

@torch.library.register_fake("tensorrt::execute_engine_python")  # type: ignore
Why do we need a separate operator for this? Aren't we just changing the implementation of TRTEngine to be either Python or C++?
The problem is that if it's a separate op, then you can't interchange between C++ and Python-only builds.
# ---------------------------------------------------------------------------

class PythonTRTEngine:
I think this class should be named "TRTEngine" and only "registered" if the C++ runtime is unavailable. It should also be a valid script object so that the same operator works with both the Python and C++ versions of the object, and it should use exactly the same APIs as the ones we expose in the JIT_hooks file.
register_opaque_type(PythonTRTEngine, typ="reference")

@torch.library.custom_op(  # type: ignore[misc]
Same thing here: this operator should only get registered if the C++ library is not available, and it should take the name of the C++ op.
def execute_engine_python(
    input_tensors: List[torch.Tensor], engine: PythonTRTEngine
) -> List[torch.Tensor]:
    outputs = engine.execute(input_tensors)
I would rather use a struct + function design, similar to the C++ structure, than a masked call to a method.
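The struct + function layout being suggested might look like the following. This is a sketch under stated assumptions: `EngineState`, `execute_engine`, and the `runner` callable are placeholders for illustration, not the reviewed code.

```python
# Hypothetical sketch of the struct + function layout: the engine is plain
# state (like the C++ TRTEngine struct) and execution is a free function,
# rather than a method call hidden inside the op implementation.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EngineState:
    """Plain data, mirroring a C++ struct; 'runner' stands in for a TRT context."""

    name: str
    runner: Callable[[List[float]], List[float]]


def execute_engine(input_tensors: List[float], engine: EngineState) -> List[float]:
    """Free function over the struct; matches the shape of the C++ entry point."""
    return engine.runner(input_tensors)


# Usage: the op implementation calls execute_engine(inputs, state) directly.
engine = EngineState(name="demo", runner=lambda xs: [x * 2 for x in xs])
```

Keeping execution as a free function over a data struct makes the Python signature line up one-to-one with the C++ entry point.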
It's cool that we have this, but we should look into whether there is a way to drop / mask registrations to change the runtime implementation, rather than relying on distinct graph constructions.
return

if self._is_python_runtime:
    self.engine = PythonTRTEngine(
Yeah, we should be monitoring the Torchbind registration and registering a class if there is no C++ API, rather than maintaining two code paths.
metadata = pickle.loads(dumped_metadata)
return metadata
def decode_metadata(encoded_metadata: bytes | str) -> Any:
    if isinstance(encoded_metadata, str):
Why was this rewritten?
)

def set_extra_state(self, state: SerializedTorchTensorRTModuleFmt) -> None:
def set_extra_state(self, state: TorchTensorRTModuleExtraState) -> None:
Why are we changing any of this? It should be the same.
metadata["output_tensors_are_unowned"]
)

def __del__(self) -> None:
Is this necessary? Can't we just use `__del__` in the actual TRTEngine class?
Generally, there are a lot of changes in this file that I don't really understand the reason for. Isn't the entire point of this feature to detect when the C++ runtime is not available and drop in compatible Python implementations of the C++ runtime APIs, rather than folding two separate implementations into one class?