🐛 [Bug] run llm examples are broken on the Thor platform #4069

@lanluo-nvidia

Bug Description

Trying to export the model using torch.export.export()..
Trying torch.export._trace._export to trace the graph since torch.export.export() failed
WARNING:py.warnings:/opt/conda/envs/py310/lib/python3.10/copyreg.py:101: FutureWarning: `isinstance(treespec, LeafSpec)` is deprecated, use `isinstance(treespec, TreeSpec) and treespec.is_leaf()` instead.
  return cls.__new__(cls, *args)

WARNING:py.warnings:/workspace/TensorRT/tools/llm/cache_utils.py:126: FutureWarning: `treespec.children_specs` is deprecated. Use `treespec.child(index)` to access a single child, or `treespec.children()` to get all children.
  in_spec_for_args = in_spec.children_specs[0]

WARNING:py.warnings:/workspace/TensorRT/tools/llm/cache_utils.py:134: FutureWarning: `treespec.children_specs` is deprecated. Use `treespec.child(index)` to access a single child, or `treespec.children()` to get all children.
  in_spec_for_args.children_specs.append(_LEAF_SPEC)

WARNING:py.warnings:/workspace/TensorRT/tools/llm/cache_utils.py:139: FutureWarning: `treespec.children_specs` is deprecated. Use `treespec.child(index)` to access a single child, or `treespec.children()` to get all children.
  for child_spec in spec.children_specs:

Traceback (most recent call last):
  File "/workspace/TensorRT/tools/llm/run_llm.py", line 323, in <module>
    trt_model = compile_torchtrt(model, input_ids, args)
  File "/workspace/TensorRT/tools/llm/run_llm.py", line 134, in compile_torchtrt
    trt_model = torch_tensorrt.dynamo.compile(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch_tensorrt/dynamo/_compiler.py", line 764, in compile
    gm = post_lowering(gm, settings)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch_tensorrt/dynamo/lowering/passes/_aten_lowering_pass.py", line 139, in post_lowering
    fake_tensor_updater.incremental_update(fake_mode)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch_tensorrt/dynamo/lowering/passes/_FakeTensorUpdater.py", line 193, in incremental_update
    new_fake_tensor = node.target(*args, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_ops.py", line 819, in __call__
    return self._op(*args, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/utils/_stats.py", line 29, in wrapper
    return fn(*args, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1397, in __torch_dispatch__
    return self.dispatch(func, types, args, kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2155, in dispatch
    return self._cached_dispatch_impl(func, types, args, kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1548, in _cached_dispatch_impl
    entry = self._make_cache_entry(state, key, func, args, kwargs, output)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1969, in _make_cache_entry
    output_info = self._get_output_info_for_cache_entry(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1863, in _get_output_info_for_cache_entry
    synth_output = self._output_from_cache_entry(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2076, in _output_from_cache_entry
    return self._get_output_tensor_from_cache_entry(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2031, in _get_output_tensor_from_cache_entry
    empty = torch.empty_strided(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/sym_node.py", line 562, in expect_true
    return self.shape_env.guard_or_defer_runtime_assert(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/recording.py", line 273, in wrapper
    return retlog(fn(*args, **kwargs))
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 7779, in guard_or_defer_runtime_assert
    static_expr = self._maybe_evaluate_static(expr)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2636, in wrapper
    return fn_cache(self, *args, **kwargs)
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 6437, in _maybe_evaluate_static
    r = _maybe_evaluate_static_worker(
  File "/opt/conda/envs/py310/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2421, in _maybe_evaluate_static_worker
    assert vr is not None
AssertionError
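
The FutureWarning lines in the log above are deprecation noise and appear unrelated to the final AssertionError. When re-running the script to reproduce, they could be filtered out so that the traceback stands out. The following is a minimal sketch using only the standard `warnings` module. The helper name `silence_pytree_deprecations` is hypothetical, and the patterns are copied from the message prefixes in this log, so they may need adjusting for other PyTorch versions. It does not address the assertion failure itself.

```python
import warnings

# Message prefixes copied from the log above. The `message` argument of
# warnings.filterwarnings is a regex matched against the start of the
# warning text, so regex metacharacters are escaped.
PYTREE_DEPRECATION_PATTERNS = [
    r"`isinstance\(treespec, LeafSpec\)` is deprecated",
    r"`treespec\.children_specs` is deprecated",
]


def silence_pytree_deprecations() -> None:
    """Ignore only the two pytree FutureWarnings seen in the log (hypothetical helper)."""
    for pattern in PYTREE_DEPRECATION_PATTERNS:
        warnings.filterwarnings("ignore", message=pattern, category=FutureWarning)
```

Calling `silence_pytree_deprecations()` before the compile step would leave every other warning category, and any other FutureWarning, visible.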

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context

Labels

bug
story: LLM & Generative AI
