Description
I used quantize_dynamic from onnxruntime to quantize an ONNX model, but the result cannot be converted to a TensorRT plan. The error is "Non-zero zero point is not supported." Do you know how to fix it?
[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:138: --- Begin node ---
input: "/model/embeddings/tok_embeddings/Gather_output_0_quantized"
input: "model.embeddings.tok_embeddings.weight_scale"
input: "model.embeddings.tok_embeddings.weight_zero_point"
output: "/model/embeddings/tok_embeddings/Gather_output_0"
name: "/model/embeddings/tok_embeddings/Gather_output_0_DequantizeLinear"
op_type: "DequantizeLinear"
[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:139: --- End node ---
[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:141: ERROR: onnxOpImporters.cpp:1584 In function QuantDequantLinearHelper:
[6] Assertion failed: shiftIsAllZeros(zeroPoint): Non-zero zero point is not supported. Please set kENABLE_UINT8_AND_ASYMMETRIC_QUANTIZATION_DLAto enable asymmetric quantization if it is on DLA.
import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType

model_fp32 = 'onnx_models/model.onnx'
model_quant = 'onnx_models/model.quant.opt.onnx'

opt = {
    "WeightSymmetric": True,
    "ActivationSymmetric": True,
}

quantized_model = quantize_dynamic(model_fp32, model_quant, extra_options=opt)
Environment
TensorRT Version:
NVIDIA GPU:
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):