Does TensorRT support dynamic quantization? #4698

@sarahliu-cisco

Description

I used quantize_dynamic from onnxruntime to quantize an ONNX model, but the quantized model could not be converted to a TensorRT plan.
The error is "Non-zero zero point is not supported". Do you know how to fix it?

[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:138: --- Begin node ---
input: "/model/embeddings/tok_embeddings/Gather_output_0_quantized"
input: "model.embeddings.tok_embeddings.weight_scale"
input: "model.embeddings.tok_embeddings.weight_zero_point"
output: "/model/embeddings/tok_embeddings/Gather_output_0"
name: "/model/embeddings/tok_embeddings/Gather_output_0_DequantizeLinear"
op_type: "DequantizeLinear"

[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:139: --- End node ---
[02/11/2026-23:33:24] [E] [TRT] ModelImporter.cpp:141: ERROR: onnxOpImporters.cpp:1584 In function QuantDequantLinearHelper:
[6] Assertion failed: shiftIsAllZeros(zeroPoint): Non-zero zero point is not supported. Please set kENABLE_UINT8_AND_ASYMMETRIC_QUANTIZATION_DLA to enable asymmetric quantization if it is on DLA.
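
For context, the assertion is about the zero_point input of the DequantizeLinear node. The sketch below only illustrates the usual affine quantization arithmetic; it is not the code onnxruntime or TensorRT actually runs, and the helper names (asymmetric_uint8_params, symmetric_int8_params, dequantize) are made up for this example. Under those assumptions it shows why an asymmetric uint8 range generally ends up with a non-zero zero point, while a symmetric int8 range uses zero.

```python
def asymmetric_uint8_params(x_min, x_max):
    """Scale and zero point mapping [x_min, x_max] onto uint8 [0, 255]."""
    # Widen the range to include 0.0 so real zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:
        return 1.0, 0
    zero_point = int(round(-x_min / scale))
    return scale, max(0, min(255, zero_point))


def symmetric_int8_params(x_min, x_max):
    """Scale and zero point mapping [-a, a] onto int8 [-127, 127]."""
    abs_max = max(abs(x_min), abs(x_max))
    scale = abs_max / 127.0 if abs_max > 0 else 1.0
    # A symmetric range is centred on zero, so the zero point is always 0.
    return scale, 0


def dequantize(q, scale, zero_point):
    """DequantizeLinear arithmetic: x = (q - zero_point) * scale."""
    return (q - zero_point) * scale


# Illustrative range only; not taken from the model in this issue.
asym_scale, asym_zp = asymmetric_uint8_params(-1.0, 3.0)
sym_scale, sym_zp = symmetric_int8_params(-1.0, 3.0)
print("asymmetric uint8 zero point:", asym_zp)
print("symmetric int8 zero point:  ", sym_zp)
```

With a non-zero zero point, the dequantize step has to subtract a shift before scaling, and that shift is what the shiftIsAllZeros check in the log above rejects.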

import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType

model_fp32 = 'onnx_models/model.onnx'
model_quant = 'onnx_models/model.quant.opt.onnx'

# Request symmetric quantization for both weights and activations.
opt = {
    "WeightSymmetric": True,
    "ActivationSymmetric": True,
}

# quantize_dynamic writes the quantized model to model_quant; it has no useful return value.
quantize_dynamic(model_fp32, model_quant, extra_options=opt)
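
One possible way to narrow this down is to list which DequantizeLinear nodes in the quantized file carry a non-zero zero point. The following is a rough sketch, not a confirmed fix: find_nonzero_zero_points and collect_dequantize_zero_points are hypothetical helper names, only the top-level graph is scanned (subgraphs are ignored), and zero points that are not stored as initializers are skipped.

```python
def find_nonzero_zero_points(zero_points):
    """Return the names whose zero-point values are not all zero.

    zero_points: dict mapping a node name to a flat list of integers.
    """
    return sorted(
        name for name, values in zero_points.items()
        if any(v != 0 for v in values)
    )


def collect_dequantize_zero_points(model_path):
    """Read constant zero points of DequantizeLinear nodes from an ONNX file."""
    # Imported lazily so find_nonzero_zero_points has no dependencies.
    import onnx
    from onnx import numpy_helper

    model = onnx.load(model_path)
    initializers = {init.name: init for init in model.graph.initializer}
    result = {}
    for node in model.graph.node:
        # The third input of DequantizeLinear is the optional zero point.
        if node.op_type != "DequantizeLinear" or len(node.input) < 3:
            continue
        zp_name = node.input[2]
        if zp_name in initializers:
            array = numpy_helper.to_array(initializers[zp_name])
            result[node.name] = array.flatten().tolist()
    return result


# Example call (path taken from the script above):
# zps = collect_dequantize_zero_points('onnx_models/model.quant.opt.onnx')
# print(find_nonzero_zero_points(zps))
```

If the node named in the log shows up in that list, the zero point really is non-zero in the exported file, which would suggest that the symmetric options did not take effect for that tensor.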

Environment

TensorRT Version:

NVIDIA GPU:

NVIDIA Driver Version:

CUDA Version:

CUDNN Version:

Operating System:

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link:

Steps To Reproduce

Commands or scripts:

Have you tried the latest release?:

Attach the captured .json and .bin files from TensorRT's API Capture tool if you're on an x86_64 Unix system

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):
