
New LP on PTQ/QAT in ExecuTorch #2889

Open
annietllnd wants to merge 2 commits into ArmDeveloperEcosystem:main from annietllnd:neural-graphics

Conversation

@annietllnd
Collaborator

Wait for author feedback before merging


## PTQ vs QAT: what changes in practice?

PTQ and QAT both aim to run your model with quantized operators (typically INT8). The difference is where you pay the cost: PTQ optimizes for speed of iteration, while QAT optimizes for quality and robustness.


We should probably have the terms written out in full to start with before using the acronyms everywhere.

E.g.
Post-training quantization (PTQ) and Quantization-aware training (QAT)

@@ -0,0 +1,46 @@
---
title: Understanding PTQ and QAT


Thoughts on some more detail in the title, e.g.

Understanding quantization with ExecuTorch and the Arm backend


In this Learning Path, you use quantization as part of the ExecuTorch Arm backend. The goal is to export a quantized model that can run on Arm hardware with dedicated neural accelerators (NX).

To keep the workflow concrete, you start with a complete, runnable CIFAR-10-based example that exports `.vgf` artifacts end to end. After you have a known-good baseline, you can apply the same steps to your own upscaler model and training loop.

@Burton2000 Feb 16, 2026


to your own neural network and training code.


In a nutshell, the Arm backend in ExecuTorch consists of the following building blocks:

- TOSA (Tensor Operator Set Architecture) provides a standardized operator set for acceleration on Arm platforms.


provides an open, standardized, minimal operator set for neural network operations to be lowered to. It is utilized by Arm platforms and accelerators.


PTQ keeps training simple. You train your FP32 model as usual, then run a calibration pass using representative inputs to determine quantization parameters (for example, scales and zero points). After calibration, you convert the model and export a quantized graph.

PTQ is a good default when you need a fast iteration loop and you have a calibration set that looks like the actual inference data. For upscalers, PTQ can be good enough for early bring-up, especially when your goal is to validate the export and integration path.


For neural networks


I would also add that PTQ, depending on the model and use case, can still provide good quality results equal to the original floating-point graph.
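
For illustration, here is a minimal sketch of the PTQ flow described above, using the PT2E quantization APIs (`torch.export.export_for_training` is the PyTorch 2.5+ capture API; older releases used a different capture call). The Arm quantizer import path, the `TOSAQuantizer` constructor arguments, and `model`, `example_input`, `calibration_loader`, and `compile_spec` are assumptions or placeholders, not the Learning Path's exact code.

```python
# Minimal PTQ sketch using the PT2E flow. The Arm quantizer import path and
# constructor below are assumptions; check the ExecuTorch Arm backend docs
# for the exact names in your release.
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

# Assumed import path (may differ between ExecuTorch releases):
from executorch.backends.arm.quantizer.arm_quantizer import (
    TOSAQuantizer,
    get_symmetric_quantization_config,
)


def ptq_quantize(model, example_input, calibration_loader, compile_spec):
    # Capture the trained FP32 model as an exportable graph.
    exported = torch.export.export_for_training(model.eval(), (example_input,)).module()

    # Insert observers according to the quantizer's annotation rules.
    quantizer = TOSAQuantizer(compile_spec)  # constructor arguments are an assumption
    quantizer.set_global(get_symmetric_quantization_config())
    prepared = prepare_pt2e(exported, quantizer)

    # Calibration pass: run representative inputs so observers collect ranges.
    with torch.no_grad():
        for batch, _ in calibration_loader:
            prepared(batch)

    # Replace observers with quantize/dequantize ops to get the INT8 graph.
    return convert_pt2e(prepared)
```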


QAT simulates quantization effects during training. You prepare the model for QAT, fine-tune with fake-quantization enabled, then convert and export.

QAT is worth the extra effort when PTQ introduces visible artifacts. This is common for image-to-image tasks because small numeric changes can show up as banding, ringing, or loss of fine detail.


introduces visible drop in model accuracy. For example, this is common for
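
For comparison, a QAT sketch under the same assumptions as the PTQ example: the structural differences are `prepare_qat_pt2e` instead of `prepare_pt2e` and a short fine-tuning loop in place of the calibration pass. The `quantizer` argument is the same (assumed) Arm quantizer object, and the optimizer, loss, and epoch count are placeholders.

```python
# Minimal QAT sketch using the PT2E flow; the quantizer object is built the
# same (assumed) way as in the PTQ sketch. Training details are placeholders.
import torch
from torch.ao.quantization import move_exported_model_to_eval
from torch.ao.quantization.quantize_pt2e import prepare_qat_pt2e, convert_pt2e


def qat_quantize(model, example_input, train_loader, quantizer, num_epochs=3):
    exported = torch.export.export_for_training(model.train(), (example_input,)).module()

    # Insert fake-quantization ops so the fine-tune sees quantization error.
    prepared = prepare_qat_pt2e(exported, quantizer)

    optimizer = torch.optim.Adam(prepared.parameters(), lr=1e-4)
    loss_fn = torch.nn.L1Loss()  # placeholder loss for an image-to-image model

    for _ in range(num_epochs):
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(prepared(inputs), targets)
            loss.backward()
            optimizer.step()

    # Switch train-only behavior (dropout, batch norm) to eval, then fold
    # fake-quant into real quantize/dequantize ops for export.
    move_exported_model_to_eval(prepared)
    return convert_pt2e(prepared)
```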


For Arm-based platforms, the workflow stays consistent across models:

1. Train and evaluate the upscaler in PyTorch.


  1. Train and evaluate the neural network in PyTorch.


1. Train and evaluate the upscaler in PyTorch.
2. Quantize (PTQ or QAT) to reduce runtime cost.
3. Export through TOSA and generate a `.vgf` artifact.


  1. Export with ExecuTorch (via TOSA) to generate a .vgf artifact.


After that, you take the same PTQ export logic and apply it to your own model and calibration data.
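
The export step can be sketched roughly as below. `torch.export.export`, `to_edge_transform_and_lower`, and `to_executorch` are standard ExecuTorch APIs, but `VgfCompileSpec` and `VgfPartitioner` are hypothetical placeholder names for whatever the Arm backend exposes for the VGF target in your release, and how the standalone `.vgf` artifact is materialized depends on that tooling.

```python
# Sketch of the export step for the quantized graph produced above. The
# VgfCompileSpec / VgfPartitioner names are hypothetical placeholders; use
# the actual compile spec and partitioner from the ExecuTorch Arm backend.
import torch
from executorch.exir import to_edge_transform_and_lower

from executorch.backends.arm.vgf import VgfCompileSpec, VgfPartitioner  # assumed path


def export_quantized(quantized_module, example_input, out_path="model.pte"):
    # Re-export the converted (INT8) graph for inference.
    exported = torch.export.export(quantized_module, (example_input,))

    # Lower the TOSA-compatible portion of the graph to the Arm/VGF target.
    compile_spec = VgfCompileSpec()
    edge = to_edge_transform_and_lower(
        exported, partitioner=[VgfPartitioner(compile_spec)]
    )

    # Serialize the ExecuTorch program that wraps the lowered payload.
    with open(out_path, "wb") as f:
        f.write(edge.to_executorch().buffer)
```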

## Run the end-to-end PTQ example (CIFAR-10)


I would just remove the CIFAR-10 part as this is just the dataset and not the task


## Advanced: drop-in QAT export to VGF for your own project

If PTQ introduces visible artifacts, QAT is the next step. The workflow is the same as PTQ, but you insert a short fine-tuning phase after you prepare the model for QAT.

@Burton2000 Feb 16, 2026


If PTQ degrades model accuracy too much,
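
One way to decide whether that extra effort is needed is to compare the FP32 and PTQ outputs on a held-out batch before committing to QAT. The sketch below uses PSNR as the quality proxy; the metric, the 35 dB threshold, and the model/loader names are placeholders you would replace with whatever you track for your own model.

```python
# Quick quality check: compare FP32 and PTQ (INT8) outputs on validation data
# and fall back to QAT if the drop is too large. Threshold is a placeholder.
import math
import torch


def psnr(reference: torch.Tensor, test: torch.Tensor, peak: float = 1.0) -> float:
    # Peak signal-to-noise ratio in dB; assumes outputs are scaled to [0, peak].
    mse = torch.mean((reference - test) ** 2).item()
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


@torch.no_grad()
def ptq_is_good_enough(fp32_model, quantized_module, val_loader, min_psnr_db=35.0):
    # Assumes fp32_model is already in eval mode.
    scores = [psnr(fp32_model(inputs), quantized_module(inputs)) for inputs, _ in val_loader]
    mean_psnr = sum(scores) / len(scores)
    print(f"Mean PSNR (FP32 vs INT8): {mean_psnr:.2f} dB")
    return mean_psnr >= min_psnr_db
```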


You now have a complete reference workflow for quantizing an image-to-image model with TorchAO and exporting INT8 `.vgf` artifacts using the ExecuTorch Arm backend. You also have a practical baseline you can use to debug export issues before you switch to your production model and data.

When you move from the CIFAR-10 proxy model to your own upscaler, keep these constraints in mind:


When you move from the CIFAR-10 proxy model to your own model, keep these constraints in mind:

