[ET-VK][ez] Make q8ta_conv2d use 4C1W layout #17390

SS-JIA · 2026-02-11T20:15:55Z

Stack from ghstack (oldest at bottom):

This changes the q8ta_conv2d and q8ta_conv2d_dw operators' input layout from PackedInt8_4W4C to PackedInt8_4C1W in the op registry. The 4C1W layout aligns with the natural output format of channel-packed convolutions, avoiding unnecessary layout conversions between consecutive conv layers.

Also adds explicit outputs_storage declarations (PACKED_INT8_CHANNELS_PACKED_BUFFER) to both the PW and general q8ta_conv2d op registrations, ensuring the layout propagation pass can correctly determine output layouts.
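To illustrate why matching a convolution's input layout to the layout the previous convolution produces removes repacking steps, here is a minimal toy sketch. It is not ExecuTorch code: the `OpSpec` class, the `count_conversions` helper, and the assumption that each conv emits `PackedInt8_4C1W` output are hypothetical and exist only for illustration. Only the layout names come from the description above.

```python
# Toy model only: layout names come from the PR description; OpSpec and
# count_conversions are hypothetical helpers, not part of the Vulkan backend.
from dataclasses import dataclass


@dataclass(frozen=True)
class OpSpec:
    name: str
    input_layout: str   # layout the op requires for its input tensor
    output_layout: str  # layout the op is assumed to produce


def count_conversions(chain, initial_layout):
    """Walk a linear chain of ops and count the layout conversions needed."""
    current = initial_layout
    conversions = 0
    for op in chain:
        if current != op.input_layout:
            conversions += 1  # a repacking step would have to be inserted here
        current = op.output_layout
    return conversions


# Before this change (assumed): conv consumes 4W4C but produces 4C1W.
before = OpSpec("q8ta_conv2d", "PackedInt8_4W4C", "PackedInt8_4C1W")
# After this change: conv consumes the same layout it produces.
after = OpSpec("q8ta_conv2d", "PackedInt8_4C1W", "PackedInt8_4C1W")

# Three consecutive convs; the first input already matches in both cases,
# so only conversions between layers are counted.
old_count = count_conversions([before] * 3, "PackedInt8_4W4C")
new_count = count_conversions([after] * 3, "PackedInt8_4C1W")
print(old_count, new_count)
```

In this toy chain, the mismatched registration needs one conversion between each pair of consecutive convs, while the matched registration needs none. The real layout propagation pass may handle more cases than this linear walk.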

Differential Revision: D93000165

pytorch-bot · 2026-02-11T20:16:00Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17390

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 1 Unrelated Failure

As of commit 03aec14 with merge base 964c565:

NEW FAILURES - The following jobs have failed:

Test CUDA Builds / export-model-cuda-artifact (google, gemma-3-4b-it, quantized-int4-tile-packed) / linux-job (gh)
RuntimeError: Command docker exec -t 9e4008bd61680d3ec45e28d7104127313fc94a507634b5b320c0e1e483156027 /exec failed with exit code 1
Test CUDA Builds / export-model-cuda-artifact (nvidia, parakeet-tdt, quantized-int4-weight-only) / linux-job (gh)
RuntimeError: Command docker exec -t 3c872cc7cd488090e1b5bea4a13633bf19fbec6aac7a863649e4d75d319dac91 /exec failed with exit code 1
Test CUDA Builds / export-model-cuda-artifact (openai, whisper-small, non-quantized) / linux-job (gh)
RuntimeError: Command docker exec -t 37f2f020dfd1ad0f42e00b80ee2489e51581827f20ab5223dcf7e8b2f2f86a55 /exec failed with exit code 1
Test CUDA Builds / test-models-cuda (mv2) / linux-job (gh)
RuntimeError: Command docker exec -t 072623d3ae0b696b96d34989430aa457fcfb2c94815185f74a7d41a968a6a143 /exec failed with exit code 1
Test CUDA Builds / test-models-cuda (sdpa) / linux-job (gh)
RuntimeError: Command docker exec -t ee667ae77adc7320f5bcda1cffe3381cd6aa8c2a76e1ffcbd4ec34a438b759a9 /exec failed with exit code 1

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-samsung-models-linux / linux-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-02-11T20:16:33Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-cla bot added the CLA Signed label Feb 11, 2026

meta-codesync bot added the fb-exported and meta-exported labels Feb 11, 2026

SS-JIA mentioned this pull request Feb 11, 2026

Back out "[Diff Train][pytorch/executorch] Apply fixup patch to fbsource" #17399

Merged
