Experimental support for 4-bit compression with LUT per layer and per block by andreyanufr · Pull Request #3684 · openvinotoolkit/nncf

andreyanufr · 2025-10-08T17:19:34Z

Changes

Implemented codebook computation based on the k-means algorithm.
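
For readers unfamiliar with codebook (LUT) compression, here is a minimal NumPy sketch of the general idea: a 16-entry codebook, so each weight is stored as a 4-bit index, estimated with plain Lloyd's k-means. It is not the code from this PR. The names `kmeans_codebook`, `encode` and `decode` are hypothetical, and the quantile initialisation is an assumption made for the example.

```python
import numpy as np


def kmeans_codebook(values: np.ndarray, n_entries: int = 16, n_iters: int = 20) -> np.ndarray:
    """Estimate a 1-D codebook with Lloyd's k-means (hypothetical helper, not NNCF API)."""
    flat = values.astype(np.float64).ravel()
    # Assumption: start from evenly spaced quantiles so every entry lies inside the data range.
    codebook = np.quantile(flat, np.linspace(0.0, 1.0, n_entries))
    for _ in range(n_iters):
        # Assignment step: index of the nearest codebook entry for every value.
        idx = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
        # Update step: move each entry to the mean of its members; empty entries stay put.
        for k in range(n_entries):
            members = flat[idx == k]
            if members.size:
                codebook[k] = members.mean()
    return np.sort(codebook)


def encode(values: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Replace each value by the index of its nearest codebook entry."""
    return np.abs(values[..., None] - codebook).argmin(axis=-1).astype(np.uint8)


def decode(indices: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Look the indices up in the codebook (the LUT step)."""
    return codebook[indices]
```

Calling `kmeans_codebook` once on a whole weight matrix gives one table per layer; calling it on each group of weights would give one table per block, at the cost of storing more tables.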

Reason for changes

Related tickets

CVS-169609

CVS-180243 for leftovers

Tests

https://github.com/openvinotoolkit/nncf/actions/runs/21363309569

daniil-lyakhov

Minor

daniil-lyakhov · 2026-01-26T14:56:35Z

+        if reduction_axis == 0:
+            weight = fns.transpose(weight)
+            reduction_axis = 1


A test for this condition would be nice

@andreyanufr please add to the ticket with leftovers
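
A possible shape for the requested test, as a sketch only. It mirrors the branch above with NumPy standing in for NNCF's `fns`; the helper name `normalize_reduction_axis` is hypothetical and does not exist in the PR.

```python
import numpy as np


def normalize_reduction_axis(weight: np.ndarray, reduction_axis: int) -> tuple[np.ndarray, int]:
    """Mirror of the reviewed branch, with NumPy in place of `fns` (an assumption)."""
    if reduction_axis == 0:
        weight = np.transpose(weight)
        reduction_axis = 1
    return weight, reduction_axis


def test_reduction_axis_zero_is_transposed():
    weight = np.arange(6, dtype=np.float32).reshape(2, 3)
    out, axis = normalize_reduction_axis(weight, 0)
    assert axis == 1
    assert out.shape == (3, 2)
    # Reducing along the new axis must match reducing along the original one.
    np.testing.assert_allclose(out.sum(axis=axis), weight.sum(axis=0))
```

A real test would presumably go through the public compression entry point with a model whose weight has `reduction_axis == 0`, rather than calling a helper directly.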

AlexanderDokuchaev · 2026-01-26T14:58:08Z


+@pytest.mark.parametrize("value_type", [None, TensorDataType.float16, TensorDataType.f8e4m3, TensorDataType.int8])
+@pytest.mark.parametrize("group_size", [-1, 4])
+def test_adaptive_codebooks(value_type, group_size):


I suggest adding the tests in a separate file, tests/openvino/native/quantization/weights_compression/test_adaptive_codebook.py.

The added tests don't increase coverage or exercise new functionality.
They only verify that the function doesn't fail with different arguments and check the operation types.

@andreyanufr please add to the ticket with leftovers

AlexanderDokuchaev · 2026-01-26T16:05:09Z

+    weighted_importance: Tensor | None = None
+
+
+class KMeansWeighted:


I suggest moving it into a separate module and adding tests for the algorithm.

@andreyanufr please add to the ticket with leftovers
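
To illustrate what an algorithm-level test could check, here is a sketch of a weighted 1-D k-means in which each value carries an importance weight. The interface of `KMeansWeighted` is not shown in this thread, so `weighted_kmeans_1d` below is a hypothetical stand-in and its quantile initialisation is an assumption.

```python
import numpy as np


def weighted_kmeans_1d(values, importance, n_clusters: int = 16, n_iters: int = 20) -> np.ndarray:
    """Lloyd's k-means where centroids are importance-weighted means (hypothetical stand-in)."""
    values = np.asarray(values, dtype=np.float64).ravel()
    importance = np.asarray(importance, dtype=np.float64).ravel()
    # Assumption: initialise centroids at evenly spaced quantiles of the data.
    centroids = np.quantile(values, np.linspace(0.0, 1.0, n_clusters))
    for _ in range(n_iters):
        # Assignment step: nearest centroid for every value.
        labels = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        # Update step: per-cluster sum of weights and sum of weight * value.
        w_sum = np.bincount(labels, weights=importance, minlength=n_clusters)
        wx_sum = np.bincount(labels, weights=importance * values, minlength=n_clusters)
        nonempty = w_sum > 0
        centroids[nonempty] = wx_sum[nonempty] / w_sum[nonempty]
    return centroids


# A hand-checkable case: the value 1.0 has weight 3, so the first centroid is (0*1 + 1*3) / 4.
centroids = weighted_kmeans_1d([0.0, 1.0, 10.0, 11.0], [1.0, 3.0, 1.0, 1.0], n_clusters=2)
np.testing.assert_allclose(centroids, [0.75, 10.5])
```

Small cases like this one, where the expected centroids can be worked out by hand, verify the algorithm's output rather than only that it runs.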


ljaljushkin

no blocking comments from my side

ljaljushkin · 2026-01-29T08:55:05Z

@andreyanufr

Before merge:

move the adaptive codebook file to the experimental folder
add the test duration

Before release:

mention the adaptive codebook in the documentation and mark it as experimental

All remaining comments should be appended to the ticket as leftovers to address once the algorithm is transitioned from experimental status.


alexsu52 and others added 30 commits September 2, 2024 13:22

Support scale estimation inside GPTQ

488cacc

fix for INT4_ASYM

ee64877

Merge remote-tracking branch 'upstream/develop' into develop

f22e411

Merge remote-tracking branch 'upstream/develop' into develop

51b4d7b

Merge remote-tracking branch 'upstream/develop' into develop

f66cd1e

Merge remote-tracking branch 'upstream/develop' into develop

7ce5a53

Merge remote-tracking branch 'upstream/develop' into develop

f74d156

Merge remote-tracking branch 'upstream/develop' into develop

5288c79

Merge remote-tracking branch 'upstream/develop' into develop

1becf15

Merge remote-tracking branch 'upstream/develop' into develop

047d7d9

Merge remote-tracking branch 'upstream/develop' into develop

c0c7e57

Merge remote-tracking branch 'upstream/develop' into develop

b74dea1

Merge remote-tracking branch 'upstream/develop' into develop

26a9a77

Merge remote-tracking branch 'upstream/develop' into develop

25fcc2c

Merge remote-tracking branch 'upstream/develop' into develop

26d4887

Merge remote-tracking branch 'upstream/develop' into develop

7748233

Merge remote-tracking branch 'upstream/develop' into develop

df251b3

Merge remote-tracking branch 'upstream/develop' into develop

4c134c4

Merge remote-tracking branch 'upstream/develop' into develop

6147097

Merge remote-tracking branch 'upstream/develop' into develop

2b94d28

Merge remote-tracking branch 'upstream/develop' into develop

5e312a5

Merge remote-tracking branch 'upstream/develop' into develop

2c5e983

Merge remote-tracking branch 'upstream/develop' into develop

1d8db1e

Merge remote-tracking branch 'upstream/develop' into develop

7244f18

Merge remote-tracking branch 'upstream/develop' into develop

443048c

Merge remote-tracking branch 'upstream/develop' into develop

80d2d8a

Merge remote-tracking branch 'upstream/develop' into develop

06bb19b

Merge remote-tracking branch 'upstream/develop' into develop

5d97d87

Merge remote-tracking branch 'upstream/develop' into develop

ae7cece

Initial codebook estimation algorithm.

3bcd47b

andreyanufr added 8 commits January 22, 2026 10:38

Fixed test.

395b167

Fixed docstring style.

7b40bef

Fixed docstring style.

a0a840e

Fixed example description.

6bbb1e5

Changed per-block to across-blocks definition in parameters.

467ecd5

Changed per-group to across-blocks definition in functions.

d8816d9

Annotated with type hints.

02dbc2a

Annotated with type hints.

a41f395

daniil-lyakhov reviewed Jan 26, 2026


Updated example generation.

4c231ca

AlexanderDokuchaev requested changes Jan 26, 2026


andreyanufr added 7 commits January 26, 2026 21:04

Fixed test for codebook.

5e4d4a8

Added test for nonzero.

95b7909

Reduced example for adaptiva codebook.

4ae4d82

Reverted nncf version.

7b4ed0a

1) Fixed docstrings.

e6a32ef

2) Removed unused dataset from example.

Merge remote-tracking branch 'upstream/develop' into aanuf/LUT_per_la…

5d3f939

…yer_merged

Updated requirements.txt for adaptive codebook example.

60c9019

ljaljushkin approved these changes Jan 28, 2026


daniil-lyakhov approved these changes Jan 28, 2026


ljaljushkin added the experimental label Jan 29, 2026

github-actions Bot removed the experimental label Jan 29, 2026

1) Moved codebook estimation to experimental.

4d179cd

2) Added codebook estimation test duration.

MaximProshin changed the title ~~Aanuf/lut per layer and per block~~ Experimental support for LUT per layer and per block Jan 30, 2026

MaximProshin changed the title ~~Experimental support for LUT per layer and per block~~ Experimental support for 4-bot compression with LUT per layer and per block Jan 30, 2026

MaximProshin changed the title ~~Experimental support for 4-bot compression with LUT per layer and per block~~ Experimental support for 4-bit compression with LUT per layer and per block Jan 30, 2026

AlexanderDokuchaev approved these changes Jan 30, 2026


AlexanderDokuchaev merged commit 6bb7dc6 into openvinotoolkit:develop Jan 30, 2026
18 checks passed

andreyanufr mentioned this pull request Feb 6, 2026

[release_v300] release notes template #3876

Merged


6 participants
