Experimental support for 4-bit compression with LUT per layer and per block #3684

Merged
AlexanderDokuchaev merged 102 commits into openvinotoolkit:develop from andreyanufr:aanuf/LUT_per_layer_merged on Jan 30, 2026

Collaborator

@andreyanufr andreyanufr commented Oct 8, 2025

Changes

Implemented codebook computation based on the k-means algorithm.
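
As a rough illustration of the idea, and not the code merged in this PR, a 4-bit codebook (16 entries) can be fitted to a set of weights with Lloyd's k-means. The sketch below is plain Python on a flat list of values; `fit_codebook`, `quantize`, `mse`, and the optional `importance` weights are hypothetical names chosen for this example.

```python
import random


def fit_codebook(values, n_entries=16, importance=None, n_iters=25):
    """Fit a 1-D codebook with (optionally weighted) Lloyd's k-means."""
    if n_entries < 2:
        raise ValueError("n_entries must be at least 2")
    if importance is None:
        importance = [1.0] * len(values)
    # Initialise centroids on evenly spaced quantiles of the sorted values.
    ordered = sorted(values)
    codebook = [
        ordered[round(i * (len(ordered) - 1) / (n_entries - 1))]
        for i in range(n_entries)
    ]
    for _ in range(n_iters):
        sums = [0.0] * n_entries
        totals = [0.0] * n_entries
        # Assignment step: each value goes to its nearest centroid.
        for v, w in zip(values, importance):
            idx = min(range(n_entries), key=lambda k: (v - codebook[k]) ** 2)
            sums[idx] += w * v
            totals[idx] += w
        # Update step: weighted mean per cluster; empty clusters keep their centroid.
        codebook = [
            sums[k] / totals[k] if totals[k] > 0 else codebook[k]
            for k in range(n_entries)
        ]
    return sorted(codebook)


def quantize(values, codebook):
    """Return the index of the nearest codebook entry for each value."""
    return [
        min(range(len(codebook)), key=lambda k: (v - codebook[k]) ** 2)
        for v in values
    ]


def mse(values, codebook):
    """Mean squared error after replacing each value by its nearest entry."""
    indices = quantize(values, codebook)
    return sum((v - codebook[i]) ** 2 for v, i in zip(values, indices)) / len(values)


# Example: fit a 16-entry codebook to synthetic, roughly Gaussian weights.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(512)]
codebook = fit_codebook(weights)
indices = quantize(weights, codebook)  # each index fits in 4 bits
```

With 16 entries every index fits in 4 bits, so only the indices and the small lookup table need to be stored. Fitting one table to all weights of a layer or one per block of weights is a matter of which values are passed to `fit_codebook`.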

Reason for changes

Related tickets

CVS-169609

CVS-180243 for leftovers

Tests

https://github.com/openvinotoolkit/nncf/actions/runs/21363309569

alexsu52 and others added 30 commits September 2, 2024 13:22
Collaborator

@daniil-lyakhov daniil-lyakhov left a comment

Minor

Comment thread src/nncf/quantization/algorithms/weight_compression/codebook_estimation.py Outdated
Comment on lines +297 to +299
if reduction_axis == 0:
    weight = fns.transpose(weight)
    reduction_axis = 1
Collaborator

A test for this condition would be nice
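
A sketch of what such a test could check, assuming a hypothetical stand-in for the branch above. It uses plain nested lists instead of NNCF tensors, and `normalize_reduction_axis` is not an NNCF function:

```python
def normalize_reduction_axis(weight, reduction_axis):
    """Hypothetical stand-in: make axis 1 the reduction axis of a 2-D weight."""
    if reduction_axis == 0:
        weight = [list(col) for col in zip(*weight)]  # transpose rows and columns
        reduction_axis = 1
    return weight, reduction_axis


def test_axis_0_is_transposed():
    weight = [[1, 2, 3], [4, 5, 6]]
    out, axis = normalize_reduction_axis(weight, 0)
    assert axis == 1
    assert out == [[1, 4], [2, 5], [3, 6]]


def test_axis_1_is_unchanged():
    weight = [[1, 2, 3], [4, 5, 6]]
    out, axis = normalize_reduction_axis(weight, 1)
    assert axis == 1
    assert out == weight
```

A real test would call the codebook estimation code with both axis values and compare the results, rather than this simplified helper.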

Contributor

@andreyanufr please add to the ticket with leftovers

Comment thread tests/cross_fw/examples/example_scope.json
Comment thread src/nncf/tensor/functions/numeric.py

@pytest.mark.parametrize("value_type", [None, TensorDataType.float16, TensorDataType.f8e4m3, TensorDataType.int8])
@pytest.mark.parametrize("group_size", [-1, 4])
def test_adaptive_codebooks(value_type, group_size):
Collaborator

  1. I suggest adding the tests in a separate file: tests/openvino/native/quantization/weights_compression/test_adaptive_codebook.py

  2. The added tests don't increase coverage or cover new functionality.
    They only verify that the function doesn't fail with different arguments and check the operation types.

Contributor

@andreyanufr please add to the ticket with leftovers

Comment thread examples/llm_compression/openvino/smollm2_360m_adaptive_codebook/main.py Outdated
Comment thread src/nncf/quantization/algorithms/weight_compression/codebook_estimation.py Outdated
Comment thread src/nncf/quantization/algorithms/weight_compression/codebook_estimation.py Outdated
Comment thread src/nncf/quantization/algorithms/weight_compression/codebook_estimation.py Outdated
weighted_importance: Tensor | None = None


class KMeansWeighted:
Collaborator

I suggest moving it into a separate module and adding tests for the algorithm.

Contributor

@andreyanufr please add to the ticket with leftovers

Contributor

@ljaljushkin ljaljushkin left a comment

no blocking comments from my side

Contributor

ljaljushkin commented Jan 29, 2026

@andreyanufr

Before merge:

  1. Move the adaptive codebook file to the experimental folder.
  2. Add the test duration.

Before release:

  1. Mention the adaptive codebook in the documentation and mark it as experimental.

All remaining comments should be appended to the ticket as leftovers to address once the algorithm is transitioned from experimental status.

2) Added codebook estimation test duration.
@MaximProshin MaximProshin changed the title Aanuf/lut per layer and per block Experimental support for LUT per layer and per block Jan 30, 2026
@MaximProshin MaximProshin changed the title Experimental support for LUT per layer and per block Experimental support for 4-bot compression with LUT per layer and per block Jan 30, 2026
@MaximProshin MaximProshin changed the title Experimental support for 4-bot compression with LUT per layer and per block Experimental support for 4-bit compression with LUT per layer and per block Jan 30, 2026
@AlexanderDokuchaev AlexanderDokuchaev merged commit 6bb7dc6 into openvinotoolkit:develop Jan 30, 2026
18 checks passed

Labels

API (Public API-impacting changes), Code Freeze, documentation (Improvements or additions to documentation), NNCF OpenVINO (Pull requests that updates NNCF OpenVINO)


6 participants