Commit 54b94ab

Add real-model (Qwen3-8B) KV-cache quantization plots
Signed-off-by: Kai Xu <kaix@nvidia.com>
1 parent a230929 commit 54b94ab

5 files changed

Lines changed: 765 additions & 38 deletions

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
# WaterSIC KV-Cache Quantization Example

Synthetic benchmarks comparing KV-cache quantization methods on controlled
scenarios with known structure. Reproduces the plots from the WaterSIC paper.

## Methods compared

| Method | Format | Hessian-aware | KL-aware |
|--------|--------|---------------|----------|
| RTN NVFP-y | Block-scaled fixed-point | No | No |
| GPTQ NVFP-y | Block-scaled fixed-point | Yes | No |
| GPTQ-KL NVFP-y | Block-scaled fixed-point | Yes | Yes |
| WaterSIC-L2 | Entropy-coded | Yes | No |
| WaterSIC-KL | Entropy-coded | Yes | Yes |

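As a rough illustration of the simplest baseline in the table, the sketch below shows round-to-nearest (RTN) quantization onto a block-scaled grid. It is not taken from `kv_cache_watersic_plots.py`: the function names, the block size of 16, the 4-bit signed integer grid, and the absmax scale per block are all assumptions chosen for illustration, and the actual NVFP-y format may differ.

```python
import random


def rtn_block_quantize(values, block_size=16, bits=4):
    """Hypothetical block-scaled RTN: quantize then dequantize a flat list.

    Assumes a symmetric signed integer grid and one absmax scale per block;
    this is an illustrative stand-in, not the NVFP-y format itself.
    """
    qmax = 2 ** (bits - 1) - 1  # largest grid level, e.g. 7 for 4 bits
    dequantized = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(v) for v in block)
        # One scale per block; fall back to 1.0 for an all-zero block.
        scale = amax / qmax if amax > 0 else 1.0
        for v in block:
            # Round to the nearest grid level and clamp to the grid range.
            q = max(-qmax, min(qmax, round(v / scale)))
            dequantized.append(q * scale)
    return dequantized


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(64)]
    x_hat = rtn_block_quantize(x)
    print(f"reconstruction MSE: {mse(x, x_hat):.5f}")
```

RTN rounds each entry independently, which is why the table marks it as neither Hessian-aware nor KL-aware; the other rows differ in how they use extra information when choosing the rounded values.
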
## Usage

```bash
# Generate all plots
python kv_cache_watersic_plots.py plot_all

# Individual plots
python kv_cache_watersic_plots.py plot_temperature
python kv_cache_watersic_plots.py plot_powerlaw
python kv_cache_watersic_plots.py plot_retrieval
python kv_cache_watersic_plots.py plot_sinks
python kv_cache_watersic_plots.py plot_sinks_cond

# Text-only sweeps (no matplotlib required)
python kv_cache_watersic_plots.py temperature
python kv_cache_watersic_plots.py powerlaw
python kv_cache_watersic_plots.py retrieval
python kv_cache_watersic_plots.py sinks
python kv_cache_watersic_plots.py scaling
```

Plots are saved to the `plots/` directory.
