feat: add TurboQuant compressed vector backend #182

Open
dudegladiator wants to merge 4 commits into cocoindex-io:main from dudegladiator:feat/turboquant-vector-backend

@dudegladiator

What

Adds an optional TurboQuant compressed vector-search backend alongside the existing sqlite-vec path, selectable at ccc init:

ccc init --backend turbo-quant            # 4-bit (default)
ccc init --backend turbo-quant --tq-bits 2
ccc init --backend sqlite-vec             # explicit default (unchanged)

TurboQuant (Zandieh et al., 2025) is a data-oblivious quantizer: random rotation → per-coordinate Lloyd-Max scalar quantization → 1-bit QJL residual for an unbiased inner-product estimate. No training or calibration.
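
As a rough illustration of the pipeline just described, here is a minimal NumPy sketch. It is not the code in this PR. The function names (`make_matrices`, `encode`, `estimate_ip`) are hypothetical, the rotation is drawn via QR of a Gaussian matrix (one common construction), and the 2-bit codebook uses the classic Lloyd-Max levels for a standard Gaussian, on the assumption that the rotated coordinates of a unit vector are approximately N(0, 1/d). This sketch spends 2 bits on the MSE stage plus 1 bit on the residual, which may differ from how the PR splits its bit budget.

```python
import numpy as np

# Lloyd-Max reconstruction levels for a standard Gaussian at 2 bits.
# Assumption: rotated coordinates of a unit vector are roughly N(0, 1/d),
# so the levels are rescaled by 1/sqrt(d) before use.
GAUSS_LEVELS_2BIT = np.array([-1.5104, -0.4528, 0.4528, 1.5104])


def make_matrices(d, seed):
    """Regenerate the rotation and the QJL projection from a seed alone."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    rotation = q * np.sign(np.diag(r))    # sign fix gives a uniformly random rotation
    projection = rng.standard_normal((d, d))  # Gaussian matrix for the 1-bit residual
    return rotation, projection


def encode(x, rotation, projection):
    """Quantize a unit vector: per-coordinate codes, residual signs, residual norm."""
    d = x.shape[0]
    levels = GAUSS_LEVELS_2BIT / np.sqrt(d)
    y = rotation @ x
    codes = np.abs(y[:, None] - levels[None, :]).argmin(axis=1)  # nearest level
    x_mse = rotation.T @ levels[codes]    # MSE reconstruction in the original basis
    residual = x - x_mse
    signs = np.sign(projection @ residual)  # 1 bit per projected coordinate
    return codes, signs, np.linalg.norm(residual)


def estimate_ip(query, codes, signs, res_norm, rotation, projection):
    """Two-stage estimate of <query, x>: MSE part plus the sign-based residual part."""
    d = query.shape[0]
    levels = GAUSS_LEVELS_2BIT / np.sqrt(d)
    mse_part = (rotation @ query) @ levels[codes]
    # E[<s, q> * sign(<s, r>)] = sqrt(2/pi) * <q, r> / ||r|| for Gaussian s,
    # so this rescaling makes the residual term unbiased for <q, r>.
    qjl_part = res_norm * np.sqrt(np.pi / 2) / d * ((projection @ query) @ signs)
    return mse_part + qjl_part
```

The MSE part alone tends to shrink inner products; the residual term corrects that in expectation over the random projection, at the cost of some extra variance per query.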

Motivation

sqlite-vec stores raw float32 vectors and is exact and fast, but the on-disk index grows large on big codebases. TurboQuant compresses the index ~8× at 4-bit with recall@10 ≈ 0.9, for projects where index size matters more than the last bit of ranking precision.
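
The ~8× figure follows from per-coordinate storage: 32 bits for float32 versus 4 bits for a quantized code. A small arithmetic sketch (the helper `bytes_per_vector` and its `overhead_bytes` parameter are hypothetical, and any per-row metadata the real store keeps is not modelled unless passed as overhead):

```python
def bytes_per_vector(dim, bits_per_coord, overhead_bytes=0):
    """Bytes needed for one vector, rounding the bit count up to whole bytes."""
    return (dim * bits_per_coord + 7) // 8 + overhead_bytes


raw = bytes_per_vector(384, 32)   # float32 baseline: 1536 bytes
tq4 = bytes_per_vector(384, 4)    # 4-bit codes: 192 bytes
tq2 = bytes_per_vector(384, 2)    # 2-bit codes: 96 bytes

print(raw / tq4)  # 8.0
print(raw / tq2)  # 16.0
```

Any fixed per-vector overhead lowers the ratio slightly; for example, an assumed 4 extra bytes per row at 4-bit would give roughly 7.8× instead of 8×.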

Measured results

On a real 64.8k-chunk Go/TS repo (d=384, 4-bit):

| | sqlite-vec | turbo-quant (4-bit) |
|---|---|---|
| on-disk index | 7.2 MB | 0.9 MB (~8×) |
| recall@10 vs exact | 1.0 | ~0.92 |
| warm query latency | ~130 ms | ~165 ms |
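
For readers unfamiliar with the metric, recall@10 against an exact baseline can be computed as in the sketch below. This is an illustrative definition, not the benchmark code from this PR; `recall_at_k` is a hypothetical helper and the ID lists are made up.

```python
def recall_at_k(exact_ids, approx_ids, k=10):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    truth = set(exact_ids[:k])
    if not truth:
        return 1.0  # nothing to find, so nothing was missed
    return len(truth & set(approx_ids[:k])) / len(truth)


exact = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]      # top-10 from the exact search
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]    # compressed search missed id 9
print(recall_at_k(exact, approx))  # 0.9
```

Averaging this value over a set of queries gives a single recall@10 number comparable to the one reported above.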

Distortion bounds and estimator unbiasedness are unit-tested against the paper's Theorem 1 / Theorem 2.

Design

  • turbo_quant.py — rotation, Lloyd-Max codebooks (b=1..4), MSE quantizer, two-stage unbiased inner-product estimator, bit-packing. Pure NumPy, no app coupling.
  • tq_store.py — SQLite-backed compressed store with vectorized inner-product search; full filter parity (language/path/limit/offset) with the sqlite-vec path. Only a seed is persisted; rotation/QJL matrices are regenerated on load.
  • wiring — index-time quantization, query-time dispatch with a daemon-lifetime store cache (row-count invalidated), backend-agnostic index status, settings validation, --backend/--tq-bits CLI flags + interactive prompt.
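
Two of the ideas in the list above, bit-packing and a row-count-invalidated store cache, can be sketched with the standard library alone. Everything here is an assumption made for illustration: the table name `tq_vectors`, the `CachedStore` class, and the nibble order are hypothetical and are not taken from `tq_store.py`.

```python
import sqlite3


def pack_4bit(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, high nibble first."""
    codes = list(codes)
    if len(codes) % 2:
        codes.append(0)  # pad odd-length input with a zero nibble
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def unpack_4bit(blob, n):
    """Inverse of pack_4bit; n is the original number of codes."""
    out = []
    for byte in blob:
        out.append(byte >> 4)
        out.append(byte & 0x0F)
    return out[:n]


class CachedStore:
    """Hypothetical cache that re-decodes rows only when the row count changes."""

    def __init__(self, conn, dim):
        self.conn = conn
        self.dim = dim
        self._rows = None
        self._row_count = -1
        self.loads = 0  # number of full decodes, exposed for inspection

    def rows(self):
        (count,) = self.conn.execute("SELECT COUNT(*) FROM tq_vectors").fetchone()
        if count != self._row_count:
            cursor = self.conn.execute("SELECT id, codes FROM tq_vectors ORDER BY id")
            self._rows = [(rid, unpack_4bit(blob, self.dim)) for rid, blob in cursor]
            self._row_count = count
            self.loads += 1
        return self._rows
```

A row count is a cheap invalidation signal, but in this sketch it would not notice an in-place update that leaves the count unchanged; a real store may need a stronger check if rows can be rewritten without re-indexing.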

Compatibility

Not a breaking change. sqlite-vec remains the default and its code path is untouched. The backend is recorded per-index; switching requires re-init + re-index.

Testing

New unit + e2e + benchmark coverage. All prek hooks pass (ruff, ruff-format, mypy on src/ and tests/, full pytest — 245 passed).

Related

None — opening for discussion. Happy to move the design discussion to Discord if preferred for a feature this size.

Data-oblivious vector quantizer: random rotation + per-coordinate
Lloyd-Max codebooks (b=1..4), MSE quantizer, and an unbiased two-stage
inner-product estimator (MSE + 1-bit QJL residual). Pure NumPy, no
storage or app coupling. Verified against the paper's distortion bounds
and estimator unbiasedness.

SQLite-backed compressed vector store: bit-packed rows, seed-reproducible
matrices (only the seed is persisted), and a vectorized NumPy
inner-product search with language/path/limit/offset filter parity.
Batched bitpack decode keeps load cheap at scale.

Make TurboQuant selectable at `ccc init` via `--backend turbo-quant`
(`--tq-bits`), alongside the default sqlite-vec path. Index-time
quantization, query-time dispatch with a daemon-lifetime store cache,
backend-agnostic index status, and settings validation. sqlite-vec
remains the default and its path is unchanged.

Add type annotations to the new TurboQuant tests (required by the mypy
pre-commit hook, which checks tests/ too), document the turbo-quant
backend and `--backend` / `--tq-bits` flags in the README, and apply
ruff-format normalizations.

@dudegladiator-devrev

@georgeh0 @badmonster0, could you please review this new capability?
