feat: add TurboQuant compressed vector backend #182
Open
dudegladiator wants to merge 4 commits into
Data-oblivious vector quantizer: random rotation + per-coordinate Lloyd-Max codebooks (b=1..4), MSE quantizer, and an unbiased two-stage inner-product estimator (MSE + 1-bit QJL residual). Pure NumPy, no storage or app coupling. Verified against the paper's distortion bounds and estimator unbiasedness.
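The rotation and MSE-quantizer steps described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the function names are hypothetical, and the codebook is fitted on Gaussian samples with variance 1/d as a stand-in for the distribution of a rotated unit vector's coordinates.

```python
import numpy as np

def random_rotation(d: int, seed: int) -> np.ndarray:
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Flip column signs by sign(diag(R)) so the rotation is uniformly distributed.
    return q * np.sign(np.diag(r))

def lloyd_max_codebook(samples: np.ndarray, bits: int, iters: int = 50) -> np.ndarray:
    """1-D Lloyd-Max: alternate nearest-centroid assignment and centroid update."""
    levels = 2 ** bits
    # Initialise centroids at evenly spaced quantiles of the samples.
    centroids = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        idx = np.abs(samples[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(levels):
            members = samples[idx == k]
            if members.size:
                centroids[k] = members.mean()
    return np.sort(centroids)

def quantize(x: np.ndarray, rotation: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Rotate x, then map every coordinate to its nearest codebook index."""
    y = rotation @ x
    return np.abs(y[:, None] - codebook[None, :]).argmin(axis=1).astype(np.uint8)

def dequantize(codes: np.ndarray, rotation: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Look up centroids and undo the rotation (inverse of orthogonal = transpose)."""
    return rotation.T @ codebook[codes]
```

Because the codebook depends only on the dimension and bit width, not on the indexed data, it can be built once and reused for every vector, which is what "data-oblivious" refers to here.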
SQLite-backed compressed vector store: bit-packed rows, seed-reproducible matrices (only the seed is persisted), and a vectorized NumPy inner-product search with language/path/limit/offset filter parity. Batched bitpack decode keeps load cheap at scale.
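As a rough illustration of the bit-packed rows and seed-only persistence, one possible NumPy approach is shown below. The helper names are hypothetical and the layout (most significant bit first, one contiguous bit stream per row) is an assumption, not necessarily the format used by the store in this PR.

```python
import numpy as np

def pack_codes(codes: np.ndarray, bits: int) -> bytes:
    """Pack an array of b-bit integer codes into a compact byte string."""
    # Expand each code into its `bits` binary digits, most significant first.
    shifts = np.arange(bits - 1, -1, -1)
    bit_matrix = (codes[:, None] >> shifts) & 1
    # packbits pads the final byte with zeros when the bit count is not a multiple of 8.
    return np.packbits(bit_matrix.astype(np.uint8).ravel()).tobytes()

def unpack_codes(blob: bytes, bits: int, count: int) -> np.ndarray:
    """Inverse of pack_codes: recover `count` codes of `bits` bits each."""
    flat = np.unpackbits(np.frombuffer(blob, dtype=np.uint8))[: count * bits]
    weights = 1 << np.arange(bits - 1, -1, -1)
    return (flat.reshape(count, bits) * weights).sum(axis=1).astype(np.uint8)

def matrices_from_seed(seed: int, d: int) -> tuple[np.ndarray, np.ndarray]:
    """Regenerate the rotation and projection matrices from a stored seed."""
    rng = np.random.default_rng(seed)
    rotation, _ = np.linalg.qr(rng.standard_normal((d, d)))
    projection = rng.standard_normal((d, d))
    return rotation, projection
```

Storing only the seed means the two d×d matrices never touch the database; the trade-off is that the generator and the order of draws must stay fixed, or existing indexes can no longer be decoded.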
Make TurboQuant selectable at `ccc init` via `--backend turbo-quant` (`--tq-bits`), alongside the default sqlite-vec path. Index-time quantization, query-time dispatch with a daemon-lifetime store cache, backend-agnostic index status, and settings validation. sqlite-vec remains the default and its path is unchanged.
Add type annotations to the new TurboQuant tests (required by the mypy pre-commit hook, which checks tests/ too), document the turbo-quant backend and `--backend` / `--tq-bits` flags in the README, and apply ruff-format normalizations.
@georgeh0 @badmonster0, could you please review this new capability?
What
Adds an optional TurboQuant compressed vector-search backend alongside the existing `sqlite-vec` path, selectable at `ccc init` with `--backend turbo-quant` (and `--tq-bits`).

TurboQuant (Zandieh et al., 2025) is a data-oblivious quantizer: random rotation → per-coordinate Lloyd-Max scalar quantization → 1-bit QJL residual for an unbiased inner-product estimate. No training or calibration.
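To make the last stage concrete, here is a minimal sketch of a 1-bit QJL-style residual estimator. It assumes a Gaussian projection matrix `S` and hypothetical function names; it illustrates the general technique and is not taken from `turbo_quant.py`. The scale factor follows from the identity E[(s·q)·sign(s·r)] = sqrt(2/π)·⟨q, r⟩/‖r‖ for a standard Gaussian row s.

```python
import numpy as np

def qjl_encode(residual: np.ndarray, S: np.ndarray) -> tuple[np.ndarray, float]:
    """Keep one sign bit per projection row, plus the residual's norm."""
    return np.sign(S @ residual).astype(np.int8), float(np.linalg.norm(residual))

def qjl_inner_product(query: np.ndarray, signs: np.ndarray, norm: float, S: np.ndarray) -> float:
    """Unbiased estimate of <query, residual> from the stored signs and norm."""
    m = S.shape[0]
    return float(np.sqrt(np.pi / 2) / m * norm * ((S @ query) @ signs))

def two_stage_estimate(query: np.ndarray, x_mse: np.ndarray, signs: np.ndarray,
                       norm: float, S: np.ndarray) -> float:
    """MSE reconstruction handles the bulk; the QJL term corrects its bias."""
    return float(query @ x_mse) + qjl_inner_product(query, signs, norm, S)
```

Here `x_mse` stands for the dequantized output of the MSE stage and the residual is `x - x_mse`. The query is projected at full precision; only the stored vector's residual is reduced to sign bits.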
Motivation
`sqlite-vec` stores raw `float32` and is exact and fast, but the on-disk index grows large on big codebases. TurboQuant compresses the index ~8× at 4-bit with recall@10 ≈ 0.9, for projects where index size matters more than the last bit of ranking precision.

Measured results
On a real 64.8k-chunk Go/TS repo (d=384, 4-bit):
Distortion bounds and estimator unbiasedness are unit-tested against the paper's Theorem 1 / Theorem 2.
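For reference, the ~8× figure is consistent with simple payload arithmetic for d=384. The sketch below counts only the vector payload and ignores per-row overhead (stored norms, SQLite metadata), which this description does not detail; `payload_bytes` is a hypothetical helper.

```python
def payload_bytes(d: int, bits_per_coord: int) -> int:
    """Bytes needed for d coordinates at the given bit width, rounded up."""
    return (d * bits_per_coord + 7) // 8

d = 384
raw = d * 4                        # float32: 4 bytes per coordinate -> 1536 bytes
compressed = payload_bytes(d, 4)   # 4 bits per coordinate -> 192 bytes
ratio = raw / compressed           # 8.0
```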
Design
- `turbo_quant.py`: rotation, Lloyd-Max codebooks (b=1..4), MSE quantizer, two-stage unbiased inner-product estimator, bit-packing. Pure NumPy, no app coupling.
- `tq_store.py`: SQLite-backed compressed store with vectorized inner-product search; full filter parity (language/path/limit/offset) with the `sqlite-vec` path. Only a seed is persisted; rotation/QJL matrices are regenerated on load.
- `--backend` / `--tq-bits` CLI flags + interactive prompt.

Compatibility
Not a breaking change.
`sqlite-vec` remains the default and its code path is untouched. The backend is recorded per-index; switching requires re-init + re-index.

Testing
New unit + e2e + benchmark coverage. All `prek` hooks pass (ruff, ruff-format, mypy on `src/` and `tests/`, full pytest: 245 passed).

Related
None — opening for discussion. Happy to move the design discussion to Discord if preferred for a feature this size.