feat: pluggable index cache via CacheBackend trait#6222
feat: pluggable index cache via CacheBackend trait#6222wjones127 merged 27 commits intolance-format:mainfrom
Conversation
The Session's index cache was hardcoded to use Moka. This adds a CacheBackend trait so users can provide their own cache implementation (e.g. Redis-backed, disk-backed, shared across processes). Two-layer design: - CacheBackend: object-safe async trait with opaque byte keys. This is what plugin authors implement (get, insert, invalidate_prefix, clear, num_entries, size_bytes). - LanceCache: typed wrapper handling key construction (prefix + type tag), type-safe get/insert, DeepSizeOf size computation, hit/miss stats, and concurrent load deduplication. MokaCacheBackend is the default, preserving existing behavior. Custom backends are wired through Session::with_index_cache_backend() or DatasetBuilder::with_index_cache_backend(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR Review: feat: pluggable index cache via CacheBackend traitP0: TOCTOU race in
|
Add type_name()/type_id() to CacheKey and UnsizedCacheKey traits so backends can identify the type of cached entries. Add parse_cache_key() utility for backends to extract (user_key, type_id) from opaque key bytes. CacheKey-based methods now pipe the key's type_id through to the backend. Non-CacheKey methods use type_id_of::<T>() as a sentinel. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Remove #[cfg(test)] convenience methods; tests now use CacheKey via a TestKey helper, eliminating the parallel method hierarchy. 2. Fix dedup race condition: re-check the cache while holding the in-flight lock so no two tasks can both become leader for the same key. 3. Use Arc::try_unwrap on the leader error path to preserve the original error type when possible. 4. Make invalidate_prefix async instead of fire-and-forget spawn. 5. Replace type_name().as_ptr() with a hash of std::any::TypeId for stable type discrimination. Defined once in type_id_of() and used by CacheKey::type_id() default. 6. Add dedup to WeakLanceCache::get_or_insert, sharing the in-flight map from the parent LanceCache. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Address feedback: 1. Move get_or_insert() onto CacheBackend. The method takes a pinned future (not a closure), so LanceCache can type-erase the user's non-'static loader before passing it to the backend. Default impl does simple get-then-insert; MokaCacheBackend uses moka's built-in optionally_get_with for dedup. This eliminates duplicated dedup logic and the manual watch-channel machinery. 2. Restore type_name().as_ptr() for type_id derivation on CacheKey. Remove standalone type_id_of() function. The derivation lives in one place: CacheKey::type_id()/UnsizedCacheKey::type_id(). 3. Remove approx_size_bytes from CacheBackend trait and Session debug output. Only approx_num_entries remains. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
56a3273 to
00867ad
Compare
Remove all methods that bypass CacheKey from WeakLanceCache (get, insert, get_or_insert, get_unsized, insert_unsized). Remove insert_unsized/get_unsized from LanceCache. Remove type_tag helper. All cache access now goes through CacheKey/UnsizedCacheKey. Make parse_cache_key return (empty, 0) instead of panicking on short keys. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
369afe2 to
1ba4ac3
Compare
The type_name().as_ptr() approach for type discrimination was unstable across crate boundaries due to monomorphization. Replace with an explicit fn type_id() -> &'static str that each CacheKey impl provides as a short human-readable literal (e.g. 'Vec<IndexMetadata>', 'Manifest'). Key format changes from user_key\0<8 LE bytes> to user_key\0<type_id str>. parse_cache_key() now returns (&[u8], &str).
Add IvfIndexState struct and serialization to lance-index, enabling IVFIndex to export its reconstructable state (IVF model, quantizer metadata) without non-serializable handles. Add reconstruct_vector_index() which rebuilds an IVFIndex from cached state by re-opening FileReaders (cheap with warm metadata cache) instead of re-fetching global buffers from object storage. Also adds IvfQuantizationStorage::from_cached() to skip global buffer reads during reconstruction, and Session::file_metadata_cache() to expose the metadata cache for the reconstruction context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reconstructed VectorIndex instances need the original cache key prefix to share partition entries with the two-tier cache backend. Also adds LanceCache::with_backend_and_prefix() and WeakLanceCache::prefix(). Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Previously, the disk cache codec reconstructed `Arc<dyn VectorIndex>` from `IvfIndexState` during deserialization, requiring a `ReconstructionContext` with deferred OnceLock initialization and sync-to-async runtime juggling. The ObjectStore in that context also lacked proper credential wrappers. Now the cache stores `Arc<dyn VectorIndexData>` (serializable state) instead of `Arc<dyn VectorIndex>` (live index). Lance's `open_vector_index()` detects cached state and reconstructs using its own ObjectStore (with credentials) and metadata cache. This eliminates the ReconstructionContext, OnceLock pattern, and runtime juggling. Changes: - Add VectorIndexData trait (lance-index) with write_to/as_any/tag - Add DeepSizeOf impl for IvfIndexState - Change VectorIndexCacheKey::ValueType to dyn VectorIndexData - Add reconstruction-from-cache path in open_vector_index() - Fix panicking downcast in LanceCache::get_with_id (return None) - Add Debug/Clone/Copy derives to SubIndexType Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Split cache.rs into submodules (backend, keys, moka, mod) - Rename CacheKey::type_id() to type_name() across all implementors - Improve CacheBackend and get_or_insert docs - Add Spillable trait with writer-based serialize for partition_serde - Cache file metadata and file sizes to enable zero-IO reconstruction - Add test_reconstruct_from_cache_zero_io test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The method was renamed in lance-format#6209 but the test call site in v2.rs was not updated during the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The file_sizes parameter on IVFIndex::try_new and the file_size_map() usage in open_vector_index were from merged PR lance-format#5497, not the serialization PR. Restoring them avoids unnecessary HEAD requests. Also restores vector index cache check in open_generic_index. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… contract - Presize `make_cache_key` Vec to avoid intermediate String allocation - Extract `cache_entry_size` helper to replace magic `+ 8` pattern - Document `type_name` uniqueness requirement on CacheKey/UnsizedCacheKey - Remove unused derives from SubIndexType (confirmed compiles without them) - Use moka's `weighted_size()` instead of iterating in `approx_size_bytes` Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Inline `get_or_insert_with_id`, `insert_unsized_with_id`, and `get_unsized_with_id` into their sole public callers. Keep `insert_with_id` and `get_with_id` as shared helpers since they're used by both sized and unsized paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace `&[u8]` keys in `CacheBackend` with a structured `InternalCacheKey` type that provides direct access to prefix, key, and type_name fields. This eliminates the need for `parse_cache_key()` and `make_cache_key()`. Also change `get_or_insert` to return `(CacheEntry, bool)` where the bool indicates whether the entry was already cached. This enables accurate hit/miss tracking instead of counting all get_or_insert calls as misses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`weighted_size()` can be stale without `run_pending_tasks()` (which is async). Revert `approx_size_bytes` to iterating entries so the synchronous `DeepSizeOf` path returns accurate values. Also remove reference to private `keys` module in module-level doc. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert cache module split, CacheBackend trait, and related changes that are in the pluggable-index-cache PR. Keep only the partition serde and FlatBin quantizer changes unique to this branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…acheKey::type_name No implementation uses `self`, so make `type_name` an associated function instead of a method. Update all call sites to use `K::type_name()`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| /// caller. Boxing the future once at the call site (in `LanceCache`) | ||
| /// avoids this lifetime conflict while keeping the trait object-safe. | ||
| /// | ||
| /// The future borrows from the caller's scope and will be `.await`ed within |
There was a problem hiding this comment.
All this description feels kind of redundant / obvious. Like, if im a cache and i keep the loader around im gonna start leaking memory or file handles or something.
My first instinct is that its clutter but maybe in this day of agents it is better safe than sorry?
| async fn num_entries(&self) -> usize; | ||
|
|
||
| /// Total weighted size in bytes of all stored entries (may flush pending operations). | ||
| async fn size_bytes(&self) -> usize; |
There was a problem hiding this comment.
We should assume here that no two entries share buffers? It's probably the best we can do.
There was a problem hiding this comment.
Yes, we do. There are ways we can detect this in the future. You may be interested in this PR: #6229
|
|
||
| /// Approximate number of entries, callable from synchronous contexts. | ||
| /// Backends that cannot provide this cheaply should return 0. | ||
| fn approx_num_entries(&self) -> usize { |
There was a problem hiding this comment.
Would it be better for these to return Option with default of None?
| } | ||
| } | ||
|
|
||
| pub trait CacheKey { |
There was a problem hiding this comment.
Why is there both InternalCacheKey and CacheKey? Should this trait be documented?
| fn type_name() -> &'static str; | ||
| } | ||
|
|
||
| pub trait UnsizedCacheKey { |
There was a problem hiding this comment.
Document this? Why is it needed? (Yes, I know this is just copied from existing but still)
| /// Short, stable string that distinguishes this value type from others in | ||
| /// the cache. Used as the suffix in the encoded cache key (`user_key\0type_name`). | ||
| /// | ||
| /// **Must be unique per value type.** If two `CacheKey` impls return the |
There was a problem hiding this comment.
"Unique per value type" is strange wording. I think you're trying to say that two type names are equal only if their value types are equal?
Or, to put another way, if two value types are not equal they must not have equal type names?
There was a problem hiding this comment.
Yes. This is meant to be used as a key in a registry of types we'll use later for deserialization of cache entries.
| } | ||
|
|
||
| fn type_name() -> &'static str { | ||
| std::any::type_name::<T>() |
There was a problem hiding this comment.
Didn't you say not to use this?
There was a problem hiding this comment.
Added a comment. We can use this internally, but shouldn't call it directly from other crates.
| /// same `type_name` but different `ValueType`s, entries will collide and | ||
| /// downcasts will fail silently (returning `None` on get). | ||
| /// | ||
| /// Must be consistent across crate boundaries — use a short literal, not |
There was a problem hiding this comment.
What does consistent across crate boundaries mean?
Restructure the cache module so the two audiences are clear: - mod.rs: user-facing API (LanceCache, CacheKey, UnsizedCacheKey) - backend.rs: implementor-facing API (CacheBackend, InternalCacheKey) Delete keys.rs, moving CacheKey/UnsizedCacheKey into mod.rs and InternalCacheKey into backend.rs. Also: rename has_prefix -> starts_with, trim verbose get_or_insert doc, add shared-buffer note to approx_size_bytes, improve type_name docs, explain std::any::type_name usage in previous/reader.rs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert cache module split, CacheBackend trait, and related changes that are in the pluggable-index-cache PR. Keep only the partition serde and FlatBin quantizer changes unique to this branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously the Session's index cache was hardcoded to Moka. This adds a
CacheBackendtrait so users can provide their own cache implementation,with
MokaCacheBackendas the default.Serialization and zero-IO reconstruction are in the follow-up PR #6223.