Skip to content

Add vector search Phase 4: filter-aware ANN, SQL radius, HNSW efSearch, quantizers, IVF_ON_DISK, adaptive planner#18119

Merged
xiangfu0 merged 2 commits intoapache:masterfrom
xiangfu0:vector-phase-4
Apr 10, 2026
Merged

Add vector search Phase 4: filter-aware ANN, SQL radius, HNSW efSearch, quantizers, IVF_ON_DISK, adaptive planner#18119
xiangfu0 merged 2 commits intoapache:masterfrom
xiangfu0:vector-phase-4

Conversation

@xiangfu0
Copy link
Copy Markdown
Contributor

@xiangfu0 xiangfu0 commented Apr 7, 2026

Summary

Completes the vector search roadmap (Phase 4) with seven feature areas that close the remaining gaps and leave Pinot's vector stack feature-complete for this release cycle.

  • Filter-aware ANN (FILTER_THEN_ANN): True pre-filter bitmap passed to HNSW and IVF backends via FilterAwareVectorIndexReader for improved recall on selective filters
  • SQL radius search: VECTOR_SIMILARITY_RADIUS(col, vec, threshold) for distance-based filtering with automatic brute-force fallback when ANN candidate pool is saturated
  • HNSW runtime tuning: vectorEfSearch query option via EfSearchAware interface with automatic top-K trimming to preserve predicate cardinality
  • Generic quantizer framework: VectorQuantizerType enum (FLAT/SQ8/SQ4/PQ), ScalarQuantizer with train/encode/decode/serialize (non-FLAT quantizers rejected at config validation until wired into index build path)
  • IVF_ON_DISK: Disk-backed IVF using FileChannel random-access reads with ThreadLocal buffer reuse (no 2GB limit)
  • Adaptive planner: VectorSearchStrategy with selectivity-aware mode selection, wired into FilterPlanNode for automatic pre-filter decisions
  • Metrics: VectorSearchMetrics singleton wired into VectorSimilarityFilterOperator and VectorRadiusFilterOperator for production observability

Files changed

42 files changed, 5,424 insertions, 57 deletions


User Manual

Table & Index Configuration

HNSW (default, supports mutable segments)

```json
{
"fieldConfigList": [{
"name": "embedding",
"indexTypes": ["VECTOR"],
"encodingType": "RAW",
"properties": {
"vectorDimension": "128",
"vectorDistanceFunction": "EUCLIDEAN"
}
}]
}
```

IVF_FLAT (offline only, nprobe-tunable)

```json
{
"properties": {
"vectorIndexType": "IVF_FLAT",
"vectorDimension": "128",
"vectorDistanceFunction": "EUCLIDEAN",
"nlist": "128",
"trainSampleSize": "10000"
}
}
```

IVF_ON_DISK (new — FileChannel-based, low heap)

```json
{
"properties": {
"vectorIndexType": "IVF_ON_DISK",
"vectorDimension": "128",
"vectorDistanceFunction": "EUCLIDEAN",
"nlist": "128"
}
}
```


Vector Query Options — Complete Reference

Pinot exposes five query-time options for tuning vector search behavior. Each serves a distinct, non-overlapping purpose. All are optional — queries without any options use sensible defaults and behave identically to prior releases.

1. vectorNprobe — IVF search effort

Applies to IVF_FLAT, IVF_PQ, IVF_ON_DISK
Type int (≥ 1)
Default 4
What it does Controls how many inverted lists (Voronoi cells) are probed during IVF search. Higher values scan more cells, improving recall at the cost of latency.
When to tune When IVF recall is too low — increase toward nlist for higher recall.

```sql
SET vectorNprobe = 32;
SELECT productId, l2Distance(embedding, ARRAY[...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[...], 10)
ORDER BY dist ASC LIMIT 10
```

2. vectorEfSearch — HNSW search effort

Applies to HNSW
Type int (≥ 1)
Default equal to topK
What it does Controls the size of the dynamic candidate list during HNSW graph traversal. Higher values explore more graph nodes, improving recall. Results are always trimmed back to the predicate's topK — this option only affects which topK are chosen, not how many are returned.
When to tune When HNSW recall matters more than latency — try 100–500 for better recall on large datasets.

```sql
SET vectorEfSearch = 200;
SELECT productId, l2Distance(embedding, ARRAY[...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[...], 10)
ORDER BY dist ASC LIMIT 10
```

3. vectorExactRerank — Accuracy boost via exact distance re-scoring

Applies to All backends
Type boolean
Default `true` for IVF_PQ, `false` for others
What it does When enabled, ANN candidates are re-scored using exact distance computation from the forward index and re-sorted before returning topK. This corrects any approximation error from the ANN index.
When to tune Enable for IVF_PQ or when using quantized indexes to recover accuracy lost from compression. Usually not needed for HNSW or IVF_FLAT.

```sql
SET vectorExactRerank = true;
SET vectorMaxCandidates = 100;
SELECT productId, l2Distance(embedding, ARRAY[...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[...], 10)
ORDER BY dist ASC LIMIT 10
```

4. vectorMaxCandidates — ANN candidate pool size

Applies to All backends (only meaningful with rerank or distance threshold)
Type int (≥ topK)
Default topK × 10
What it does Controls how many ANN candidates are retrieved before applying exact rerank or distance threshold refinement. A larger pool improves recall but increases latency.
When to tune Increase when using `vectorExactRerank=true` or `vectorDistanceThreshold` and recall is insufficient.

```sql
SET vectorExactRerank = true;
SET vectorMaxCandidates = 500;
SELECT productId, l2Distance(embedding, ARRAY[...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[...], 10)
ORDER BY dist ASC LIMIT 10
```

5. vectorDistanceThreshold — Distance cutoff filter

Applies to All backends
Type float
Default not set (pure topK mode)
What it does Adds a distance cutoff to topK results. After ANN candidate retrieval and optional rerank, only results within this distance are returned. The threshold is compared against the value computed by the configured distance function (EUCLIDEAN returns squared L2, COSINE returns 1 − cosine similarity, INNER_PRODUCT returns negated dot product).
When to tune When you want "top-K but only if close enough" — combines topK ranking with a quality gate. For pure radius search without a topK limit, use `VECTOR_SIMILARITY_RADIUS` SQL instead.

```sql
SET vectorDistanceThreshold = 0.5;
SELECT productId, l2Distance(embedding, ARRAY[...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[...], 100)
ORDER BY dist ASC LIMIT 100
```

Option compatibility matrix

Option HNSW IVF_FLAT IVF_PQ IVF_ON_DISK
`vectorNprobe` ignored
`vectorEfSearch` ignored ignored ignored
`vectorExactRerank` ✅ (default on)
`vectorMaxCandidates`
`vectorDistanceThreshold`

SQL Functions

VECTOR_SIMILARITY(column, vector, topK) — Top-K ANN search (existing)

```sql
SELECT productId, l2Distance(embedding, ARRAY[1.0, 2.0, ...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[1.0, 2.0, ...], 10)
ORDER BY dist ASC LIMIT 10
```

VECTOR_SIMILARITY_RADIUS(column, vector, threshold) — Radius search (new)

Returns all documents whose vector distance is within the threshold. Automatically falls back to brute-force scan if the ANN candidate pool is saturated, guaranteeing complete results.
```sql
SELECT productId, l2Distance(embedding, ARRAY[1.0, 2.0, ...]) AS dist
FROM products
WHERE VECTOR_SIMILARITY_RADIUS(embedding, ARRAY[1.0, 2.0, ...], 0.5)
ORDER BY dist ASC LIMIT 100
```

Compound: vector + filter (adaptive planner selects pre-filter vs post-filter)

```sql
SELECT productId, l2Distance(embedding, ARRAY[1.0, 2.0, ...]) AS dist
FROM products
WHERE vectorSimilarity(embedding, ARRAY[1.0, 2.0, ...], 50)
AND category = 'electronics'
ORDER BY dist ASC LIMIT 10
```


Best Practices

  1. Start with HNSW for general use — supports mutable segments and has good default recall
  2. Use IVF_FLAT for large offline segments where you want `vectorNprobe`-tunable recall/speed tradeoff
  3. Use IVF_ON_DISK when heap memory is constrained — same accuracy as IVF_FLAT, FileChannel-based
  4. Tune search effort first: increase `vectorEfSearch` (HNSW) or `vectorNprobe` (IVF) before enabling rerank
  5. Enable rerank for IVF_PQ: it's on by default — increase `vectorMaxCandidates` if recall is still insufficient
  6. Use VECTOR_SIMILARITY_RADIUS for "find all similar items within distance X" use cases
  7. Use vectorDistanceThreshold for "top-K but only if close enough" — quality gate on topK results
  8. Combine vector + filter for category-scoped search — the adaptive planner automatically selects FILTER_THEN_ANN for highly selective filters

Benchmark Results

50,000 vectors, 128 dimensions, EUCLIDEAN distance, 100 queries. Run on local dev machine.

IVF_FLAT nprobe Sweep

nprobe Latency (μs) Recall@10
1 43 0.046
2 58 0.085
4 158 0.152
8 314 0.251
16 596 0.410
32 1,148 0.625
64 2,197 0.854
128 4,092 1.000

Quantizer Comparison

Type Bytes/Vec Compression Recall@10
FLAT 512 1.000
SQ8 128 0.985
SQ4 64 0.874

Filter-Aware ANN (Pre-filter Selectivity Sweep)

Selectivity No-Filter (μs) Pre-Filter (μs) Speedup
100% 353 268 1.32×
50% 266 184 1.45×
20% 270 124 2.17×
10% 252 90 2.79×
1% 195 97 2.01×

Review Feedback Addressed

Maintainer review (Jackie-Jiang)

Comment Resolution
What is the difference between vectorThreshold and vectorDistanceThreshold? Removed `vectorThreshold` — `vectorDistanceThreshold` is the single distance cutoff option

Copilot review (17 comments) — all resolved

Issue Resolution
IvfOnDiskVectorIndexReader Javadoc says "mmap" Updated to "FileChannel-based random-access I/O"
scanInvertedList() heap allocation per call ThreadLocal ByteBuffer reuse + single float[] reuse
supportsFilterAwareSearch=false but implemented Set to true for all backends
HNSW_RELATIVE_SCORE_THRESHOLD is no-op Removed entirely
VectorSearchMetrics unused Wired into both vector operators
ScalarQuantizer.deserialize() unsafe Added bounds checks
mmapEnabled property is no-op Removed from properties
VectorSearchBenchmark runs in CI Added @test(enabled=false)
Benchmark timing divisor wrong Fixed to use actual iteration count
wirePreFilterForVectorOperators eager evaluation Added VectorSearchStrategy selectivity gate
VectorSearchStrategy unused Wired into FilterPlanNode

Codex adversarial review — all resolved

Issue Resolution
vectorEfSearch widens cardinality beyond top-K trimToTopK() clamps results to predicate top-K
VECTOR_SIMILARITY_RADIUS silently truncates at 100K Falls back to brute-force when ANN cap is hit
quantizer config accepted but unused Non-FLAT values rejected at validation

Test plan

  • `VectorBackendCapabilitiesTest` — 8 tests
  • `ScalarQuantizerTest` — 12 tests
  • `IvfFlatFilterAwareTest` — 7 tests
  • `VectorSearchStrategyTest` — 11 tests
  • `VectorSearchParamsTest` — 19 tests
  • `VectorSimilarityFilterOperatorTest` — 21 tests
  • `FilterAwareVectorSearchTest` — 12 tests
  • `VectorRadiusFilterOperatorTest` — 7 tests
  • `VectorCompoundQueryTest` — 9 tests
  • `FunctionRegistryTest` — 2 tests
  • `VectorPhase4Test` — 11 integration tests
  • `VectorTest` — 12 integration tests
  • `IvfFlatVectorTest` — 8 integration tests
  • `IvfPqVectorTest` — 6 integration tests
  • Backward compatibility verified
  • Spotless, checkstyle, license checks pass

🤖 Generated with Claude Code

@xiangfu0 xiangfu0 force-pushed the vector-phase-4 branch 4 times, most recently from 7672932 to a503c55 Compare April 7, 2026 22:19
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 38.46154% with 640 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.92%. Comparing base (2e80bff) to head (7b977b9).

Files with missing lines Patch % Lines
...dex/readers/vector/IvfOnDiskVectorIndexReader.java 0.00% 209 Missing ⚠️
...nt/index/readers/vector/HnswVectorIndexReader.java 1.42% 69 Missing ⚠️
...inot/core/operator/filter/VectorSearchMetrics.java 34.06% 58 Missing and 2 partials ⚠️
...re/operator/filter/VectorRadiusFilterOperator.java 52.08% 42 Missing and 4 partials ⚠️
...t/index/readers/vector/IvfPqVectorIndexReader.java 0.00% 46 Missing ⚠️
...ava/org/apache/pinot/core/plan/FilterPlanNode.java 16.98% 42 Missing and 2 partials ⚠️
...nt/local/segment/index/vector/ScalarQuantizer.java 78.57% 16 Missing and 14 partials ⚠️
...perator/filter/VectorSimilarityFilterOperator.java 64.00% 17 Missing and 10 partials ⚠️
...segment/spi/index/creator/VectorQuantizerType.java 0.00% 26 Missing ⚠️
...ment/local/segment/index/vector/FlatQuantizer.java 0.00% 21 Missing ⚠️
... and 14 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18119      +/-   ##
============================================
- Coverage     63.04%   62.92%   -0.13%     
- Complexity     1617     1621       +4     
============================================
  Files          3202     3213      +11     
  Lines        194718   195719    +1001     
  Branches      30047    30239     +192     
============================================
+ Hits         122760   123154     +394     
- Misses        62233    62769     +536     
- Partials       9725     9796      +71     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 62.89% <38.46%> (-0.12%) ⬇️
java-21 62.89% <38.46%> (-0.14%) ⬇️
temurin 62.92% <38.46%> (-0.13%) ⬇️
unittests 62.92% <38.46%> (-0.13%) ⬇️
unittests1 55.37% <25.67%> (-0.20%) ⬇️
unittests2 33.32% <13.26%> (-0.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 requested review from Jackie-Jiang and Copilot April 8, 2026 01:58
@xiangfu0 xiangfu0 added vector Related to vector similarity search index Related to indexing (general) labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes “Phase 4” of Pinot’s vector search roadmap by adding filter-aware ANN execution, radius/threshold SQL support, HNSW efSearch runtime tuning, a quantizer SPI, an IVF_ON_DISK backend, and additional planning/explain plumbing across the query stack.

Changes:

  • Adds filter-aware ANN plumbing end-to-end (new reader interfaces, HNSW/IVF reader support, AND-node prefilter wiring, explain output).
  • Introduces radius/threshold vector filtering via VECTOR_SIMILARITY_RADIUS(...) (parser, predicate, operator, tests).
  • Adds new runtime query options (e.g., vectorEfSearch, vectorThreshold) plus quantizer framework types and an IVF_ON_DISK reader/backend.

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java Adds new query option keys for efSearch/threshold/HNSW relative score threshold.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/store/SegmentDirectoryPaths.java Resolves IVF_ON_DISK index file path using IVF_FLAT file extension.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/VectorQuantizer.java Adds quantizer SPI for encode/decode/distance on encoded vectors.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/FilterAwareVectorIndexReader.java Adds SPI for pre-filter ANN search with a bitmap constraint.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/EfSearchAware.java Adds SPI for query-scoped HNSW efSearch overrides.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorQuantizerType.java Adds enum for quantizer types (FLAT/SQ8/SQ4/PQ) + parsing helpers.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorIndexConfigValidator.java Validates new “quantizer” property and adds IVF_ON_DISK property set.
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorBackendType.java Adds IVF_ON_DISK backend and updates supported type list / nprobe support.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/VectorSearchBenchmark.java Adds a micro-benchmark-style test for nprobe/quantizers/prefilter performance.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/ScalarQuantizerTest.java Adds unit tests for SQ8/SQ4 training, encode/decode, distance, serialization.
pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatFilterAwareTest.java Adds tests validating IVF_FLAT pre-filter ANN correctness across selectivities.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/VectorIndexType.java Wires IVF_ON_DISK reader/creator selection into segment-local index type factory.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/ScalarQuantizer.java Adds scalar quantizer implementation (SQ8/SQ4) with serialize/deserialize.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/FlatQuantizer.java Adds identity quantizer implementation that stores raw float32 bytes.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfPqVectorIndexReader.java Implements FilterAwareVectorIndexReader for IVF_PQ.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskVectorIndexReader.java Adds disk-backed IVF reader using FileChannel positional reads.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatVectorIndexReader.java Implements FilterAwareVectorIndexReader for IVF_FLAT.
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/HnswVectorIndexReader.java Adds filter-aware HNSW via Lucene filtered Knn query + efSearch override support.
pinot-query-planner/src/main/java/org/apache/pinot/calcite/sql/fun/PinotOperatorTable.java Registers VECTOR_SIMILARITY_RADIUS SQL function for Calcite planning.
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/VectorPhase4Test.java Adds end-to-end integration coverage for efSearch/radius/compound filter queries.
pinot-core/src/test/java/org/apache/pinot/core/operator/filter/VectorSearchStrategyTest.java Adds tests for adaptive vector search strategy selection logic.
pinot-core/src/test/java/org/apache/pinot/core/operator/filter/VectorRadiusFilterOperatorTest.java Adds tests for radius operator (index-assisted + brute-force fallback).
pinot-core/src/test/java/org/apache/pinot/core/operator/filter/FilterAwareVectorSearchTest.java Adds tests for filter-aware dispatch and explain-context extensions.
pinot-core/src/test/java/org/apache/pinot/core/function/FunctionRegistryTest.java Updates ignored filter kinds list to include VECTOR_SIMILARITY_RADIUS.
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java Wires pre-filter bitmaps into vector operators and adds radius predicate operator construction.
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSimilarityFilterOperator.java Adds pre-filter ANN path, efSearch dispatch, threshold filtering, and explain enhancements.
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSearchStrategy.java Introduces adaptive planner to choose EXACT_SCAN vs FILTER_THEN_ANN vs POST_FILTER_ANN.
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSearchParams.java Extends query option parsing to include efSearch/threshold/HNSW relative score threshold.
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSearchMode.java Adds enum describing how ANN interacts with filters (post-filter vs pre-filter vs scan).
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSearchMetrics.java Adds singleton for aggregating vector-search metrics (counters/budgets).
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorRadiusFilterOperator.java Adds operator implementing index-assisted radius filtering with exact forward-index refinement.
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorExplainContext.java Extends explain context with efSearch/threshold/searchMode/selectivity/HNSW threshold fields.
pinot-common/src/test/java/org/apache/pinot/sql/parsers/CalciteSqlParserVectorRadiusTest.java Adds parser tests for VECTOR_SIMILARITY_RADIUS and compound vector predicates.
pinot-common/src/test/java/org/apache/pinot/common/request/context/predicate/VectorSimilarityRadiusPredicateTest.java Adds unit tests for VectorSimilarityRadiusPredicate behavior and equality.
pinot-common/src/main/java/org/apache/pinot/sql/parsers/rewriter/PredicateComparisonRewriter.java Adds operand validation for VECTOR_SIMILARITY_RADIUS.
pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java Adds filter validation for VECTOR_SIMILARITY_RADIUS signatures/literals.
pinot-common/src/main/java/org/apache/pinot/sql/FilterKind.java Adds VECTOR_SIMILARITY_RADIUS filter kind.
pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java Adds parsing helpers for vectorEfSearch/vectorThreshold/HNSW relative score threshold.
pinot-common/src/main/java/org/apache/pinot/common/request/context/RequestContextUtils.java Builds VectorSimilarityRadiusPredicate from both thrift and function-context parsing paths.
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/VectorSimilarityRadiusPredicate.java Adds new predicate type for radius/threshold vector filtering.
pinot-common/src/main/java/org/apache/pinot/common/request/context/predicate/Predicate.java Adds VECTOR_SIMILARITY_RADIUS predicate type.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 41 out of 41 changed files in this pull request and generated 9 comments.

Comment thread pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 42 changed files in this pull request and generated 5 comments.

Comment on lines +185 to +198
/**
* Trims an oversampled bitmap to exactly topK entries. Lucene returns results ordered by
* score, and the collector appends them in that order, so keeping the first topK doc IDs
* from the bitmap's iterator preserves the best matches.
*/
private static MutableRoaringBitmap trimToTopK(MutableRoaringBitmap oversampled, int topK) {
MutableRoaringBitmap trimmed = new MutableRoaringBitmap();
org.roaringbitmap.IntIterator it = oversampled.getIntIterator();
int count = 0;
while (it.hasNext() && count < topK) {
trimmed.add(it.next());
count++;
}
return trimmed;
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trimToTopK() assumes the RoaringBitmap iterator preserves Lucene score order, but Roaring bitmaps iterate docIds in ascending numeric order (set semantics). When effectiveK > topK this will return an arbitrary subset (smallest docIds), not the topK nearest neighbors, breaking correctness for vectorEfSearch oversampling (and also for pre-filtered HNSW path). Consider collecting results in score order (e.g., via TopDocs/ScoreDoc list or an ordered int list) and trimming that ordered list before converting to a bitmap.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Removed the broken trimToTopK() approach entirely. efSearch is now an explain-only hint — Lucene's KnnFloatVectorQuery always returns exactly topK results (Lucene handles search quality internally via its beam width). No oversampling or trimming in the reader.

Comment on lines +940 to +943
/** Distance threshold for vector search. Only results within this threshold are returned.
* The threshold is compared against the distance value computed by the configured distance
* function: EUCLIDEAN returns sqrt(sum of squared diffs), COSINE returns 1 - cosine_similarity,
* and INNER_PRODUCT/DOT_PRODUCT returns the negated dot product. */
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VECTOR_THRESHOLD Javadoc claims EUCLIDEAN compares against sqrt(sum of squared diffs), but VectorFunctions.euclideanDistance() currently returns the squared L2 distance (no sqrt). This mismatch will confuse users configuring thresholds. Please update the comment to match the actual semantics (and/or reference l2Distance() for sqrt-L2).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Updated VECTOR_DISTANCE_THRESHOLD Javadoc to clarify that EUCLIDEAN/L2 uses squared L2 (sum of squared diffs, no sqrt), matching VectorFunctions.euclideanDistance() actual behavior.

Comment on lines 384 to 388
// Record search metrics for observability
VectorSearchMetrics.getInstance().recordSearch(_vectorSearchMode, _backendType);

return annResults;
} finally {
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VectorSearchMetrics.recordSearch() is only invoked on the fall-through path (after rerank/threshold checks). If the method returns early (approx threshold refinement, exact rerank, or vectorThreshold post-filter), metrics are never recorded, so production counters will systematically under-report usage. Consider recording metrics in a finally (or right before each return), and include the chosen _vectorSearchMode for all paths.

Suggested change
// Record search metrics for observability
VectorSearchMetrics.getInstance().recordSearch(_vectorSearchMode, _backendType);
return annResults;
} finally {
return annResults;
} finally {
// Record search metrics for observability on all execution paths.
VectorSearchMetrics.getInstance().recordSearch(_vectorSearchMode, _backendType);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Moved VectorSearchMetrics.recordSearch() to the finally block so it fires on every code path (threshold refinement, exact rerank, post-filter, and fall-through).

Comment on lines 299 to +327
try {
// 1. Configure backend-specific parameters via interfaces
configureBackendParams(column);
refreshExplainContext(null);
explainContext = _vectorExplainContext;

// 2. Determine effective search count (higher if rerank is enabled)
int searchCount = explainContext.getEffectiveSearchCount();

// 3. Execute ANN search
ImmutableRoaringBitmap annResults = _vectorIndexReader.getDocIds(queryVector, searchCount);
// 3. Execute ANN search (with pre-filter if available)
ImmutableRoaringBitmap preFilter = _preFilterBitmap;
ImmutableRoaringBitmap annResults;
if (preFilter != null && _vectorIndexReader instanceof FilterAwareVectorIndexReader) {
FilterAwareVectorIndexReader filterAwareReader = (FilterAwareVectorIndexReader) _vectorIndexReader;
if (filterAwareReader.supportsPreFilter()) {
_vectorSearchMode = VectorSearchMode.FILTER_THEN_ANN;
annResults = filterAwareReader.getDocIds(queryVector, searchCount, preFilter);
LOGGER.debug("Pre-filter ANN search on column: {}, filterCardinality: {}, filterSelectivity: {}",
column, preFilter.getCardinality(),
_numDocs > 0 ? (double) preFilter.getCardinality() / _numDocs : 0.0);
} else {
_vectorSearchMode = VectorSearchMode.POST_FILTER_ANN;
annResults = _vectorIndexReader.getDocIds(queryVector, searchCount);
}
} else {
_vectorSearchMode = preFilter != null
? VectorSearchMode.POST_FILTER_ANN : VectorSearchMode.POST_FILTER_ANN;
annResults = _vectorIndexReader.getDocIds(queryVector, searchCount);
}
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refreshExplainContext() is called (line 302) before _vectorSearchMode is set based on whether pre-filter is actually used. As a result, the VectorExplainContext created here will typically always carry POST_FILTER_ANN even when the operator executes FILTER_THEN_ANN, making explain/debug metadata inconsistent. Consider deciding _vectorSearchMode before refreshExplainContext(), or refreshing the explain context again after selecting the mode.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Added a second refreshExplainContext() call in the finally block, after _vectorSearchMode has been determined during execution. The explain context now always reflects the actual mode used.

Comment on lines 31 to 35
* immutable/offline segments only in phase 1.</li>
* <li>{@link #IVF_PQ} - Inverted File with product-quantized vectors. Supported for
* immutable/offline segments only in phase 2.</li>
* <li>{@link #IVF_ON_DISK} - Inverted File with disk-backed (mmap) vectors.</li>
* </ul>
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum-level Javadoc for IVF_ON_DISK still describes it as "disk-backed (mmap) vectors" (line 34), but the implementation and other docs indicate it uses FileChannel positional reads (no mmap). Please align the Javadoc to avoid implying memory-mapping semantics that are not actually used.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Updated enum-level Javadoc from 'disk-backed (mmap) vectors' to 'disk-backed vectors (FileChannel random reads)'.

*/
@Nullable
public static Float getVectorThreshold(Map<String, String> queryOptions) {
String threshold = queryOptions.get(QueryOptionKey.VECTOR_THRESHOLD);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between this threshold and the distance threshold?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — removed vectorThreshold entirely. The existing vectorDistanceThreshold (from Phase 3) already provides the same distance cutoff functionality. No need for a duplicate option.

@xiangfu0 xiangfu0 force-pushed the vector-phase-4 branch 3 times, most recently from 8d9aae2 to 219f2dc Compare April 9, 2026 04:04
@xiangfu0 xiangfu0 requested review from Jackie-Jiang and Copilot April 9, 2026 04:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 42 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/HnswVectorIndexReader.java:146

  • The query-scoped efSearch override is stored in _efSearchOverride, but it is never read/applied in either getDocIds() implementation. Both code paths always construct KnnFloatVectorQuery without using the override, so the vectorEfSearch query option becomes a no-op at runtime. Either wire efSearch into Lucene’s kNN search (if supported by the Lucene version in use), or reject/ignore the query option explicitly and align VectorBackendType.supportsRuntimeSearchParams/PR docs accordingly.
  @Override
  public void setEfSearch(int efSearch) {
    if (efSearch < 1) {
      throw new IllegalArgumentException("efSearch must be >= 1, got: " + efSearch);
    }
    _efSearchOverride.set(efSearch);
  }

  @Override
  public void clearEfSearch() {
    _efSearchOverride.remove();
  }

  /**
   * Returns the efSearch value for debug/explain output, or 0 if not set.
   */
  int getEffectiveEfSearch() {
    Integer efSearch = _efSearchOverride.get();
    return efSearch != null ? efSearch : 0;
  }

  @Override
  public MutableRoaringBitmap getDocIds(float[] searchQuery, int topK) {
    MutableRoaringBitmap docIds = new MutableRoaringBitmap();
    Collector docIDCollector = new HnswDocIdCollector(docIds, _docIdTranslator);
    try {
      // Lucene Query Parser is JavaCC based. It is stateful and should
      // be instantiated per query. Analyzer on the other hand is stateless
      // and can be created upfront.
      QueryParser parser = new QueryParser(_column, null);

      if (_useANDForMultiTermQueries) {
        parser.setDefaultOperator(QueryParser.Operator.AND);
      }
      KnnFloatVectorQuery knnFloatVectorQuery = new KnnFloatVectorQuery(_column, searchQuery, topK);
      _indexSearcher.search(knnFloatVectorQuery, docIDCollector);
      return docIds;

pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorBackendType.java:125

  • VectorBackendType declares HNSW supportsRuntimeSearchParams(false), but the Phase 4 changes add a runtime HNSW param (vectorEfSearch) that is dispatched via EfSearchAware in VectorSimilarityFilterOperator. This capability flag should be updated to reflect actual runtime tuning support (or, if efSearch remains unsupported/no-op, the query option should be rejected/ignored consistently).

Comment on lines +168 to +171
case IVF_ON_DISK:
// IVF_ON_DISK reuses the IVF_FLAT file format but loads via mmap instead of heap
return findFlatVectorIndexFile(segmentIndexDir, column,
V1Constants.Indexes.VECTOR_IVF_FLAT_INDEX_FILE_EXTENSION);
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IVF_ON_DISK case comment still says the reader “loads via mmap instead of heap”, but IVF_ON_DISK is implemented via FileChannel positional reads (no mmap). Please update this comment to match the actual access pattern (e.g., “FileChannel random-access reads” / “on-demand reads”), to avoid reintroducing the previously removed mmap terminology.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Updated comment from 'loads via mmap instead of heap' to 'FileChannel random-access reads'.

Comment on lines +497 to +557
private void wirePreFilterForVectorOperators(List<BaseFilterOperator> childOperators, int numDocs) {
if (childOperators.size() < 2) {
return;
}

// Find vector similarity operators that support pre-filtering
List<VectorSimilarityFilterOperator> vectorOps = new ArrayList<>();
List<BaseFilterOperator> nonVectorOps = new ArrayList<>();
for (BaseFilterOperator op : childOperators) {
if (op instanceof VectorSimilarityFilterOperator) {
vectorOps.add((VectorSimilarityFilterOperator) op);
} else {
nonVectorOps.add(op);
}
}

if (vectorOps.isEmpty() || nonVectorOps.isEmpty()) {
return;
}

// Evaluate non-vector filters and combine their bitmaps to produce a pre-filter.
// Only do this if the non-vector operators can produce bitmaps efficiently.
boolean allCanProduceBitmaps = true;
for (BaseFilterOperator op : nonVectorOps) {
if (!op.canProduceBitmaps()) {
allCanProduceBitmaps = false;
break;
}
}

if (!allCanProduceBitmaps) {
return;
}

// Combine non-vector filter bitmaps via AND
MutableRoaringBitmap combinedBitmap = null;
for (BaseFilterOperator op : nonVectorOps) {
BitmapCollection bitmapCollection = op.getBitmaps();
org.roaringbitmap.buffer.ImmutableRoaringBitmap reduced = bitmapCollection.reduce();
if (combinedBitmap == null) {
combinedBitmap = reduced.toMutableRoaringBitmap();
} else {
combinedBitmap.and(reduced);
}
}

if (combinedBitmap == null || combinedBitmap.isEmpty()) {
return;
}

// Use VectorSearchStrategy to decide whether pre-filtering is worthwhile based on
// the estimated selectivity. Only pass the bitmap if the strategy recommends
// FILTER_THEN_ANN; otherwise fall back to the default post-filter path.
int estimatedFilteredDocs = combinedBitmap.getCardinality();
VectorSearchStrategy.Decision decision = VectorSearchStrategy.decide(
numDocs, estimatedFilteredDocs,
/* hasVectorIndex= */ true,
/* indexSupportsPreFilter= */ true,
/* isMutableSegment= */ false,
/* backendType= */ null,
/* searchParams= */ null);
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wirePreFilterForVectorOperators() always assumes indexSupportsPreFilter=true when calling VectorSearchStrategy.decide(), and it eagerly evaluates/ANDs all non-vector bitmaps before that decision. If the actual VectorSimilarityFilterOperator’s underlying reader can’t use pre-filtering (e.g., not a FilterAwareVectorIndexReader, or supportsPreFilter() returns false), this work becomes pure overhead and the query will still execute in POST_FILTER_ANN. Consider determining pre-filter capability from the vector operator/reader (or adding an accessor on VectorSimilarityFilterOperator) and returning early before bitmap materialization when pre-filter isn’t actually usable.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Added supportsPreFilter() accessor on VectorSimilarityFilterOperator. wirePreFilterForVectorOperators() now checks this before materializing non-vector bitmaps — returns early if no vector operator actually supports pre-filtering. Also only passes the bitmap to operators that support it.

recording.setColumnName(_column);
recording.setFilter(FilterType.INDEX, "VECTOR_SIMILARITY_RADIUS");
recording.setInputDataType(FieldSpec.DataType.FLOAT, false);
recording.setNumDocsMatchingAfterFilter(matches.getCardinality());
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

record() sets recording.setNumDocsMatchingAfterFilter(...) twice (same value). This looks like an accidental duplication and can be removed to keep tracing instrumentation clean and avoid confusion when this code is modified later.

Suggested change
recording.setNumDocsMatchingAfterFilter(matches.getCardinality());

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: Removed the duplicate setNumDocsMatchingAfterFilter() call.

@xiangfu0 xiangfu0 force-pushed the vector-phase-4 branch 2 times, most recently from 65d186b to 52f0b9e Compare April 9, 2026 06:37
@xiangfu0 xiangfu0 force-pushed the vector-phase-4 branch 2 times, most recently from d51c9bb to edd1985 Compare April 9, 2026 07:02
…h, quantizers, IVF_ON_DISK, adaptive planner

This completes the vector search roadmap with seven feature areas:

1. Filter-aware ANN (FILTER_THEN_ANN): Pre-filter bitmap passed to HNSW/IVF
   backends for improved recall on selective filters
2. SQL surface: VECTOR_SIMILARITY_RADIUS for threshold/radius search
3. HNSW runtime tuning: vectorEfSearch query option via EfSearchAware interface
4. Generic quantizer framework: VectorQuantizerType (FLAT/SQ8/SQ4/PQ),
   ScalarQuantizer with train/encode/decode/serialize
5. IVF_ON_DISK: Disk-backed IVF via FileChannel reads (no 2GB limit)
6. Adaptive planner: VectorSearchStrategy with selectivity-aware mode selection
7. Metrics: VectorSearchMetrics singleton for observability

All existing configs, query options, and SQL are backward-compatible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ervability comments

- Fix CRITICAL: when both vectorExactRerank=true and vectorDistanceThreshold are set, the
  threshold-only branch fired first and returned early, bypassing exact rerank entirely.
  Now exact rerank takes priority and applies the threshold during the rerank step.
- Remove dead code: second threshold block (lines 382-390) that was unreachable because
  the first threshold branch always returned before it was reached.
- Remove duplicate setNumDocsMatchingAfterFilter call in record() (was called twice).
- Add LOGGER.warn when vectorEfSearch is set, making it visible that the option currently
  only affects EXPLAIN output and does not change Lucene graph traversal.
- Update VectorSimilarityFilterOperator Javadoc to list all 4 backends (was HNSW+IVF_FLAT only).
- Add comment in FilterPlanNode explaining why backendType/searchParams are null at the
  pre-filter wiring stage.
- Add comment in VectorIndexType noting IVF_ON_DISK reuses the IVF_FLAT file extension.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@xiangfu0 xiangfu0 merged commit 85e2083 into apache:master Apr 10, 2026
16 checks passed
xiangfu0 pushed a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 10, 2026
- Filter-aware ANN (FILTER_THEN_ANN) with pre-filter bitmap support
- VECTOR_SIMILARITY_RADIUS SQL function for radius/distance search
- vectorEfSearch query option for HNSW runtime tuning
- IVF_ON_DISK disk-backed index type
- Adaptive planner (VectorSearchStrategy) with selectivity-aware mode selection
- Quantizer framework (FLAT/SQ8/SQ4/PQ)
- VectorSearchMetrics for production observability

Upstream PR: apache/pinot#18119
@xiangfu0
Copy link
Copy Markdown
Contributor Author

📝 Documentation update PR opened for this feature: pinot-contrib/pinot-docs#730

This PR documents all Phase 4 features:

  • Filter-aware ANN (FILTER_THEN_ANN)
  • VECTOR_SIMILARITY_RADIUS SQL function
  • vectorEfSearch query option for HNSW runtime tuning
  • IVF_ON_DISK disk-backed index type
  • Adaptive query planner (VectorSearchStrategy)
  • Quantizer framework (FLAT/SQ8/SQ4/PQ)
  • VectorSearchMetrics for production observability

xiangfu0 added a commit to pinot-contrib/pinot-docs that referenced this pull request Apr 10, 2026
Adds comprehensive documentation for vector search Phase 4 features from
apache/pinot#18119.

## Features Documented

- **Filter-aware ANN (FILTER_THEN_ANN)**: Pre-filter bitmap passed to
HNSW/IVF backends via FilterAwareVectorIndexReader for improved recall
on selective filters
- **VECTOR_SIMILARITY_RADIUS SQL function**: Distance-based filtering
without fixed top-K limit, with automatic brute-force fallback
- **HNSW vectorEfSearch query option**: Runtime tuning of search beam
width without index rebuild via EfSearchAware interface
- **Generic quantizer framework**: VectorQuantizerType enum
(FLAT/SQ8/SQ4/PQ) with ScalarQuantizer train/encode/decode/serialize
- **IVF_ON_DISK index type**: Disk-backed IVF using FileChannel
random-access reads with ThreadLocal buffer reuse
- **Adaptive query planner (VectorSearchStrategy)**: Selectivity-aware
mode selection wired into FilterPlanNode
- **Vector search metrics (VectorSearchMetrics)**: Production
observability singleton tracking ANN candidates, reranking, filtering,
and latency

## Documentation Updates

Updated `/build-with-pinot/indexing/vector-index.md` with:
- Index configuration for all types (HNSW, IVF_FLAT, IVF_PQ,
IVF_ON_DISK)
- Quantizer types and scalar quantization examples (SQ8, SQ4)
- SQL syntax for VECTOR_SIMILARITY_RADIUS()
- Filter-aware ANN usage patterns
- HNSW runtime tuning with vectorEfSearch
- Adaptive planner behavior and selectivity-based mode selection
- Vector search metrics reference
- Complete end-to-end semantic search example
- Query options reference for vector-specific settings

Upstream PR: apache/pinot#18119

Co-authored-by: Pinot Docs Bot <xiang@pinot-docs-bot.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

index Related to indexing (general) vector Related to vector similarity search

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants