perf(fts): add block-max skip and score early-exit to ConjunctionIterator #441

Open: egolearner wants to merge 1 commit into alibaba:main from egolearner:feat/fts-conjunction-block-max-pruning
@egolearner (Collaborator):

  • Block-max skip: do_next() skips an entire block (128 docs) when the score upper bound from block_max_info_for() shows that no doc in the block can be competitive
  • Score early-exit: score() stops accumulating as soon as the partial sum plus the remaining upper bound cannot beat the threshold
  • Phrase forwarding: PhraseDocIterator propagates min_competitive_score to its inner conjunction, which enables both optimizations above for phrase queries
  • optIsRequired: when the block_max of the must clauses alone is below the threshold, should clauses are promoted to required, so docs that match no should clause are skipped
  • DisjunctionIterator::advance() bypasses the WAND loop so that target docs are not incorrectly pruned when the disjunction is used as a should clause
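
To make the first two bullets concrete, here is a minimal, self-contained C++ sketch of the idea. It is not the code from this PR: `Term`, `BlockInfo`, `next_match`, and `score_with_early_exit` are hypothetical names, block maxima are recomputed on the fly instead of being read from the index, and the choice of `<` when comparing against the threshold is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

constexpr uint32_t kBlockSize = 128;  // postings per block, as in the PR text
constexpr uint32_t kNoMoreDocs = std::numeric_limits<uint32_t>::max();

struct BlockInfo {
  float max_score;    // upper bound on any score inside the block
  uint32_t last_doc;  // last doc id in the block, kNoMoreDocs if exhausted
};

// Hypothetical in-memory posting list. A real index would store block maxima
// precomputed; this toy scans the block each time.
struct Term {
  std::vector<uint32_t> docs;  // sorted doc ids
  std::vector<float> scores;   // scores[i] belongs to docs[i]
  std::size_t pos = 0;         // cursor into docs

  // Index of the first posting at or after the cursor with doc id >= target.
  std::size_t seek(uint32_t target) const {
    auto first = docs.begin() + static_cast<std::ptrdiff_t>(pos);
    return static_cast<std::size_t>(
        std::lower_bound(first, docs.end(), target) - docs.begin());
  }
  uint32_t advance(uint32_t target) {
    pos = seek(target);
    return pos < docs.size() ? docs[pos] : kNoMoreDocs;
  }
  float max_score() const {
    return scores.empty() ? 0.0f
                          : *std::max_element(scores.begin(), scores.end());
  }
  // Bound and extent of the block holding the first posting >= target.
  BlockInfo block_info_for(uint32_t target) const {
    std::size_t p = seek(target);
    if (p == docs.size()) return {0.0f, kNoMoreDocs};
    std::size_t begin = p / kBlockSize * kBlockSize;
    std::size_t end = std::min<std::size_t>(begin + kBlockSize, docs.size());
    float m = *std::max_element(scores.begin() + static_cast<std::ptrdiff_t>(begin),
                                scores.begin() + static_cast<std::ptrdiff_t>(end));
    return {m, docs[end - 1]};
  }
};

// Block-max skip: returns the next doc >= target present in every term,
// jumping over doc ranges whose summed block maxima are below the threshold.
uint32_t next_match(std::vector<Term>& terms, uint32_t target, float threshold,
                    std::size_t* blocks_skipped) {
  assert(!terms.empty());
  while (target != kNoMoreDocs) {
    float bound = 0.0f;
    uint32_t range_end = kNoMoreDocs;
    for (const Term& t : terms) {
      BlockInfo info = t.block_info_for(target);
      if (info.last_doc == kNoMoreDocs) return kNoMoreDocs;  // a clause is exhausted
      bound += info.max_score;
      range_end = std::min(range_end, info.last_doc);
    }
    if (bound < threshold) {  // nothing in [target, range_end] is competitive
      ++*blocks_skipped;
      target = range_end + 1;
      continue;
    }
    uint32_t cand = terms[0].advance(target);
    if (cand > range_end) { target = cand; continue; }  // re-check bounds there
    uint32_t next_target = cand;
    for (std::size_t i = 1; i < terms.size() && next_target == cand; ++i) {
      next_target = terms[i].advance(cand);
    }
    if (next_target == cand) return cand;  // all clauses agree on this doc
    target = next_target;                  // leapfrog to the larger doc id
  }
  return kNoMoreDocs;
}

// Score early-exit: sums the scores at each term's cursor and gives up once
// the partial sum plus the bound of the remaining terms is below the threshold.
bool score_with_early_exit(const std::vector<Term>& terms, float threshold,
                           float* out) {
  std::vector<float> rest(terms.size() + 1, 0.0f);  // rest[i]: bound of terms i..end
  for (std::size_t i = terms.size(); i-- > 0;) {
    rest[i] = rest[i + 1] + terms[i].max_score();
  }
  float sum = 0.0f;
  for (std::size_t i = 0; i < terms.size(); ++i) {
    sum += terms[i].scores[terms[i].pos];
    if (sum + rest[i + 1] < threshold) return false;  // cannot reach the threshold
  }
  *out = sum;
  return true;
}
```

A caller would use `next_match` to position all clauses on a candidate and then call `score_with_early_exit` on it, raising the threshold as the top-k heap fills. The sketch uses the global per-term maximum as the remaining bound in the scoring step; whether the PR uses block-level or term-level bounds there is not stated in the description.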

Benchmark (500k docs, Quora dataset, BitPacked mode):
AND queries: 22-38% less time (topk=10: 3.65s → 2.86s; topk=3: 3.63s → 2.26s)
Phrase queries: 33% less time (205s → 137s)