This PR adds full-text search (FTS) support to VectorDBBench so that the benchmarking tool can evaluate Milvus's BM25 full-text search performance. The feature is based on the MS MARCO dataset and supports test cases at several scales.
Main Achievements
Adapted 3 FTS Performance Test Cases: Support for 100K, 5M, and 8.8M scale MS MARCO datasets. Only the 100K version is submitted in this PR.
Complete FTS Dataset Management: Support for reading, parsing, and batch processing of TSV format data files
Milvus FTS Client Integration: Implemented full-text document insertion and BM25 search functionality
FTS-Specific Evaluation Metrics: Added calculation of Recall@K, NDCG@K, MRR and other metrics
Frontend Interface Support: Added FTS test case configuration and parameter settings in the Web UI
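The TSV reading and batch processing mentioned in the list above could look roughly like the sketch below. It is not the code from this PR: the function name `iter_tsv_batches` is hypothetical, and the two-column `id<TAB>text` layout is an assumption about the data files.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def iter_tsv_batches(
    lines: Iterable[str], batch_size: int
) -> Iterator[List[Tuple[str, str]]]:
    """Yield lists of (doc_id, text) pairs, at most batch_size per list.

    Hypothetical helper; assumes each line is `id<TAB>text`.
    """
    # QUOTE_NONE keeps quotation marks inside passages as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    # Skip malformed lines that do not have both columns.
    rows = ((row[0], row[1]) for row in reader if len(row) >= 2)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch


# Usage with an in-memory sample instead of a real data file.
sample = io.StringIO("0\tfirst passage\n1\tsecond passage\n2\tthird passage\n")
batches = list(iter_tsv_batches(sample, batch_size=2))
```

Reading lazily keeps memory use bounded by the batch size, which matters for the larger dataset scales.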
Core Features
1. Dataset Support
FtsDatasetManager to manage MS MARCO datasets
2. Milvus FTS Integration
insert_fulltext() method: Support batch insertion of full-text documents
search_fulltext() method: Full-text search based on BM25 algorithm
3. Test Execution Engine
SerialFtsInsertRunner: FTS document insertion executor
SerialSearchRunner and MultiProcessingSearchRunner: Support FTS search testing
4. Evaluation System
calc_recall_fts() and calc_ndcg_fts() functions
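For reviewers who want a reference point for the metrics, here is a minimal sketch of the textbook definitions of Recall@K, NDCG@K, and reciprocal rank (MRR is its mean over queries). The function names below are hypothetical and are not the `calc_recall_fts()` or `calc_ndcg_fts()` implementations from this PR, which may differ, for example in how recall is normalized. The sketch assumes retrieved IDs are unique and relevance gains are non-negative.

```python
import math
from typing import Dict, Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(retrieved: Sequence[str], gains: Dict[str, float], k: int) -> float:
    """DCG of the top k results divided by the DCG of an ideal ranking."""
    # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(retrieved[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Usage: one query whose second result is the only relevant hit.
retrieved = ["d3", "d1", "d7"]
relevant = {"d1", "d2"}
gains = {"d1": 1.0, "d2": 1.0}
recall = recall_at_k(retrieved, relevant, k=3)
ndcg = ndcg_at_k(retrieved, gains, k=3)
rr = reciprocal_rank(retrieved, relevant)
```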