
Add cuvs-bench-elastic: HTTP backend for Elasticsearch GPU vector search #1907

Draft
afourniernv wants to merge 8 commits into rapidsai:main from afourniernv:fea-1856-cuvs-lucene-backend

Conversation

@afourniernv

Introduce cuvs-bench-elastic as an optional plugin for cuvs-bench that provides an Elasticsearch backend. The backend communicates with Elasticsearch via HTTP and supports HNSW indexing with optional GPU acceleration when using the Elasticsearch GPU image (cuVS-accelerated vector search).

  • Add cuvs_bench_elastic package with backend and config loader entry points
  • Extend cuvs_bench registry and search spaces for pluggable backends
  • Add elastic and integration optional dependencies to cuvs-bench
  • Add modularization tests and integration test scaffolding (disabled until CI has ES GPU image, cuVS libs, and GPU runner)

@copy-pr-bot

copy-pr-bot bot commented Mar 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cjnolet cjnolet added the improvement (Improves an existing functionality) and non-breaking (Introduces a non-breaking change) labels Mar 10, 2026
@cjnolet cjnolet moved this to In Progress in Unstructured Data Processing Mar 10, 2026
    ep.load()()
except ImportError as e:
    if "elasticsearch" in str(e).lower():
        raise ImportError(

Nice. This is what I discussed with @cjnolet pertaining to lazy imports, using Milvus as an example:

class MilvusBackend(BenchmarkBackend):
    def __init__(self, config: Dict[str, Any]):
        super().__init__(config)
        try:
            from pymilvus import connections, Collection
        except ImportError:
            raise ImportError(
                "pymilvus is required for MilvusBackend. "
                "Install with: pip install pymilvus"
            )
        connections.connect(host=config["host"], port=config["port"])

Introduce cuvs-bench-elastic as an optional plugin for cuvs-bench that
provides an Elasticsearch backend. The backend communicates with
Elasticsearch via HTTP and supports HNSW indexing with optional GPU
acceleration when using the Elasticsearch GPU image (cuVS-accelerated
vector search).

- Add cuvs_bench_elastic package with backend and config loader entry points
- Extend cuvs_bench registry and search spaces for pluggable backends
- Add elastic and integration optional dependencies to cuvs-bench
- Add modularization tests and integration test scaffolding (disabled until
  CI has ES GPU image, cuVS libs, and GPU runner)

Signed-off-by: Alex Fournier <afournier@nvidia.com>
Use single-doc format (_index, _id, vector_field) instead of two-part
NDJSON (index action + source) so ES accepts the bulk request.

Signed-off-by: Alex Fournier <afournier@nvidia.com>
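The single-doc bulk format this commit describes maps naturally onto `elasticsearch.helpers.bulk`, which accepts one dict per document (with `_index` and `_id` inline) and builds the two-line NDJSON wire format itself. A sketch under that assumption; the index name and `vector_field` field name are illustrative:

```python
# Hedged sketch of the single-doc bulk action format: one dict per
# document with _index/_id metadata inline, as accepted by
# elasticsearch.helpers.bulk.
import numpy as np


def make_bulk_actions(index_name, vectors):
    """Yield one action dict per vector row."""
    for i, row in enumerate(vectors):
        yield {
            "_index": index_name,
            "_id": str(i),
            "vector_field": row.tolist(),
        }


# Usage (requires a running cluster):
#   from elasticsearch import Elasticsearch, helpers
#   helpers.bulk(Elasticsearch("http://localhost:9200"),
#                make_bulk_actions("bench", vectors))
```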
Expose ELASTIC constant and convenience wrappers for build-only,
search-only, or full benchmark runs.

Signed-off-by: Alex Fournier <afournier@nvidia.com>
@afourniernv afourniernv force-pushed the fea-1856-cuvs-lucene-backend branch from ec0b3e3 to 7b25521 on March 20, 2026 16:45
- Document run_build, run_search, run_benchmark convenience API
- Document ELASTIC constant and orchestrator usage
- Add username/password support in config loader (converts to basic_auth)

Signed-off-by: Alex Fournier <afournier@nvidia.com>
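The username/password support this commit adds could translate to client kwargs roughly like this (hypothetical helper; only the config keys and the `basic_auth` target come from the commit message — elasticsearch-py 8.x expects a `(user, password)` tuple there):

```python
# Illustrative config-loader helper: convert username/password keys
# into elasticsearch-py client kwargs with a basic_auth tuple.
def to_client_kwargs(config: dict) -> dict:
    """Translate loader config keys into elasticsearch-py client kwargs."""
    host = config.get("host", "localhost")
    port = config.get("port", 9200)
    kwargs = {"hosts": [f"http://{host}:{port}"]}
    user, password = config.get("username"), config.get("password")
    if user is not None and password is not None:
        # elasticsearch-py 8.x takes a (user, password) tuple here
        kwargs["basic_auth"] = (user, password)
    return kwargs
```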


@pytest.fixture(scope="module")
def elasticsearch_container():

Why a container? We should just be able to assume an elasticsearch cluster is ready to accept requests, right? cuvs-bench doesn't need to be self-contained, just be able to send the proper requests to an existing elasticsearch cluster. Or am I missing a big detail here?


This just seems unnecessary.

class ElasticBackend(BenchmarkBackend):
    """Elasticsearch GPU backend for vector benchmarking."""

    def __init__(self, config: Dict[str, Any]):

Oh I see- is the container just for the integration tests?

@@ -0,0 +1,50 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION.
# SPDX-License-Identifier: Apache-2.0


We can't have different packages deployed for every backend. This significantly adds to the maintenance burden as every new package now needs to be versioned, deployed, and audited for dependency trails. Let's consolidate this into the existing cuvs-bench package. All dependencies on anything elasticsearch or cuvs-lucene should be soft dependencies in Python (that is, they should test if they can import the package and if they can't, throw a warning and fail gracefully).
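One way to implement the soft-dependency pattern suggested here: probe the optional import once, warn, and degrade gracefully. The module path `cuvs_bench.backends.elasticsearch` appears later in this PR; the helper names are illustrative:

```python
# Sketch of a soft dependency: check availability without importing,
# warn and skip registration when the optional package is missing.
import importlib.util
import warnings


def elasticsearch_available() -> bool:
    """True if the optional elasticsearch client can be imported."""
    return importlib.util.find_spec("elasticsearch") is not None


def register_elastic_backend(registry: dict) -> None:
    """Register the backend only when its dependency is present."""
    if not elasticsearch_available():
        warnings.warn(
            "elasticsearch is not installed; the Elasticsearch backend "
            "is disabled. Install with: pip install cuvs-bench[elastic]",
            stacklevel=2,
        )
        return
    # Lazy import so core cuvs-bench never pays for the optional dep
    from cuvs_bench.backends.elasticsearch import ElasticBackend
    registry["elasticsearch"] = ElasticBackend
```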

- New `cuvs_bench_elastic` package with HTTP backend for Elasticsearch GPU
  vector search (HNSW, int8_hnsw, int4_hnsw, bbq_hnsw index types)
- Supports `pip install cuvs-bench[elastic]` without a separate PyPI
  publish: `cuvs_bench` bundles the plugin via setuptools packages.find
- Plugin registers via entry points (`cuvs_bench.backends` /
  `cuvs_bench.config_loaders`) — no changes to core cuvs-bench required
- `ElasticConfigLoader` reads shared `datasets.yaml` from cuvs_bench and
  `elastic.yaml` from the plugin config; supports sweep and tune modes
- `build()` checks index existence before file validation so `force=False`
  returns immediately without requiring the base file on disk
- Removed testcontainers-based integration tests; added unit tests for
  pre-flight failure, force=False skip, dry-run, helper functions
- `elasticsearch` client is an optional dep (`cuvs-bench[elastic]` extra)
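Under the entry-point scheme these bullets describe, the pyproject.toml registration might look like the fragment below. The entry-point group names and the `cuvs_bench.backends.elasticsearch:register` target come from this PR's description and later commits; treat this as a sketch, not the PR's exact metadata:

```toml
# Hypothetical pyproject.toml fragment for entry-point registration
[project.entry-points."cuvs_bench.backends"]
elasticsearch = "cuvs_bench.backends.elasticsearch:register"

[project.entry-points."cuvs_bench.config_loaders"]
elasticsearch = "cuvs_bench.backends.elasticsearch:ElasticConfigLoader"
```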
@afourniernv afourniernv force-pushed the fea-1856-cuvs-lucene-backend branch from dca9c6f to 087289d on April 13, 2026 01:46
The separate cuvs_bench_elastic package required bundling via
packages.find and complicated the build. Instead, keep the backend
inside cuvs_bench and use entry points pointing back into the same
package so the backend only registers when elasticsearch is installed.

- git mv backend.py to cuvs_bench/backends/elasticsearch.py
- git mv elastic.yaml to cuvs_bench/config/algos/
- Fix imports to relative paths
- Fix _get_elastic_config_path() to use ../config from backends/
- Update pyproject.toml: entry points -> cuvs_bench.backends.elasticsearch:register
- Remove packages.find (no longer needed)
- Remove cuvs_bench_elastic/ package entirely

DX unchanged: pip install cuvs-bench[elastic]
One package, one publish pipeline.

Signed-off-by: Alex Fournier <afournier@nvidia.com>
…ch.py

Restores the high-level API that was previously in cuvs_bench_elastic/__init__.py
so existing demo scripts continue to work after the module consolidation.

Signed-off-by: Alex Fournier <afournier@nvidia.com>
)


def _load_fbin(path: Path) -> np.ndarray:

Can we centralize these functions or use existing centralized ones? We really don't want the file formats we support to diverge across different backends, so we should have one way to load the datasets.


"Install with: pip install cuvs-bench[elastic]"
) from e
host = self.config.get("host", "localhost")
port = self.config.get("port", 9200)

Just want to confirm: are there any other protocols to interact with Elasticsearch, or is HTTP/REST pretty much it? Mostly asking to make sure it's the best (fastest) way to communicate so we aren't adding additional overheads in the mix.

index_type = build_params.get("type", _DEFAULT_INDEX_TYPE)
m = build_params.get("m", _DEFAULT_M)
ef_construction = build_params.get(
    "ef_construction", _DEFAULT_EF_CONSTRUCTION
)

We should decouple the index type from the backend type. User should be able to specify the backend and the algorithm they want to run in that backend (or maybe the two are coupled, but the backend should not assume the algorithm).

So for example, what if the index type is not supported? What happens when a user wants to test diskbbq against hnsw/cagra? We should decouple these from the beginning, ideally with separate functions for each (e.g. _parse_cagra_params(), _parse_hnsw_params()).


It's also important that we throw a proper (descriptive) error when the backend doesn't support an index type. For example, if someone passes in "SCaNN" to the Elasticsearch backend, they should get a message to the effect of "Received params for SCaNN index type, but Elasticsearch backend does not support this index type. Please check configuration."
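The decoupling and descriptive-error suggestions above could be sketched as a small dispatch layer. The parameter names follow the HNSW snippet in this PR and the index types listed in the changelog; the function names, defaults, and error text are illustrative:

```python
# Illustrative dispatch: per-family parsers plus a descriptive error
# for index types the backend does not support.
_SUPPORTED = {"hnsw", "int8_hnsw", "int4_hnsw", "bbq_hnsw"}


def parse_index_params(build_params: dict) -> dict:
    index_type = build_params.get("type", "hnsw")
    if index_type not in _SUPPORTED:
        raise ValueError(
            f"Received params for {index_type!r} index type, but the "
            "Elasticsearch backend does not support this index type. "
            f"Supported types: {sorted(_SUPPORTED)}. "
            "Please check configuration."
        )
    return _parse_hnsw_params(build_params)


def _parse_hnsw_params(build_params: dict) -> dict:
    """All currently supported types are HNSW variants."""
    return {
        "type": build_params.get("type", "hnsw"),
        "m": build_params.get("m", 16),
        "ef_construction": build_params.get("ef_construction", 100),
    }
```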

"""Run kNN search over all search-param combinations and compute recall."""
if dry_run:
return SearchResult(
neighbors=np.zeros((0, k), dtype=np.int64),

Suggest using np.empty() for all of these.
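For the zero-row dry-run placeholder above, np.empty and np.zeros produce identical arrays (there is nothing to fill), but np.empty skips the zero-initialization pass that np.zeros pays for on large preallocated result buffers. A minimal illustration:

```python
# np.empty allocates without initializing; for (0, k) placeholders the
# result is identical to np.zeros, just without the fill step.
import numpy as np

k = 10
neighbors = np.empty((0, k), dtype=np.int64)
distances = np.empty((0, k), dtype=np.float32)
```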

# SPDX-License-Identifier: Apache-2.0
#
"""
Smoke tests for cuvs-bench modularization (optional deps, entry points, lazy loading).

It's great to see improved test coverage!

@cjnolet

cjnolet commented Apr 16, 2026

@afourniernv this is a really big PR so I'm reviewing in stages. Thank you for bearing with me. I finally had some time to look through the implementation more comprehensively. Mostly minor, but important things.


3 participants