feat(EvoAgentBench): add OpenAI-compatible embedding API for BrowseComp-Plus FAISS search#235

Open
CZH-THU wants to merge 1 commit into EverMind-AI:main from CZH-THU:embed_api
CZH-THU commented May 30, 2026

Summary

BrowseComp-Plus previously loaded embedding models locally (HuggingFace + tevatron DenseModel), which requires significant GPU memory. This change adds optional OpenAI-compatible embedding API support so query vectors can be fetched from external services (vLLM, sglang, ModelScope, etc.) while keeping the FAISS index on CPU.

Changes:

  • faiss_searcher.py: When embedding.api_base is configured, skip local model loading and call /v1/embeddings via the OpenAI client. Query text still uses the BrowseComp-Plus task_prefix for index compatibility. Includes ${ENV_VAR} resolution for secrets and _maybe_l2_normalize() to skip re-normalization when the API already returns unit vectors. Uses encoding_format="float" for ModelScope compatibility.
  • start_mcp.py: Passes embedding.* fields from information_retrieval.yaml to MCP server CLI args.
  • information_retrieval.yaml: Adds commented example config for the embedding API block.

Local model loading remains the default when embedding.api_base is not set.
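The API path described above can be sketched roughly as follows. This is a minimal illustration, not the code in faiss_searcher.py: the helper names `resolve_env_vars` and `embed_query` and their signatures are assumptions made for this example. Only the `${ENV_VAR}` syntax, the task_prefix behavior, and `encoding_format="float"` come from the description.

```python
import os
import re

# Matches placeholders of the form ${NAME}.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_vars(value: str) -> str:
    """Replace each ${NAME} with os.environ[NAME]; fail loudly if NAME is unset."""
    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return _ENV_PATTERN.sub(_substitute, value)


def embed_query(query: str, *, api_base: str, api_key: str, model: str,
                task_prefix: str = "") -> list:
    """Fetch one query vector from an OpenAI-compatible embeddings endpoint.

    Hypothetical helper: the real method name and signature may differ.
    """
    # Imported lazily so the local-model path does not need the openai package.
    from openai import OpenAI

    client = OpenAI(base_url=api_base, api_key=resolve_env_vars(api_key))
    response = client.embeddings.create(
        model=model,
        # The prefix is kept so queries match how the index was built.
        input=[task_prefix + query],
        encoding_format="float",
    )
    return response.data[0].embedding
```

Resolving the placeholder at call time keeps the literal secret out of information_retrieval.yaml, which is consistent with the checklist item about not committing secrets.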

Area

  • Architecture method
  • Benchmark
  • Use case
  • Documentation
  • Developer experience
  • CI, build, or release

Verification

# Code review: verified parameter flow start_mcp.py → mcp_server.py → FaissSearcher.parse_args()
# Verified OpenAI client call matches ModelScope embedding API example (base_url, model, encoding_format="float")
# Verified task_prefix is applied in API path (required for qwen3-embedding-8b index compatibility)
# Not run: full end-to-end benchmark (requires ModelScope token / local FAISS index / MCP server)

Checklist

  • I kept the change scoped to the relevant area.
  • I updated docs, examples, or setup notes when behavior changed.
  • I added or updated tests when the change affects behavior.
  • I did not commit secrets, .env files, dependency folders, or generated output.
  • Active relative links in Markdown files resolve.

Notes for Reviewers

  • Embedding logic lives in the MCP/FAISS layer (faiss_searcher.py), not in browsecomp_plus.py — the domain adapter only starts the MCP server via start_mcp.py.
  • normalize must match how the FAISS index was built; default false aligns with the official qwen3-embedding-8b index. When normalize: true, already-normalized API responses are detected (L2 norm ≈ 1) and left unchanged.
  • API model name must match the index (e.g. Qwen/Qwen3-Embedding-8B).
  • Only searcher_type: faiss supports these new args; bm25 / custom searchers are unaffected.
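For reviewers, the normalization rule in the notes above might look something like the sketch below. It is written with the standard library only so it runs anywhere; the actual `_maybe_l2_normalize()` presumably operates on NumPy arrays, and the tolerance value here is an assumption, not taken from the PR.

```python
import math


def maybe_l2_normalize(vector, normalize, tol=1e-3):
    """Return a unit-length copy of `vector` only when that is actually needed.

    `tol` is an assumed threshold for "L2 norm is approximately 1".
    """
    if not normalize:
        # normalize: false -> pass the API response through untouched.
        return list(vector)
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0 or abs(norm - 1.0) <= tol:
        # Already unit length (or a zero vector): leave unchanged.
        return list(vector)
    return [x / norm for x in vector]
```

With `normalize=True`, a vector such as `[3.0, 4.0]` (norm 5) is rescaled to unit length, while a vector that already has norm 1 within the tolerance is returned as-is.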

Example config:

mcp_server:
  model_name: Qwen/Qwen3-Embedding-8B
  embedding:
    api_base: https://api-inference.modelscope.cn/v1
    api_key: ${EMBEDDING_API_KEY}
    model: Qwen/Qwen3-Embedding-8B
    normalize: false
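One way start_mcp.py could turn a block like this into MCP server CLI args is sketched below. The flag names (`--embedding-api-base` and so on) and the function `embedding_cli_args` are hypothetical and chosen only for illustration; the PR description does not list the real argument names.

```python
def embedding_cli_args(mcp_cfg: dict) -> list:
    """Build CLI args from the optional `embedding` block of the mcp_server config.

    Flag names are assumed for this example, not taken from the PR.
    """
    embedding = mcp_cfg.get("embedding") or {}
    args = []
    for key in ("api_base", "api_key", "model"):
        value = embedding.get(key)
        if value:
            # api_base -> --embedding-api-base, and so on.
            args += ["--embedding-" + key.replace("_", "-"), str(value)]
    if embedding.get("normalize"):
        # Boolean switch, emitted only when normalize is true.
        args.append("--embedding-normalize")
    return args
```

When the `embedding` block is absent the function returns an empty list, which matches the stated default of falling back to local model loading.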

…ISS search

Allow query embeddings to be fetched via OpenAI-compatible APIs (vLLM, ModelScope, etc.) instead of loading models locally, with optional L2 normalization that skips already-normalized vectors.

Co-authored-by: Cursor <cursoragent@cursor.com>