feat(EvoAgentBench): add OpenAI-compatible embedding API for BrowseComp-Plus FAISS search by CZH-THU · Pull Request #235 · EverMind-AI/EverOS

CZH-THU · 2026-05-30T07:47:03Z

Summary

BrowseComp-Plus previously loaded embedding models locally (HuggingFace + tevatron DenseModel), which required significant GPU memory. This change adds optional support for an OpenAI-compatible embedding API, so query vectors can be fetched from external services (vLLM, sglang, ModelScope, etc.) while the FAISS index stays on CPU.

Changes:

faiss_searcher.py: When embedding.api_base is configured, skip local model loading and call /v1/embeddings via the OpenAI client. Query text still uses the BrowseComp-Plus task_prefix for index compatibility. Includes ${ENV_VAR} resolution for secrets and _maybe_l2_normalize() to skip re-normalization when the API already returns unit vectors. Uses encoding_format="float" for ModelScope compatibility.
start_mcp.py: Passes embedding.* fields from information_retrieval.yaml to MCP server CLI args.
information_retrieval.yaml: Adds commented example config for the embedding API block.
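For reviewers who want a concrete picture of the `${ENV_VAR}` resolution mentioned above, here is a minimal sketch. The helper name `resolve_env_vars`, the regex, and the choice to raise on a missing variable are all assumptions made for illustration; the actual code in faiss_searcher.py may behave differently.

```python
import os
import re

# Matches ${NAME} placeholders. The exact pattern used in faiss_searcher.py is
# not shown in this PR description, so this regex is an assumption.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env_vars(value: str) -> str:
    """Replace every ${NAME} in value with the value of os.environ[NAME].

    Raises KeyError when a referenced variable is unset, so a missing secret
    fails loudly instead of being sent to the API as a literal "${...}" string.
    """
    def _substitute(match):
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return _ENV_PATTERN.sub(_substitute, value)
```

With this approach the YAML file only ever contains the placeholder, and the secret itself stays in the environment of the process that runs the searcher.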

Local model loading remains the default when embedding.api_base is not set.
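The API path described under Changes can be sketched roughly as follows. This is not the code from the PR: `make_client` and `embed_query` are hypothetical names, and the only parts taken from the description are the use of the OpenAI client, the task_prefix prepended to the query, and `encoding_format="float"`.

```python
from typing import Any, List


def make_client(api_base: str, api_key: str) -> Any:
    """Build an OpenAI client pointed at an OpenAI-compatible server."""
    # Imported lazily so runs that use the local model do not need the package.
    from openai import OpenAI

    return OpenAI(base_url=api_base, api_key=api_key)


def embed_query(client: Any, model: str, query: str, task_prefix: str = "") -> List[float]:
    """Fetch one query embedding from the embeddings endpoint.

    task_prefix should be the same prefix that was used when the FAISS index
    was built; otherwise query and document vectors are not comparable.
    """
    text = task_prefix + query
    response = client.embeddings.create(
        model=model,
        input=[text],
        encoding_format="float",  # request plain float lists rather than base64
    )
    return list(response.data[0].embedding)
```

A caller would build the client once, for example `make_client(api_base, api_key)` with the values from the embedding config block, and reuse it for every query.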

Area

Verification

# Code review: verified parameter flow start_mcp.py → mcp_server.py → FaissSearcher.parse_args()
# Verified OpenAI client call matches ModelScope embedding API example (base_url, model, encoding_format="float")
# Verified task_prefix is applied in API path (required for qwen3-embedding-8b index compatibility)
# Not run: full end-to-end benchmark (requires ModelScope token / local FAISS index / MCP server)

Checklist

I kept the change scoped to the relevant area.
I updated docs, examples, or setup notes when behavior changed.
I added or updated tests when the change affects behavior.
I did not commit secrets, .env files, dependency folders, or generated output.
Active relative links in Markdown files resolve.

Notes for Reviewers

Embedding logic lives in the MCP/FAISS layer (faiss_searcher.py), not in browsecomp_plus.py — the domain adapter only starts the MCP server via start_mcp.py.
normalize must match how the FAISS index was built; default false aligns with the official qwen3-embedding-8b index. When normalize: true, already-normalized API responses are detected (L2 norm ≈ 1) and left unchanged.
API model name must match the index (e.g. Qwen/Qwen3-Embedding-8B).
Only searcher_type: faiss supports these new args; bm25 / custom searchers are unaffected.
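To make the normalize note concrete, here is a minimal sketch of the "skip if already unit length" check. It uses plain Python lists to stay dependency-free; the function name `maybe_l2_normalize`, the tolerance of 1e-3, and the handling of zero vectors are assumptions, not the values used by `_maybe_l2_normalize()` in the PR.

```python
import math
from typing import List, Sequence


def maybe_l2_normalize(vector: Sequence[float], tol: float = 1e-3) -> List[float]:
    """Return a unit-length copy of vector, unless it is already unit length.

    A vector whose L2 norm is within tol of 1 is returned unchanged, which
    avoids re-normalizing embeddings the API has already normalized. A zero
    vector is also returned unchanged because it cannot be normalized.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0 or abs(norm - 1.0) <= tol:
        return list(vector)
    return [x / norm for x in vector]
```

This would only be called when `normalize: true`; with the default `normalize: false` the API response is passed to FAISS as returned.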

Example config:

mcp_server:
  model_name: Qwen/Qwen3-Embedding-8B
  embedding:
    api_base: https://api-inference.modelscope.cn/v1
    api_key: ${EMBEDDING_API_KEY}
    model: Qwen/Qwen3-Embedding-8B
    normalize: false
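As an illustration of how a config block like the one above could be turned into MCP server CLI arguments, here is a small sketch. The flag names (`--embedding-api-base` and so on) and the function `embedding_cli_args` are hypothetical; the actual argument names accepted by mcp_server.py and FaissSearcher.parse_args() are not listed in this description.

```python
from typing import Any, Dict, List


def embedding_cli_args(mcp_server_cfg: Dict[str, Any]) -> List[str]:
    """Translate the optional embedding block into CLI arguments.

    Returns an empty list when embedding.api_base is missing, which keeps
    local model loading as the default. Flag names are hypothetical.
    """
    embedding = mcp_server_cfg.get("embedding") or {}
    if not embedding.get("api_base"):
        return []

    args = ["--embedding-api-base", str(embedding["api_base"])]
    if embedding.get("api_key"):
        # The ${ENV_VAR} placeholder is forwarded unresolved in this sketch.
        args += ["--embedding-api-key", str(embedding["api_key"])]
    if embedding.get("model"):
        args += ["--embedding-model", str(embedding["model"])]
    if embedding.get("normalize"):
        # Boolean switch: present only when normalize is true.
        args.append("--embedding-normalize")
    return args
```

Forwarding the placeholder rather than the resolved key is consistent with the description, which places `${ENV_VAR}` resolution in faiss_searcher.py, though where resolution happens in the real code path should be checked against the diff.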

…ISS search

Allow query embeddings to be fetched via OpenAI-compatible APIs (vLLM, ModelScope, etc.) instead of loading models locally, with optional L2 normalization that skips already-normalized vectors.

Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions Bot mentioned this pull request May 30, 2026

[watch] Overnight fork patrol: 2026-05-30 Fearvox/EverOS#88

Open

CZH-THU wants to merge 1 commit into EverMind-AI:main from CZH-THU:embed_api
