From 6632d9dc2a9124982afb3545c59e4500c7e48f1c Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 09:36:17 -0700
Subject: [PATCH 1/8] chore: add .github/copilot-instructions.md for project
 conventions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 78 +++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
 create mode 100644 .github/copilot-instructions.md
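
Reviewer note, not part of the commit: the variable table this patch adds could be consumed along the following lines. This is a minimal sketch in Python, assuming a helper named `load_settings` with an optional `env` mapping; both are hypothetical and introduced only for illustration. The variable names and defaults are copied from the table in this patch.

```python
import os

# Variables the table marks as (required).
REQUIRED = (
    "MONGO_CLUSTER_NAME",
    "AZURE_OPENAI_EMBEDDING_ENDPOINT",
    "AZURE_OPENAI_EMBEDDING_MODEL",
)

# Defaults copied from the "Consistent Variable Values" table in this patch.
DEFAULTS = {
    "DATA_FILE_WITH_VECTORS": "./Hotels_Vector.json",
    "EMBEDDED_FIELD": "DescriptionVector",
    "EMBEDDING_DIMENSIONS": "1536",
    "LOAD_SIZE_BATCH": "100",
    "EMBEDDING_SIZE_BATCH": "16",
    "AZURE_DOCUMENTDB_DATABASENAME": "Hotels",
}

# Variables whose values are numeric and are converted to int.
INTEGER_NAMES = ("EMBEDDING_DIMENSIONS", "LOAD_SIZE_BATCH", "EMBEDDING_SIZE_BATCH")


def load_settings(env=None):
    """Hypothetical helper: read the documented variables from a mapping.

    Falls back to os.environ when no mapping is given and raises ValueError
    when a required variable is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))

    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    for name in INTEGER_NAMES:
        settings[name] = int(settings[name])
    return settings
```

Failing fast on a missing required variable keeps the error close to its cause. Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.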

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 0000000..bdc2ae9
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,78 @@
+# DocumentDB Samples — Copilot Instructions
+
+## Project Overview
+Azure DocumentDB code samples for vector search and algorithm selection quickstart articles.
+
+## Language Dependencies
+
+### Go
+- Go 1.21+
+- go.mongodb.org/mongo-driver v1.17+
+- github.com/Azure/azure-sdk-for-go/sdk/azidentity
+- github.com/Azure/azure-sdk-for-go/sdk/azcore
+
+### Java
+- Java 17+
+- MongoDB Driver 5.3+
+- Azure Identity 1.15+
+- Maven 3.8+
+
+### Python
+- Python 3.10+
+- pymongo >= 4.7
+- azure-identity
+- openai
+
+### TypeScript/Node.js
+- Node.js 20+
+- mongodb 6.12+
+- @azure/identity
+- openai
+
+### .NET
+- .NET 8+
+- MongoDB.Driver 3.2+
+- Azure.Identity
+
+## Consistent Variable Values
+
+All samples MUST use these environment variable names and defaults:
+
+| Variable | Default | Purpose |
+|----------|---------|---------|
+| MONGO_CLUSTER_NAME | (required) | DocumentDB cluster name |
+| AZURE_OPENAI_EMBEDDING_ENDPOINT | (required) | Azure OpenAI endpoint |
+| AZURE_OPENAI_EMBEDDING_MODEL | (required) | Embedding model deployment |
+| DATA_FILE_WITH_VECTORS | ./Hotels_Vector.json | Path to data file |
+| EMBEDDED_FIELD | DescriptionVector | Vector field name in documents |
+| EMBEDDING_DIMENSIONS | 1536 | Vector dimensions |
+| LOAD_SIZE_BATCH | 100 | Batch size for document insertion |
+| EMBEDDING_SIZE_BATCH | 16 | Batch size for embedding generation |
+| AZURE_DOCUMENTDB_DATABASENAME | Hotels | Database name |
+| SIMILARITY | (varies) | Similarity metric (cosine, euclidean, ip) |
+| ALGORITHM | (varies) | Algorithm (ivf, hnsw, diskann) |
+
+## Consistent Algorithm Parameters
+
+### IVF
+- numLists: 1
+- nProbes: 1
+
+### HNSW
+- m: 16
+- efConstruction: 64
+- efSearch: 40
+
+### DiskANN
+- maxDegree: 20
+- lBuild: 10
+- lSearch: 40
+
+## Rules
+
+1. **No Cosmos DB references.** Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com".
+2. **Vector field name is DescriptionVector.** Never default to "contentVector".
+3. **Data file is shared.** All samples reference `../data/Hotels_Vector.json`. READMEs instruct users to copy it locally.
+4. **Batch size is LOAD_SIZE_BATCH=100.** Do not use BATCH_SIZE or other variants.
+5. **Database name variable is AZURE_DOCUMENTDB_DATABASENAME.** Do not use MONGO_DB_NAME or other variants.
+6. **.NET uses appsettings.json** with same variable names under a "DocumentDB" section.

From dcaa7f70566c7221a9ab89efe6f78eb23c58b968 Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 09:54:29 -0700
Subject: [PATCH 2/8] fix: update copilot-instructions with missing deps,
 correct data path, and code patterns

- Add OpenAI SDK dependencies for Go, Java, and .NET
- Add python-dotenv and godotenv dependencies
- Fix DATA_FILE_WITH_VECTORS default from ./Hotels_Vector.json to ../data/Hotels_Vector.json
- Add AZURE_OPENAI_EMBEDDING_API_VERSION and MONGO_CONNECTION_STRING to env var table
- Add Authentication section documenting passwordless (OIDC) and connection string auth
- Add Sample Execution Pattern section with consistent lifecycle, naming conventions,
  standard search query, and vector search pipeline structure
- Add Repository Structure overview
- Clarify Rule 1 exceptions (mongocluster.cosmos.azure.com, cosmosSearch, VS Code extension)
- Fix Rule 3 to reference env var with shared data path default
- Fix Rule 6 to list actual .NET config sections
- Add rules for COS similarity, output files, and index type availability

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 99 +++++++++++++++++++++++++++++----
 1 file changed, 89 insertions(+), 10 deletions(-)
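
Reviewer note, not part of the commit: the naming conventions and the pipeline structure documented in this patch can be sketched as plain Python data builders. The function names (`collection_name`, `index_name`, `build_search_pipeline`) are hypothetical and chosen only for illustration; the name patterns, the default field `DescriptionVector`, `k = 5`, and the pipeline shape are taken from the text of the patch.

```python
ALGORITHMS = ("ivf", "hnsw", "diskann")


def _check_algorithm(algorithm):
    """Reject anything other than the three documented algorithms."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"Unknown algorithm: {algorithm!r}")


def collection_name(algorithm):
    """Collection names follow hotels_{algorithm}."""
    _check_algorithm(algorithm)
    return f"hotels_{algorithm}"


def index_name(algorithm):
    """Index names follow vectorIndex_{algorithm}."""
    _check_algorithm(algorithm)
    return f"vectorIndex_{algorithm}"


def build_search_pipeline(query_embedding, vector_field="DescriptionVector", k=5):
    """Build the two-stage aggregation pipeline described in the patch."""
    return [
        {
            "$search": {
                "cosmosSearch": {
                    "vector": list(query_embedding),
                    "path": vector_field,
                    "k": k,
                }
            }
        },
        {"$project": {"score": {"$meta": "searchScore"}, "document": "$$ROOT"}},
    ]
```

With pymongo, the returned list would typically be passed to `collection.aggregate(...)`. Keeping the builder free of driver calls means the pipeline shape can be checked without a cluster.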

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index bdc2ae9..b8a4db8 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -3,6 +3,26 @@
 ## Project Overview
 Azure DocumentDB code samples for vector search and algorithm selection quickstart articles.
 
+## Repository Structure
+
+```
+ai/
+├── data/                          # Shared data files (Hotels.json, Hotels_Vector.json)
+├── vector-search-python/          # Python vector search samples
+├── vector-search-typescript/      # TypeScript/Node.js vector search samples
+├── vector-search-go/              # Go vector search samples
+├── vector-search-java/            # Java vector search samples
+├── vector-search-dotnet/          # .NET vector search samples
+├── vector-search-agent-go/        # Go agent sample (separate from quickstart)
+└── vector-search-agent-typescript/ # TypeScript agent sample (separate from quickstart)
+```
+
+Each vector-search sample directory contains:
+- `src/` — Source files: one per algorithm (`ivf`, `hnsw`, `diskann`) + `utils` + `create_embeddings` + `show_indexes`
+- `output/` — Expected output files: `ivf.txt`, `hnsw.txt`, `diskann.txt`
+- `README.md` — Setup, usage, and troubleshooting documentation
+- `.env.example` (Go, Python, TypeScript) or `appsettings.json` (.NET) — Configuration template
+
 ## Language Dependencies
 
 ### Go
@@ -10,11 +30,14 @@ Azure DocumentDB code samples for vector search and algorithm selection quicksta
 - go.mongodb.org/mongo-driver v1.17+
 - github.com/Azure/azure-sdk-for-go/sdk/azidentity
 - github.com/Azure/azure-sdk-for-go/sdk/azcore
+- github.com/openai/openai-go/v3
+- github.com/joho/godotenv
 
 ### Java
 - Java 17+
-- MongoDB Driver 5.3+
-- Azure Identity 1.15+
+- MongoDB Driver (mongodb-driver-sync) 5.3+
+- Azure Identity (azure-identity) 1.15+
+- Azure AI OpenAI (azure-ai-openai)
 - Maven 3.8+
 
 ### Python
@@ -22,6 +45,7 @@ Azure DocumentDB code samples for vector search and algorithm selection quicksta
 - pymongo >= 4.7
 - azure-identity
 - openai
+- python-dotenv
 
 ### TypeScript/Node.js
 - Node.js 20+
@@ -31,8 +55,9 @@ Azure DocumentDB code samples for vector search and algorithm selection quicksta
 
 ### .NET
 - .NET 8+
-- MongoDB.Driver 3.2+
+- MongoDB.Driver 3.0+
 - Azure.Identity
+- Azure.AI.OpenAI
 
 ## Consistent Variable Values
 
@@ -40,16 +65,18 @@ All samples MUST use these environment variable names and defaults:
 
 | Variable | Default | Purpose |
 |----------|---------|---------|
-| MONGO_CLUSTER_NAME | (required) | DocumentDB cluster name |
+| MONGO_CLUSTER_NAME | (required) | DocumentDB cluster name (passwordless auth) |
+| MONGO_CONNECTION_STRING | (none) | Full connection string (connection string auth) |
 | AZURE_OPENAI_EMBEDDING_ENDPOINT | (required) | Azure OpenAI endpoint |
-| AZURE_OPENAI_EMBEDDING_MODEL | (required) | Embedding model deployment |
-| DATA_FILE_WITH_VECTORS | ./Hotels_Vector.json | Path to data file |
+| AZURE_OPENAI_EMBEDDING_MODEL | (required) | Embedding model deployment name |
+| AZURE_OPENAI_EMBEDDING_API_VERSION | 2023-05-15 | Azure OpenAI API version |
+| DATA_FILE_WITH_VECTORS | ../data/Hotels_Vector.json | Path to data file with embeddings |
 | EMBEDDED_FIELD | DescriptionVector | Vector field name in documents |
 | EMBEDDING_DIMENSIONS | 1536 | Vector dimensions |
 | LOAD_SIZE_BATCH | 100 | Batch size for document insertion |
 | EMBEDDING_SIZE_BATCH | 16 | Batch size for embedding generation |
 | AZURE_DOCUMENTDB_DATABASENAME | Hotels | Database name |
-| SIMILARITY | (varies) | Similarity metric (cosine, euclidean, ip) |
+| SIMILARITY | (varies) | Similarity metric (COS, L2, IP) |
 | ALGORITHM | (varies) | Algorithm (ivf, hnsw, diskann) |
 
 ## Consistent Algorithm Parameters
@@ -68,11 +95,63 @@ All samples MUST use these environment variable names and defaults:
 - lBuild: 10
 - lSearch: 40
 
+## Authentication
+
+All samples support two authentication modes. **Passwordless (OIDC) is preferred.**
+
+### Passwordless Authentication (Recommended)
+- Uses `DefaultAzureCredential` / OIDC with `MONGO_CLUSTER_NAME`
+- Connection URI format: `mongodb+srv://{clusterName}.global.mongocluster.cosmos.azure.com/`
+- OIDC token scope: `https://ossrdbms-aad.database.windows.net/.default`
+- Each language implements a utility function pair: `getClients()` and `getClientsPasswordless()`
+
+### Connection String Authentication
+- Uses `MONGO_CONNECTION_STRING` with username/password
+- Format: `mongodb+srv://username:password@{cluster}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000`
+
+> **Note:** `mongocluster.cosmos.azure.com` is the current DocumentDB hostname — this is NOT a Cosmos DB reference.
+
+## Sample Execution Pattern
+
+All vector search samples follow this consistent lifecycle:
+
+1. **Initialize clients** — Create MongoDB and Azure OpenAI clients (passwordless preferred)
+2. **Drop collection** — Drop the algorithm-specific collection if it exists (clean start)
+3. **Create collection** — Create a fresh collection
+4. **Load data** — Read `Hotels_Vector.json` and batch-insert documents
+5. **Create vector index** — Create algorithm-specific vector index using `createIndexes` command with `cosmosSearch` key type
+6. **Generate query embedding** — Embed the search query text using Azure OpenAI
+7. **Perform vector search** — Run `$search` aggregation pipeline with `cosmosSearch` operator
+8. **Print results** — Display `HotelName` and `score` for top results
+9. **Cleanup** — Drop the collection in a `finally`/`defer` block
+
+### Naming Conventions
+- **Collection names:** `hotels_{algorithm}` — e.g., `hotels_ivf`, `hotels_hnsw`, `hotels_diskann`
+- **Index names:** `vectorIndex_{algorithm}` — e.g., `vectorIndex_ivf`, `vectorIndex_hnsw`, `vectorIndex_diskann`
+- **Database name:** `Hotels` (hardcoded, matches `AZURE_DOCUMENTDB_DATABASENAME` default)
+
+### Standard Search Query
+All samples use the same query text: `"quintessential lodging near running trails, eateries, retail"`
+
+### Vector Search Pipeline Structure
+All languages use the same aggregation pipeline structure:
+```
+[
+  { "$search": { "cosmosSearch": { "vector": <queryEmbedding>, "path": <vectorField>, "k": 5 } } },
+  { "$project": { "score": { "$meta": "searchScore" }, "document": "$$ROOT" } }
+]
+```
+
+> **Note:** `cosmosSearch` is a valid MongoDB API command name for DocumentDB — this is NOT a Cosmos DB reference.
+
 ## Rules
 
-1. **No Cosmos DB references.** Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com".
+1. **No Cosmos DB references.** Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com". Exception: `mongocluster.cosmos.azure.com` (hostname), `cosmosSearch` (API command), and `ms-azuretools.vscode-cosmosdb` (VS Code extension) are valid and NOT Cosmos references.
 2. **Vector field name is DescriptionVector.** Never default to "contentVector".
-3. **Data file is shared.** All samples reference `../data/Hotels_Vector.json`. READMEs instruct users to copy it locally.
+3. **Data file path from env var.** Code reads `DATA_FILE_WITH_VECTORS` which defaults to `../data/Hotels_Vector.json` (the shared data location). .NET copies data locally to `data/Hotels_Vector.json` in the build output.
 4. **Batch size is LOAD_SIZE_BATCH=100.** Do not use BATCH_SIZE or other variants.
 5. **Database name variable is AZURE_DOCUMENTDB_DATABASENAME.** Do not use MONGO_DB_NAME or other variants.
-6. **.NET uses appsettings.json** with same variable names under a "DocumentDB" section.
+6. **.NET uses appsettings.json** with configuration sections: `AzureOpenAI`, `DataFiles`, `Embedding`, `MongoDB`, `VectorSearch`.
+7. **Similarity metric is COS.** All vector index definitions use `"similarity": "COS"` (cosine similarity).
+8. **Output files are committed.** Each sample has an `output/` directory with expected output for each algorithm (`ivf.txt`, `hnsw.txt`, `diskann.txt`). Update these when output format changes.
+9. **DocumentDB supports all index types at any dataset size.** IVF, HNSW, and DiskANN are all available — do not imply tier restrictions limit algorithm availability.

From 870709f51d720fa214c876f279b19ac0b91f8288 Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:03:30 -0700
Subject: [PATCH 3/8] docs: add agent samples multi-LLM convention to
 copilot-instructions

Document the agent sample patterns as a separate convention section covering:
- Planner/Synthesizer two-agent architecture with 3 LLM deployments
- Agent-specific env vars vs quickstart env var mapping
- Entry point pattern (upload/agent/cleanup)
- Language-specific SDK stacks (Go raw vs TS LangChain)
- OIDC authentication scopes
- IVF numLists discrepancy (10 vs 1)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 89 ++++++++++++++++++++++++++++++++-
 1 file changed, 88 insertions(+), 1 deletion(-)
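
Reviewer note, not part of the commit: since this patch asks that agent and quickstart conventions not be mixed, a small check along the following lines could flag quickstart variable names in an agent environment. This is a sketch only. `find_quickstart_names` is a hypothetical helper; the name pairs are copied from the "Agent Variable / Quickstart Equivalent" table in this patch.

```python
# Quickstart name -> agent name, from the mapping table in this patch.
QUICKSTART_TO_AGENT = {
    "AZURE_OPENAI_EMBEDDING_ENDPOINT": "AZURE_OPENAI_ENDPOINT",
    "MONGO_CLUSTER_NAME": "AZURE_DOCUMENTDB_CLUSTER",
    "MONGO_CONNECTION_STRING": "AZURE_DOCUMENTDB_CONNECTION_STRING",
    "ALGORITHM": "VECTOR_INDEX_ALGORITHM",
    "SIMILARITY": "VECTOR_SIMILARITY",
}


def find_quickstart_names(env_names):
    """Return {quickstart_name: agent_name} for each quickstart name found.

    env_names is any iterable of variable names, for example os.environ.
    An empty result means no quickstart names were detected.
    """
    present = set(env_names)
    return {
        quickstart: agent
        for quickstart, agent in QUICKSTART_TO_AGENT.items()
        if quickstart in present
    }
```

An agent entry point could call `find_quickstart_names(os.environ)` at startup and print the suggested replacements before exiting.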

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index b8a4db8..37bd396 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -144,9 +144,96 @@ All languages use the same aggregation pipeline structure:
 
 > **Note:** `cosmosSearch` is a valid MongoDB API command name for DocumentDB — this is NOT a Cosmos DB reference.
 
+## Agent Samples (Multi-LLM Convention)
+
+Agent samples (`vector-search-agent-*`) use a **different convention** from quickstart samples. They orchestrate multiple LLM deployments and use a distinct set of environment variables. Do NOT mix agent conventions with quickstart conventions.
+
+### Architecture: Planner → Synthesizer
+
+Agent samples use a two-agent pipeline with three Azure OpenAI deployments:
+
+| Deployment | Role | Temperature | Purpose |
+|------------|------|-------------|---------|
+| Embedding | Vector search | — | Same as quickstart samples |
+| Planner | Tool-calling agent | 0.0 | Transforms user query → tool call → retrieves search results |
+| Synthesizer | Response generation | 0.3 | Takes search results + query → produces natural language recommendation |
+
+The planner invokes a `search_hotels_collection` tool that performs the vector search. The synthesizer receives the search results and generates a comparative hotel recommendation.
+
+### Agent Entry Points
+
+Agent samples have three separate entry points (not a single main file):
+
+| Entry Point | Purpose |
+|-------------|---------|
+| `upload` | Load hotel data, create embeddings, insert into DocumentDB, create vector index |
+| `agent` | Run planner → synthesizer pipeline against an existing collection |
+| `cleanup` | Drop the database |
+
+### Agent Environment Variables
+
+Agent samples use `AZURE_DOCUMENTDB_*` and `AZURE_OPENAI_*` prefixes consistently. These differ from quickstart variable names.
+
+| Agent Variable | Quickstart Equivalent | Notes |
+|---------------|----------------------|-------|
+| `AZURE_OPENAI_ENDPOINT` | `AZURE_OPENAI_EMBEDDING_ENDPOINT` | Single endpoint for all 3 deployments |
+| `AZURE_OPENAI_API_KEY` | — | For API key auth (not used in quickstarts) |
+| `AZURE_DOCUMENTDB_CLUSTER` | `MONGO_CLUSTER_NAME` | Cluster name for passwordless auth |
+| `AZURE_DOCUMENTDB_CONNECTION_STRING` | `MONGO_CONNECTION_STRING` | Full connection string |
+| `AZURE_DOCUMENTDB_COLLECTION` | — | Collection name (agents parameterize this) |
+| `AZURE_DOCUMENTDB_INDEX_NAME` | — | Vector index name (agents parameterize this) |
+| `VECTOR_INDEX_ALGORITHM` | `ALGORITHM` | Default: `vector-ivf` |
+| `VECTOR_SIMILARITY` | `SIMILARITY` | Default: `COS` |
+| `USE_PASSWORDLESS` | — | `true`/`false` toggle for auth mode |
+| `DEBUG` | — | `true`/`false` verbose logging |
+| `QUERY` | — | Default: `"quintessential lodging near running trails, eateries, retail"` |
+| `NEAREST_NEIGHBORS` | — | Default: `5` |
+
+**Agent-only variables (no quickstart equivalent):**
+
+| Variable | Default | Purpose |
+|----------|---------|---------|
+| `AZURE_OPENAI_EMBEDDING_DEPLOYMENT` | (required) | Embedding model deployment name |
+| `AZURE_OPENAI_EMBEDDING_API_VERSION` | 2024-06-01 (Go), 2023-05-15 (TS) | Embedding API version |
+| `AZURE_OPENAI_PLANNER_DEPLOYMENT` / `AZURE_OPENAI_PLANNER_MODEL` | (required) | Planner LLM deployment |
+| `AZURE_OPENAI_PLANNER_API_VERSION` | (required) | Planner API version |
+| `AZURE_OPENAI_SYNTH_DEPLOYMENT` / `AZURE_OPENAI_SYNTH_MODEL` | (required) | Synthesizer LLM deployment |
+| `AZURE_OPENAI_SYNTH_API_VERSION` | (required) | Synthesizer API version |
+| `IVF_NUM_LISTS` | 10 | IVF numLists (⚠️ differs from quickstart default of 1) |
+| `HNSW_M` | 16 | HNSW m parameter |
+| `HNSW_EF_CONSTRUCTION` | 64 | HNSW efConstruction parameter |
+| `DISKANN_MAX_DEGREE` | 20 | DiskANN maxDegree parameter |
+| `DISKANN_L_BUILD` | 10 | DiskANN lBuild parameter |
+
+### Agent Authentication
+
+Agents support passwordless (OIDC) and API key auth, toggled by `USE_PASSWORDLESS`.
+
+**OIDC scopes:**
+- DocumentDB: `https://ossrdbms-aad.database.windows.net/.default`
+- Azure OpenAI: `https://cognitiveservices.azure.com/.default`
+
+**MongoDB URI (passwordless):** `mongodb+srv://{cluster}.global.mongocluster.cosmos.azure.com/`
+- Auth mechanism: `MONGODB-OIDC` with machine callback
+
+### Language-Specific SDK Stacks
+
+| Language | MongoDB | OpenAI | Agent Framework |
+|----------|---------|--------|-----------------|
+| Go | `go.mongodb.org/mongo-driver` (raw) | `github.com/openai/openai-go/v3` (raw) | Manual tool-calling loop |
+| TypeScript | `mongodb` (cleanup only) | `@langchain/openai` | `langchain` + `@langchain/azure-cosmosdb` + `zod` |
+
+**TypeScript agents use LangChain** — the `@langchain/azure-cosmosdb` package manages the vector store, and `langchain`'s `createAgent` handles tool orchestration. This is a fundamentally different SDK stack from the quickstart TypeScript samples which use the raw MongoDB driver.
+
+**Go agents use raw SDKs** — both MongoDB driver and OpenAI SDK are used directly, with manual tool-calling implementation.
+
+### IVF numLists Discrepancy
+
+Agent samples default to `IVF_NUM_LISTS=10`. Quickstart samples (vector-search, select-algorithm) hardcode `numLists=1`. This is intentional — agent samples are designed for tunable, production-like configurations while quickstart samples use minimal values for simplicity.
+
 ## Rules
 
-1. **No Cosmos DB references.** Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com". Exception: `mongocluster.cosmos.azure.com` (hostname), `cosmosSearch` (API command), and `ms-azuretools.vscode-cosmosdb` (VS Code extension) are valid and NOT Cosmos references.
+1. **No Cosmos DB references.** Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com". Exception: `mongocluster.cosmos.azure.com` (hostname), `cosmosSearch` (API command), and `ms-azuretools.vscode-cosmosdb` (VS Code extension) are valid and NOT Cosmos references.
 2. **Vector field name is DescriptionVector.** Never default to "contentVector".
 3. **Data file path from env var.** Code reads `DATA_FILE_WITH_VECTORS` which defaults to `../data/Hotels_Vector.json` (the shared data location). .NET copies data locally to `data/Hotels_Vector.json` in the build output.
 4. **Batch size is LOAD_SIZE_BATCH=100.** Do not use BATCH_SIZE or other variants.

From 0046a4e3b447dc437a58d1d99d45dd3985caa58c Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:07:45 -0700
Subject: [PATCH 4/8] docs: replace dotenv with CLI env var invocation examples

- Remove python-dotenv and godotenv from dependency lists
- Add Rule 10: no dotenv libraries
- Add 'Running Samples' section with per-language CLI examples
- Include agent sample multi-LLM invocation example
- Add Windows PowerShell note

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 76 ++++++++++++++++++++++++++++++++-
 1 file changed, 74 insertions(+), 2 deletions(-)
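
Reviewer note, not part of the commit: the inline `VAR=value command` form in this patch is Bash syntax. A shell-independent way to get the same effect, still without a `.env` loader, is to launch the sample from a small script that supplies the variables explicitly. The sketch below assumes a hypothetical helper named `run_with_env`; the value `myCluster` is the placeholder used in the patch's own examples.

```python
import os
import subprocess
import sys


def run_with_env(command, overrides):
    """Run command with overrides layered on top of the current environment.

    Nothing is read from .env files; every value is passed explicitly.
    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    env = {**os.environ, **overrides}
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Stand-in child process that echoes one variable, so the sketch runs
    # without a cluster. A real call would launch the sample instead.
    child = [sys.executable, "-c", "import os; print(os.environ['MONGO_CLUSTER_NAME'])"]
    result = run_with_env(child, {"MONGO_CLUSTER_NAME": "myCluster"})
    print(result.stdout.strip())
```

The overrides apply to the child process only, so the parent shell's environment is left unchanged, which matches the behavior of the inline Bash form.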

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 37bd396..584c6e6 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -31,7 +31,6 @@ Each vector-search sample directory contains:
 - github.com/Azure/azure-sdk-for-go/sdk/azidentity
 - github.com/Azure/azure-sdk-for-go/sdk/azcore
 - github.com/openai/openai-go/v3
-- github.com/joho/godotenv
 
 ### Java
 - Java 17+
@@ -45,7 +44,6 @@ Each vector-search sample directory contains:
 - pymongo >= 4.7
 - azure-identity
 - openai
-- python-dotenv
 
 ### TypeScript/Node.js
 - Node.js 20+
@@ -242,3 +240,77 @@ Agent samples default to `IVF_NUM_LISTS=10`. Quickstart samples (vector-search,
 7. **Similarity metric is COS.** All vector index definitions use `"similarity": "COS"` (cosine similarity).
 8. **Output files are committed.** Each sample has an `output/` directory with expected output for each algorithm (`ivf.txt`, `hnsw.txt`, `diskann.txt`). Update these when output format changes.
 9. **DocumentDB supports all index types at any dataset size.** IVF, HNSW, and DiskANN are all available — do not imply tier restrictions limit algorithm availability.
+10. **No dotenv libraries.** Do NOT use `python-dotenv`, `godotenv`, `dotenv` (npm), or any `.env` file-loading library. Environment variables must be passed via the CLI invocation, not loaded from `.env` files at runtime. This keeps samples explicit and avoids hidden configuration.
+
+## Running Samples — CLI Invocation
+
+Environment variables are passed inline with the run command. Do NOT use `.env` files. Each example below shows the required variables for a vector-search quickstart sample.
+
+### Go
+
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+go run ./src/ivf.go
+```
+
+### Python
+
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+python src/ivf.py
+```
+
+### TypeScript/Node.js
+
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+npx tsx src/ivf.ts
+```
+
+### Java
+
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
+```
+
+### .NET
+
+.NET uses `appsettings.json` for configuration, but environment variables can override:
+
+```bash
+DocumentDB__ClusterName=myCluster \
+AzureOpenAI__Endpoint=https://myendpoint.openai.azure.com/ \
+AzureOpenAI__DeploymentName=text-embedding-ada-002 \
+dotnet run
+```
+
+### Agent Samples (Multi-LLM)
+
+Agent samples require more variables for the planner and synthesizer deployments:
+
+```bash
+AZURE_OPENAI_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002 \
+AZURE_OPENAI_EMBEDDING_API_VERSION=2024-06-01 \
+AZURE_OPENAI_PLANNER_DEPLOYMENT=gpt-4o \
+AZURE_OPENAI_PLANNER_API_VERSION=2024-06-01 \
+AZURE_OPENAI_SYNTH_DEPLOYMENT=gpt-4o \
+AZURE_OPENAI_SYNTH_API_VERSION=2024-06-01 \
+AZURE_DOCUMENTDB_CLUSTER=myCluster \
+AZURE_DOCUMENTDB_DATABASENAME=Hotels \
+AZURE_DOCUMENTDB_COLLECTION=hotels \
+AZURE_DOCUMENTDB_INDEX_NAME=vectorIndex \
+USE_PASSWORDLESS=true \
+go run ./cmd/agent/main.go
+```
+
+> **Windows (PowerShell):** Use `$env:VAR_NAME="value";` syntax or set variables beforehand with `$env:MONGO_CLUSTER_NAME="myCluster"` then run the command separately.

From 9fec92f1cdd5fa5efe817ddce8f097d590b5867f Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:12:58 -0700
Subject: [PATCH 5/8] docs: add PowerShell examples alongside bash for all CLI
 invocations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md | 63 ++++++++++++++++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 584c6e6..c5a575e 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -248,6 +248,7 @@ Environment variables are passed inline with the run command. Do NOT use `.env`
 
 ### Go
 
+**Bash:**
 ```bash
 MONGO_CLUSTER_NAME=myCluster \
 AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
@@ -255,8 +256,17 @@ AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
 go run ./src/ivf.go
 ```
 
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+go run ./src/ivf.go
+```
+
 ### Python
 
+**Bash:**
 ```bash
 MONGO_CLUSTER_NAME=myCluster \
 AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
@@ -264,8 +274,17 @@ AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
 python src/ivf.py
 ```
 
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+python src/ivf.py
+```
+
 ### TypeScript/Node.js
 
+**Bash:**
 ```bash
 MONGO_CLUSTER_NAME=myCluster \
 AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
@@ -273,8 +292,17 @@ AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
 npx tsx src/ivf.ts
 ```
 
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+npx tsx src/ivf.ts
+```
+
 ### Java
 
+**Bash:**
 ```bash
 MONGO_CLUSTER_NAME=myCluster \
 AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
@@ -282,10 +310,19 @@ AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
 mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
 ```
 
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
+```
+
 ### .NET
 
 .NET uses `appsettings.json` for configuration, but environment variables can override:
 
+**Bash:**
 ```bash
 DocumentDB__ClusterName=myCluster \
 AzureOpenAI__Endpoint=https://myendpoint.openai.azure.com/ \
@@ -293,10 +330,19 @@ AzureOpenAI__DeploymentName=text-embedding-ada-002 \
 dotnet run
 ```
 
+**PowerShell:**
+```powershell
+$env:DocumentDB__ClusterName="myCluster"
+$env:AzureOpenAI__Endpoint="https://myendpoint.openai.azure.com/"
+$env:AzureOpenAI__DeploymentName="text-embedding-ada-002"
+dotnet run
+```
+
 ### Agent Samples (Multi-LLM)
 
 Agent samples require more variables for the planner and synthesizer deployments:
 
+**Bash:**
 ```bash
 AZURE_OPENAI_ENDPOINT=https://myendpoint.openai.azure.com/ \
 AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002 \
@@ -313,4 +359,19 @@ USE_PASSWORDLESS=true \
 go run ./cmd/agent/main.go
 ```
 
-> **Windows (PowerShell):** Use `$env:VAR_NAME="value";` syntax or set variables beforehand with `$env:MONGO_CLUSTER_NAME="myCluster"` then run the command separately.
+**PowerShell:**
+```powershell
+$env:AZURE_OPENAI_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT="text-embedding-ada-002"
+$env:AZURE_OPENAI_EMBEDDING_API_VERSION="2024-06-01"
+$env:AZURE_OPENAI_PLANNER_DEPLOYMENT="gpt-4o"
+$env:AZURE_OPENAI_PLANNER_API_VERSION="2024-06-01"
+$env:AZURE_OPENAI_SYNTH_DEPLOYMENT="gpt-4o"
+$env:AZURE_OPENAI_SYNTH_API_VERSION="2024-06-01"
+$env:AZURE_DOCUMENTDB_CLUSTER="myCluster"
+$env:AZURE_DOCUMENTDB_DATABASENAME="Hotels"
+$env:AZURE_DOCUMENTDB_COLLECTION="hotels"
+$env:AZURE_DOCUMENTDB_INDEX_NAME="vectorIndex"
+$env:USE_PASSWORDLESS="true"
+go run ./cmd/agent/main.go
+```

From 89e7ae4aa3a3365f562d846c3e84b190394b4100 Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:23:23 -0700
Subject: [PATCH 6/8] refactor: split copilot-instructions into scoped
 instruction files

- Main file trimmed from 377 to 156 lines (core rules, env vars, patterns)
- .github/instructions/cli-examples.instructions.md: Bash/PowerShell invocation
  examples, scoped to ai/** paths
- .github/instructions/agent-samples.instructions.md: Multi-LLM agent convention,
  scoped to ai/vector-search-agent-*/** paths

Follows GitHub Copilot guidance to keep copilot-instructions.md concise and use
path-specific .instructions.md files for detailed/scoped content.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md               | 221 ------------------
 .../agent-samples.instructions.md             |  89 +++++++
 .../instructions/cli-examples.instructions.md | 136 +++++++++++
 3 files changed, 225 insertions(+), 221 deletions(-)
 create mode 100644 .github/instructions/agent-samples.instructions.md
 create mode 100644 .github/instructions/cli-examples.instructions.md

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index c5a575e..732fca5 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -142,93 +142,6 @@ All languages use the same aggregation pipeline structure:
 
 > **Note:** `cosmosSearch` is a valid MongoDB API command name for DocumentDB — this is NOT a Cosmos DB reference.
 
-## Agent Samples (Multi-LLM Convention)
-
-Agent samples (`vector-search-agent-*`) use a **different convention** from quickstart samples. They orchestrate multiple LLM deployments and use a distinct set of environment variables. Do NOT mix agent conventions with quickstart conventions.
-
-### Architecture: Planner → Synthesizer
-
-Agent samples use a two-agent pipeline with three Azure OpenAI deployments:
-
-| Deployment | Role | Temperature | Purpose |
-|------------|------|-------------|---------|
-| Embedding | Vector search | — | Same as quickstart samples |
-| Planner | Tool-calling agent | 0.0 | Transforms user query → tool call → retrieves search results |
-| Synthesizer | Response generation | 0.3 | Takes search results + query → produces natural language recommendation |
-
-The planner invokes a `search_hotels_collection` tool that performs the vector search. The synthesizer receives the search results and generates a comparative hotel recommendation.
-
-### Agent Entry Points
-
-Agent samples have three separate entry points (not a single main file):
-
-| Entry Point | Purpose |
-|-------------|---------|
-| `upload` | Load hotel data, create embeddings, insert into DocumentDB, create vector index |
-| `agent` | Run planner → synthesizer pipeline against an existing collection |
-| `cleanup` | Drop the database |
-
-### Agent Environment Variables
-
-Agent samples use `AZURE_DOCUMENTDB_*` and `AZURE_OPENAI_*` prefixes consistently. These differ from quickstart variable names.
-
-| Agent Variable | Quickstart Equivalent | Notes |
-|---------------|----------------------|-------|
-| `AZURE_OPENAI_ENDPOINT` | `AZURE_OPENAI_EMBEDDING_ENDPOINT` | Single endpoint for all 3 deployments |
-| `AZURE_OPENAI_API_KEY` | — | For API key auth (not used in quickstarts) |
-| `AZURE_DOCUMENTDB_CLUSTER` | `MONGO_CLUSTER_NAME` | Cluster name for passwordless auth |
-| `AZURE_DOCUMENTDB_CONNECTION_STRING` | `MONGO_CONNECTION_STRING` | Full connection string |
-| `AZURE_DOCUMENTDB_COLLECTION` | — | Collection name (agents parameterize this) |
-| `AZURE_DOCUMENTDB_INDEX_NAME` | — | Vector index name (agents parameterize this) |
-| `VECTOR_INDEX_ALGORITHM` | `ALGORITHM` | Default: `vector-ivf` |
-| `VECTOR_SIMILARITY` | `SIMILARITY` | Default: `COS` |
-| `USE_PASSWORDLESS` | — | `true`/`false` toggle for auth mode |
-| `DEBUG` | — | `true`/`false` verbose logging |
-| `QUERY` | — | Default: `"quintessential lodging near running trails, eateries, retail"` |
-| `NEAREST_NEIGHBORS` | — | Default: `5` |
-
-**Agent-only variables (no quickstart equivalent):**
-
-| Variable | Default | Purpose |
-|----------|---------|---------|
-| `AZURE_OPENAI_EMBEDDING_DEPLOYMENT` | (required) | Embedding model deployment name |
-| `AZURE_OPENAI_EMBEDDING_API_VERSION` | 2024-06-01 (Go), 2023-05-15 (TS) | Embedding API version |
-| `AZURE_OPENAI_PLANNER_DEPLOYMENT` / `AZURE_OPENAI_PLANNER_MODEL` | (required) | Planner LLM deployment |
-| `AZURE_OPENAI_PLANNER_API_VERSION` | (required) | Planner API version |
-| `AZURE_OPENAI_SYNTH_DEPLOYMENT` / `AZURE_OPENAI_SYNTH_MODEL` | (required) | Synthesizer LLM deployment |
-| `AZURE_OPENAI_SYNTH_API_VERSION` | (required) | Synthesizer API version |
-| `IVF_NUM_LISTS` | 10 | IVF numLists (⚠️ differs from quickstart default of 1) |
-| `HNSW_M` | 16 | HNSW m parameter |
-| `HNSW_EF_CONSTRUCTION` | 64 | HNSW efConstruction parameter |
-| `DISKANN_MAX_DEGREE` | 20 | DiskANN maxDegree parameter |
-| `DISKANN_L_BUILD` | 10 | DiskANN lBuild parameter |
-
-### Agent Authentication
-
-Agents support passwordless (OIDC) and API key auth, toggled by `USE_PASSWORDLESS`.
-
-**OIDC scopes:**
-- DocumentDB: `https://ossrdbms-aad.database.windows.net/.default`
-- Azure OpenAI: `https://cognitiveservices.azure.com/.default`
-
-**MongoDB URI (passwordless):** `mongodb+srv://{cluster}.global.mongocluster.cosmos.azure.com/`
-- Auth mechanism: `MONGODB-OIDC` with machine callback
-
-### Language-Specific SDK Stacks
-
-| Language | MongoDB | OpenAI | Agent Framework |
-|----------|---------|--------|-----------------|
-| Go | `go.mongodb.org/mongo-driver` (raw) | `github.com/openai/openai-go/v3` (raw) | Manual tool-calling loop |
-| TypeScript | `mongodb` (cleanup only) | `@langchain/openai` | `langchain` + `@langchain/azure-cosmosdb` + `zod` |
-
-**TypeScript agents use LangChain** — the `@langchain/azure-cosmosdb` package manages the vector store, and `langchain`'s `createAgent` handles tool orchestration. This is a fundamentally different SDK stack from the quickstart TypeScript samples which use the raw MongoDB driver.
-
-**Go agents use raw SDKs** — both MongoDB driver and OpenAI SDK are used directly, with manual tool-calling implementation.
-
-### IVF numLists Discrepancy
-
-Agent samples default to `IVF_NUM_LISTS=10`. Quickstart samples (vector-search, select-algorithm) hardcode `numLists=1`. This is intentional — agent samples are designed for tunable, production-like configurations while quickstart samples use minimal values for simplicity.
-
 ## Rules
 
 1. **No Cosmos DB references.**Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com". Exception: `mongocluster.cosmos.azure.com` (hostname), `cosmosSearch` (API command), and `ms-azuretools.vscode-cosmosdb` (VS Code extension) are valid and NOT Cosmos references.
@@ -241,137 +154,3 @@ Agent samples default to `IVF_NUM_LISTS=10`. Quickstart samples (vector-search,
 8. **Output files are committed.** Each sample has an `output/` directory with expected output for each algorithm (`ivf.txt`, `hnsw.txt`, `diskann.txt`). Update these when output format changes.
 9. **DocumentDB supports all index types at any dataset size.** IVF, HNSW, and DiskANN are all available — do not imply tier restrictions limit algorithm availability.
 10. **No dotenv libraries.** Do NOT use `python-dotenv`, `godotenv`, `dotenv` (npm), or any `.env` file-loading library. Environment variables must be passed via the CLI invocation, not loaded from `.env` files at runtime. This keeps samples explicit and avoids hidden configuration.
-
-## Running Samples — CLI Invocation
-
-Environment variables are passed inline with the run command. Do NOT use `.env` files. Each example below shows the required variables for a vector-search quickstart sample.
-
-### Go
-
-**Bash:**
-```bash
-MONGO_CLUSTER_NAME=myCluster \
-AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
-AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
-go run ./src/ivf.go
-```
-
-**PowerShell:**
-```powershell
-$env:MONGO_CLUSTER_NAME="myCluster"
-$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
-$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
-go run ./src/ivf.go
-```
-
-### Python
-
-**Bash:**
-```bash
-MONGO_CLUSTER_NAME=myCluster \
-AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
-AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
-python src/ivf.py
-```
-
-**PowerShell:**
-```powershell
-$env:MONGO_CLUSTER_NAME="myCluster"
-$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
-$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
-python src/ivf.py
-```
-
-### TypeScript/Node.js
-
-**Bash:**
-```bash
-MONGO_CLUSTER_NAME=myCluster \
-AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
-AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
-npx tsx src/ivf.ts
-```
-
-**PowerShell:**
-```powershell
-$env:MONGO_CLUSTER_NAME="myCluster"
-$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
-$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
-npx tsx src/ivf.ts
-```
-
-### Java
-
-**Bash:**
-```bash
-MONGO_CLUSTER_NAME=myCluster \
-AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
-AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
-mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
-```
-
-**PowerShell:**
-```powershell
-$env:MONGO_CLUSTER_NAME="myCluster"
-$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
-$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
-mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
-```
-
-### .NET
-
-.NET uses `appsettings.json` for configuration, but environment variables can override:
-
-**Bash:**
-```bash
-DocumentDB__ClusterName=myCluster \
-AzureOpenAI__Endpoint=https://myendpoint.openai.azure.com/ \
-AzureOpenAI__DeploymentName=text-embedding-ada-002 \
-dotnet run
-```
-
-**PowerShell:**
-```powershell
-$env:DocumentDB__ClusterName="myCluster"
-$env:AzureOpenAI__Endpoint="https://myendpoint.openai.azure.com/"
-$env:AzureOpenAI__DeploymentName="text-embedding-ada-002"
-dotnet run
-```
-
-### Agent Samples (Multi-LLM)
-
-Agent samples require more variables for the planner and synthesizer deployments:
-
-**Bash:**
-```bash
-AZURE_OPENAI_ENDPOINT=https://myendpoint.openai.azure.com/ \
-AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002 \
-AZURE_OPENAI_EMBEDDING_API_VERSION=2024-06-01 \
-AZURE_OPENAI_PLANNER_DEPLOYMENT=gpt-4o \
-AZURE_OPENAI_PLANNER_API_VERSION=2024-06-01 \
-AZURE_OPENAI_SYNTH_DEPLOYMENT=gpt-4o \
-AZURE_OPENAI_SYNTH_API_VERSION=2024-06-01 \
-AZURE_DOCUMENTDB_CLUSTER=myCluster \
-AZURE_DOCUMENTDB_DATABASENAME=Hotels \
-AZURE_DOCUMENTDB_COLLECTION=hotels \
-AZURE_DOCUMENTDB_INDEX_NAME=vectorIndex \
-USE_PASSWORDLESS=true \
-go run ./cmd/agent/main.go
-```
-
-**PowerShell:**
-```powershell
-$env:AZURE_OPENAI_ENDPOINT="https://myendpoint.openai.azure.com/"
-$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT="text-embedding-ada-002"
-$env:AZURE_OPENAI_EMBEDDING_API_VERSION="2024-06-01"
-$env:AZURE_OPENAI_PLANNER_DEPLOYMENT="gpt-4o"
-$env:AZURE_OPENAI_PLANNER_API_VERSION="2024-06-01"
-$env:AZURE_OPENAI_SYNTH_DEPLOYMENT="gpt-4o"
-$env:AZURE_OPENAI_SYNTH_API_VERSION="2024-06-01"
-$env:AZURE_DOCUMENTDB_CLUSTER="myCluster"
-$env:AZURE_DOCUMENTDB_DATABASENAME="Hotels"
-$env:AZURE_DOCUMENTDB_COLLECTION="hotels"
-$env:AZURE_DOCUMENTDB_INDEX_NAME="vectorIndex"
-$env:USE_PASSWORDLESS="true"
-go run ./cmd/agent/main.go
-```
diff --git a/.github/instructions/agent-samples.instructions.md b/.github/instructions/agent-samples.instructions.md
new file mode 100644
index 0000000..5c94b42
--- /dev/null
+++ b/.github/instructions/agent-samples.instructions.md
@@ -0,0 +1,89 @@
+---
+applyTo: "ai/vector-search-agent-*/**"
+---
+# Agent Samples (Multi-LLM Convention)
+
+Agent samples (`vector-search-agent-*`) use a **different convention** from quickstart samples. They orchestrate multiple LLM deployments and use a distinct set of environment variables. Do NOT mix agent conventions with quickstart conventions.
+
+## Architecture: Planner → Synthesizer
+
+Agent samples use a two-agent pipeline with three Azure OpenAI deployments:
+
+| Deployment | Role | Temperature | Purpose |
+|------------|------|-------------|---------|
+| Embedding | Vector search | — | Same as quickstart samples |
+| Planner | Tool-calling agent | 0.0 | Transforms user query → tool call → retrieves search results |
+| Synthesizer | Response generation | 0.3 | Takes search results + query → produces natural language recommendation |
+
+The planner invokes a `search_hotels_collection` tool that performs the vector search. The synthesizer receives the search results and generates a comparative hotel recommendation.
+
+## Agent Entry Points
+
+Agent samples have three separate entry points (not a single main file):
+
+| Entry Point | Purpose |
+|-------------|---------|
+| `upload` | Load hotel data, create embeddings, insert into DocumentDB, create vector index |
+| `agent` | Run planner → synthesizer pipeline against an existing collection |
+| `cleanup` | Drop the database |
+
+## Agent Environment Variables
+
+Agent samples use `AZURE_DOCUMENTDB_*` and `AZURE_OPENAI_*` prefixes consistently. These differ from quickstart variable names.
+
+| Agent Variable | Quickstart Equivalent | Notes |
+|---------------|----------------------|-------|
+| `AZURE_OPENAI_ENDPOINT` | `AZURE_OPENAI_EMBEDDING_ENDPOINT` | Single endpoint for all 3 deployments |
+| `AZURE_OPENAI_API_KEY` | — | For API key auth (not used in quickstarts) |
+| `AZURE_DOCUMENTDB_CLUSTER` | `MONGO_CLUSTER_NAME` | Cluster name for passwordless auth |
+| `AZURE_DOCUMENTDB_CONNECTION_STRING` | `MONGO_CONNECTION_STRING` | Full connection string |
+| `AZURE_DOCUMENTDB_COLLECTION` | — | Collection name (agents parameterize this) |
+| `AZURE_DOCUMENTDB_INDEX_NAME` | — | Vector index name (agents parameterize this) |
+| `VECTOR_INDEX_ALGORITHM` | `ALGORITHM` | Default: `vector-ivf` |
+| `VECTOR_SIMILARITY` | `SIMILARITY` | Default: `COS` |
+| `USE_PASSWORDLESS` | — | `true`/`false` toggle for auth mode |
+| `DEBUG` | — | `true`/`false` verbose logging |
+| `QUERY` | — | Default: `"quintessential lodging near running trails, eateries, retail"` |
+| `NEAREST_NEIGHBORS` | — | Default: `5` |
+
+**Agent-only variables (no quickstart equivalent):**
+
+| Variable | Default | Purpose |
+|----------|---------|---------|
+| `AZURE_OPENAI_EMBEDDING_DEPLOYMENT` | (required) | Embedding model deployment name |
+| `AZURE_OPENAI_EMBEDDING_API_VERSION` | 2024-06-01 (Go), 2023-05-15 (TS) | Embedding API version |
+| `AZURE_OPENAI_PLANNER_DEPLOYMENT` / `AZURE_OPENAI_PLANNER_MODEL` | (required) | Planner LLM deployment |
+| `AZURE_OPENAI_PLANNER_API_VERSION` | (required) | Planner API version |
+| `AZURE_OPENAI_SYNTH_DEPLOYMENT` / `AZURE_OPENAI_SYNTH_MODEL` | (required) | Synthesizer LLM deployment |
+| `AZURE_OPENAI_SYNTH_API_VERSION` | (required) | Synthesizer API version |
+| `IVF_NUM_LISTS` | 10 | IVF numLists (⚠️ differs from quickstart default of 1) |
+| `HNSW_M` | 16 | HNSW m parameter |
+| `HNSW_EF_CONSTRUCTION` | 64 | HNSW efConstruction parameter |
+| `DISKANN_MAX_DEGREE` | 20 | DiskANN maxDegree parameter |
+| `DISKANN_L_BUILD` | 10 | DiskANN lBuild parameter |
+
+## Agent Authentication
+
+Agents support passwordless (OIDC) and API key auth, toggled by `USE_PASSWORDLESS`.
+
+**OIDC scopes:**
+- DocumentDB: `https://ossrdbms-aad.database.windows.net/.default`
+- Azure OpenAI: `https://cognitiveservices.azure.com/.default`
+
+**MongoDB URI (passwordless):** `mongodb+srv://{cluster}.global.mongocluster.cosmos.azure.com/`
+- Auth mechanism: `MONGODB-OIDC` with machine callback
+
+## Language-Specific SDK Stacks
+
+| Language | MongoDB | OpenAI | Agent Framework |
+|----------|---------|--------|-----------------|
+| Go | `go.mongodb.org/mongo-driver` (raw) | `github.com/openai/openai-go/v3` (raw) | Manual tool-calling loop |
+| TypeScript | `mongodb` (cleanup only) | `@langchain/openai` | `langchain` + `@langchain/azure-cosmosdb` + `zod` |
+
+**TypeScript agents use LangChain.** The `@langchain/azure-cosmosdb` package manages the vector store, and `langchain`'s `createAgent` handles tool orchestration. This is a fundamentally different SDK stack from the quickstart TypeScript samples, which use the raw MongoDB driver.
+
+**Go agents use raw SDKs.** Both the MongoDB driver and the OpenAI SDK are used directly, with a manual tool-calling implementation.
+
+## IVF numLists Discrepancy
+
+Agent samples default to `IVF_NUM_LISTS=10`. Quickstart samples (vector-search, select-algorithm) hardcode `numLists=1`. This is intentional — agent samples are designed for tunable, production-like configurations while quickstart samples use minimal values for simplicity.
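Reviewer note (not part of the patch): the agent variable tables above could be exercised with a small loader like the one below. This is only a sketch under stated assumptions. The helper name `load_agent_settings` and the two constants are hypothetical and do not appear in the samples; the defaults are copied from the tables. The planner and synthesizer deployment names are left out because the table lists two alternative variable names for each.

```python
from typing import Mapping

# Defaults copied from the tables in agent-samples.instructions.md.
AGENT_DEFAULTS = {
    "VECTOR_INDEX_ALGORITHM": "vector-ivf",
    "VECTOR_SIMILARITY": "COS",
    "NEAREST_NEIGHBORS": "5",
    "IVF_NUM_LISTS": "10",
    "HNSW_M": "16",
    "HNSW_EF_CONSTRUCTION": "64",
    "DISKANN_MAX_DEGREE": "20",
    "DISKANN_L_BUILD": "10",
}

# Variables the tables mark as (required) under a single name.
AGENT_REQUIRED = (
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_OPENAI_PLANNER_API_VERSION",
    "AZURE_OPENAI_SYNTH_API_VERSION",
)


def load_agent_settings(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: validate required variables, then apply defaults."""
    missing = [name for name in AGENT_REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    settings = dict(AGENT_DEFAULTS)
    # Values from the environment win over the documented defaults.
    for name in (*AGENT_DEFAULTS, *AGENT_REQUIRED):
        if env.get(name):
            settings[name] = env[name]
    return settings
```

Passing `os.environ` as `env` keeps the loader consistent with the no-dotenv rule, since values only ever come from the process environment.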
diff --git a/.github/instructions/cli-examples.instructions.md b/.github/instructions/cli-examples.instructions.md
new file mode 100644
index 0000000..a85c2e9
--- /dev/null
+++ b/.github/instructions/cli-examples.instructions.md
@@ -0,0 +1,136 @@
+---
+applyTo: "ai/**"
+---
+# Running Samples — CLI Invocation
+
+Environment variables are passed inline with the run command. Do NOT use `.env` files. Each example below shows the required variables for a vector-search quickstart sample.
+
+## Go
+
+**Bash:**
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+go run ./src/ivf.go
+```
+
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+go run ./src/ivf.go
+```
+
+## Python
+
+**Bash:**
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+python src/ivf.py
+```
+
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+python src/ivf.py
+```
+
+## TypeScript/Node.js
+
+**Bash:**
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+npx tsx src/ivf.ts
+```
+
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+npx tsx src/ivf.ts
+```
+
+## Java
+
+**Bash:**
+```bash
+MONGO_CLUSTER_NAME=myCluster \
+AZURE_OPENAI_EMBEDDING_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002 \
+mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
+```
+
+**PowerShell:**
+```powershell
+$env:MONGO_CLUSTER_NAME="myCluster"
+$env:AZURE_OPENAI_EMBEDDING_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_MODEL="text-embedding-ada-002"
+mvn compile exec:java -Dexec.mainClass="com.azure.documentdb.sample.IVF"
+```
+
+## .NET
+
+.NET reads configuration from `appsettings.json`; environment variables in `Section__Key` form override those values:
+
+**Bash:**
+```bash
+DocumentDB__ClusterName=myCluster \
+AzureOpenAI__Endpoint=https://myendpoint.openai.azure.com/ \
+AzureOpenAI__DeploymentName=text-embedding-ada-002 \
+dotnet run
+```
+
+**PowerShell:**
+```powershell
+$env:DocumentDB__ClusterName="myCluster"
+$env:AzureOpenAI__Endpoint="https://myendpoint.openai.azure.com/"
+$env:AzureOpenAI__DeploymentName="text-embedding-ada-002"
+dotnet run
+```
+
+## Agent Samples (Multi-LLM)
+
+Agent samples require more variables for the planner and synthesizer deployments:
+
+**Bash:**
+```bash
+AZURE_OPENAI_ENDPOINT=https://myendpoint.openai.azure.com/ \
+AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-ada-002 \
+AZURE_OPENAI_EMBEDDING_API_VERSION=2024-06-01 \
+AZURE_OPENAI_PLANNER_DEPLOYMENT=gpt-4o \
+AZURE_OPENAI_PLANNER_API_VERSION=2024-06-01 \
+AZURE_OPENAI_SYNTH_DEPLOYMENT=gpt-4o \
+AZURE_OPENAI_SYNTH_API_VERSION=2024-06-01 \
+AZURE_DOCUMENTDB_CLUSTER=myCluster \
+AZURE_DOCUMENTDB_DATABASENAME=Hotels \
+AZURE_DOCUMENTDB_COLLECTION=hotels \
+AZURE_DOCUMENTDB_INDEX_NAME=vectorIndex \
+USE_PASSWORDLESS=true \
+go run ./cmd/agent/main.go
+```
+
+**PowerShell:**
+```powershell
+$env:AZURE_OPENAI_ENDPOINT="https://myendpoint.openai.azure.com/"
+$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT="text-embedding-ada-002"
+$env:AZURE_OPENAI_EMBEDDING_API_VERSION="2024-06-01"
+$env:AZURE_OPENAI_PLANNER_DEPLOYMENT="gpt-4o"
+$env:AZURE_OPENAI_PLANNER_API_VERSION="2024-06-01"
+$env:AZURE_OPENAI_SYNTH_DEPLOYMENT="gpt-4o"
+$env:AZURE_OPENAI_SYNTH_API_VERSION="2024-06-01"
+$env:AZURE_DOCUMENTDB_CLUSTER="myCluster"
+$env:AZURE_DOCUMENTDB_DATABASENAME="Hotels"
+$env:AZURE_DOCUMENTDB_COLLECTION="hotels"
+$env:AZURE_DOCUMENTDB_INDEX_NAME="vectorIndex"
+$env:USE_PASSWORDLESS="true"
+go run ./cmd/agent/main.go
+```

From 28db9284d6903623c965d2afd5e68fc54b676925 Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:27:39 -0700
Subject: [PATCH 7/8] refactor: extract auth & execution patterns to scoped
 instruction file
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- copilot-instructions.md now 107 lines (from 377 original)
- New: .github/instructions/execution-patterns.instructions.md
  Scoped to ai/vector-search-*/** — covers auth modes, lifecycle,
  naming conventions, search query, pipeline structure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md               | 49 -----------------
 .../execution-patterns.instructions.md        | 53 +++++++++++++++++++
 2 files changed, 53 insertions(+), 49 deletions(-)
 create mode 100644 .github/instructions/execution-patterns.instructions.md

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 732fca5..e4abf6e 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -93,55 +93,6 @@ All samples MUST use these environment variable names and defaults:
 - lBuild: 10
 - lSearch: 40
 
-## Authentication
-
-All samples support two authentication modes. **Passwordless (OIDC) is preferred.**
-
-### Passwordless Authentication (Recommended)
-- Uses `DefaultAzureCredential` / OIDC with `MONGO_CLUSTER_NAME`
-- Connection URI format: `mongodb+srv://{clusterName}.global.mongocluster.cosmos.azure.com/`
-- OIDC token scope: `https://ossrdbms-aad.database.windows.net/.default`
-- Each language implements a utility function pair: `getClients()` and `getClientsPasswordless()`
-
-### Connection String Authentication
-- Uses `MONGO_CONNECTION_STRING` with username/password
-- Format: `mongodb+srv://username:password@{cluster}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000`
-
-> **Note:** `mongocluster.cosmos.azure.com` is the current DocumentDB hostname — this is NOT a Cosmos DB reference.
-
-## Sample Execution Pattern
-
-All vector search samples follow this consistent lifecycle:
-
-1. **Initialize clients** — Create MongoDB and Azure OpenAI clients (passwordless preferred)
-2. **Drop collection** — Drop the algorithm-specific collection if it exists (clean start)
-3. **Create collection** — Create a fresh collection
-4. **Load data** — Read `Hotels_Vector.json` and batch-insert documents
-5. **Create vector index** — Create algorithm-specific vector index using `createIndexes` command with `cosmosSearch` key type
-6. **Generate query embedding** — Embed the search query text using Azure OpenAI
-7. **Perform vector search** — Run `$search` aggregation pipeline with `cosmosSearch` operator
-8. **Print results** — Display `HotelName` and `score` for top results
-9. **Cleanup** — Drop the collection in a `finally`/`defer` block
-
-### Naming Conventions
-- **Collection names:** `hotels_{algorithm}` — e.g., `hotels_ivf`, `hotels_hnsw`, `hotels_diskann`
-- **Index names:** `vectorIndex_{algorithm}` — e.g., `vectorIndex_ivf`, `vectorIndex_hnsw`, `vectorIndex_diskann`
-- **Database name:** `Hotels` (hardcoded, matches `AZURE_DOCUMENTDB_DATABASENAME` default)
-
-### Standard Search Query
-All samples use the same query text: `"quintessential lodging near running trails, eateries, retail"`
-
-### Vector Search Pipeline Structure
-All languages use the same aggregation pipeline structure:
-```
-[
-  { "$search": { "cosmosSearch": { "vector": <queryEmbedding>, "path": <vectorField>, "k": 5 } } },
-  { "$project": { "score": { "$meta": "searchScore" }, "document": "$$ROOT" } }
-]
-```
-
-> **Note:** `cosmosSearch` is a valid MongoDB API command name for DocumentDB — this is NOT a Cosmos DB reference.
-
 ## Rules
 
 1. **No Cosmos DB references.**Never use "Cosmos DB", "cosmosdb", "MongoDB vCore", or "mongo.cosmos.azure.com". Always use "Azure DocumentDB" and "documentdb.azure.com". Exception: `mongocluster.cosmos.azure.com` (hostname), `cosmosSearch` (API command), and `ms-azuretools.vscode-cosmosdb` (VS Code extension) are valid and NOT Cosmos references.
diff --git a/.github/instructions/execution-patterns.instructions.md b/.github/instructions/execution-patterns.instructions.md
new file mode 100644
index 0000000..91b1b83
--- /dev/null
+++ b/.github/instructions/execution-patterns.instructions.md
@@ -0,0 +1,53 @@
+---
+applyTo: "ai/vector-search-*/**"
+---
+# Sample Execution Patterns
+
+## Authentication
+
+All samples support two authentication modes. **Passwordless (OIDC) is preferred.**
+
+### Passwordless Authentication (Recommended)
+- Uses `DefaultAzureCredential` / OIDC with `MONGO_CLUSTER_NAME`
+- Connection URI format: `mongodb+srv://{clusterName}.global.mongocluster.cosmos.azure.com/`
+- OIDC token scope: `https://ossrdbms-aad.database.windows.net/.default`
+- Each language implements a utility function pair: `getClients()` and `getClientsPasswordless()`
+
+### Connection String Authentication
+- Uses `MONGO_CONNECTION_STRING` with username/password
+- Format: `mongodb+srv://username:password@{cluster}.mongocluster.cosmos.azure.com/?tls=true&authMechanism=SCRAM-SHA-256&retrywrites=false&maxIdleTimeMS=120000`
+
+> **Note:** `mongocluster.cosmos.azure.com` is the current DocumentDB hostname — this is NOT a Cosmos DB reference.
+
+## Sample Execution Pattern
+
+All vector search samples follow this consistent lifecycle:
+
+1. **Initialize clients** — Create MongoDB and Azure OpenAI clients (passwordless preferred)
+2. **Drop collection** — Drop the algorithm-specific collection if it exists (clean start)
+3. **Create collection** — Create a fresh collection
+4. **Load data** — Read `Hotels_Vector.json` and batch-insert documents
+5. **Create vector index** — Create algorithm-specific vector index using `createIndexes` command with `cosmosSearch` key type
+6. **Generate query embedding** — Embed the search query text using Azure OpenAI
+7. **Perform vector search** — Run `$search` aggregation pipeline with `cosmosSearch` operator
+8. **Print results** — Display `HotelName` and `score` for top results
+9. **Cleanup** — Drop the collection in a `finally`/`defer` block
+
+### Naming Conventions
+- **Collection names:** `hotels_{algorithm}` — e.g., `hotels_ivf`, `hotels_hnsw`, `hotels_diskann`
+- **Index names:** `vectorIndex_{algorithm}` — e.g., `vectorIndex_ivf`, `vectorIndex_hnsw`, `vectorIndex_diskann`
+- **Database name:** `Hotels` (hardcoded, matches `AZURE_DOCUMENTDB_DATABASENAME` default)
+
+### Standard Search Query
+All samples use the same query text: `"quintessential lodging near running trails, eateries, retail"`
+
+### Vector Search Pipeline Structure
+All languages use the same aggregation pipeline structure:
+```
+[
+  { "$search": { "cosmosSearch": { "vector": <queryEmbedding>, "path": <vectorField>, "k": 5 } } },
+  { "$project": { "score": { "$meta": "searchScore" }, "document": "$$ROOT" } }
+]
+```
+
+> **Note:** `cosmosSearch` is a valid operator name in the DocumentDB MongoDB API; it is NOT a Cosmos DB reference.
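Reviewer note (not part of the patch): the naming conventions and the passwordless URI format above can be written as three small helpers. This is a minimal sketch; the function names and the `ALGORITHMS` tuple are hypothetical and are not taken from the samples. Only the string formats come from the instruction file.

```python
# The three algorithms named in the instruction files.
ALGORITHMS = ("ivf", "hnsw", "diskann")


def _check(algorithm: str) -> str:
    """Reject anything other than the three documented algorithm names."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"Unknown algorithm: {algorithm!r}")
    return algorithm


def collection_name(algorithm: str) -> str:
    """Collection naming convention: hotels_{algorithm}."""
    return f"hotels_{_check(algorithm)}"


def index_name(algorithm: str) -> str:
    """Index naming convention: vectorIndex_{algorithm}."""
    return f"vectorIndex_{_check(algorithm)}"


def passwordless_uri(cluster_name: str) -> str:
    """Passwordless connection URI format from the Authentication section."""
    return f"mongodb+srv://{cluster_name}.global.mongocluster.cosmos.azure.com/"
```

For example, `collection_name("hnsw")` gives `hotels_hnsw`, and `passwordless_uri("myCluster")` gives the URI used with `MONGO_CLUSTER_NAME=myCluster`.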

From 0bd759663d7cda43ce861de0b0d873d0712d0cde Mon Sep 17 00:00:00 2001
From: "Dina Berry (She/her)" <diberry@microsoft.com>
Date: Fri, 8 May 2026 10:35:57 -0700
Subject: [PATCH 8/8] =?UTF-8?q?fix:=20address=20review=20feedback=20?=
 =?UTF-8?q?=E2=80=94=20naming,=20scoping,=20clarity?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add Sample Categories section (quickstart vs agent distinction)
- Add IVF numLists footnote (quickstart=1, agents=10 intentional)
- Clarify .NET appsettings.json + env var override pattern
- Add Rule 11: collection naming convention (hotels_{algorithm})
- Add Rule 12: k=5 for vector search results
- Make DescriptionVector explicit in pipeline template
- Add note that CLI examples apply to all 3 algorithms

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/copilot-instructions.md                      | 12 +++++++++---
 .github/instructions/cli-examples.instructions.md    |  2 ++
 .../instructions/execution-patterns.instructions.md  |  2 +-
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index e4abf6e..779dff3 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -17,7 +17,11 @@ ai/
 └── vector-search-agent-typescript/ # TypeScript agent sample (separate from quickstart)
 ```
 
-Each vector-search sample directory contains:
+### Sample Categories
+- **Quickstart samples** (`vector-search-{language}/`): Single algorithm per file, one entry point, uses `MONGO_CLUSTER_NAME` + quickstart env vars
+- **Agent samples** (`vector-search-agent-{language}/`): Multi-LLM orchestration, three entry points (upload/agent/cleanup), uses `AZURE_DOCUMENTDB_*` env vars
+
+Each quickstart sample directory contains:
 - `src/` — Source files: one per algorithm (`ivf`, `hnsw`, `diskann`) + `utils` + `create_embeddings` + `show_indexes`
 - `output/` — Expected output files: `ivf.txt`, `hnsw.txt`, `diskann.txt`
 - `README.md` — Setup, usage, and troubleshooting documentation
@@ -80,7 +84,7 @@ All samples MUST use these environment variable names and defaults:
 ## Consistent Algorithm Parameters
 
 ### IVF
-- numLists: 1
+- numLists: 1 *(quickstart samples; agent samples use `IVF_NUM_LISTS=10` for production-like config)*
 - nProbes: 1
 
 ### HNSW
@@ -100,8 +104,10 @@ All samples MUST use these environment variable names and defaults:
 3. **Data file path from env var.** Code reads `DATA_FILE_WITH_VECTORS` which defaults to `../data/Hotels_Vector.json` (the shared data location). .NET copies data locally to `data/Hotels_Vector.json` in the build output.
 4. **Batch size is LOAD_SIZE_BATCH=100.** Do not use BATCH_SIZE or other variants.
 5. **Database name variable is AZURE_DOCUMENTDB_DATABASENAME.** Do not use MONGO_DB_NAME or other variants.
-6. **.NET uses appsettings.json** with configuration sections: `AzureOpenAI`, `DataFiles`, `Embedding`, `MongoDB`, `VectorSearch`.
+6. **.NET uses appsettings.json** with configuration sections: `AzureOpenAI`, `DataFiles`, `Embedding`, `MongoDB`, `VectorSearch`. Environment variables override config using `Section__Key` format (e.g., `AzureOpenAI__Endpoint`).
 7. **Similarity metric is COS.** All vector index definitions use `"similarity": "COS"` (cosine similarity).
 8. **Output files are committed.** Each sample has an `output/` directory with expected output for each algorithm (`ivf.txt`, `hnsw.txt`, `diskann.txt`). Update these when output format changes.
 9. **DocumentDB supports all index types at any dataset size.** IVF, HNSW, and DiskANN are all available — do not imply tier restrictions limit algorithm availability.
 10. **No dotenv libraries.** Do NOT use `python-dotenv`, `godotenv`, `dotenv` (npm), or any `.env` file-loading library. Environment variables must be passed via the CLI invocation, not loaded from `.env` files at runtime. This keeps samples explicit and avoids hidden configuration.
+11. **Collection naming:** `hotels_{algorithm}` (e.g., `hotels_ivf`, `hotels_hnsw`, `hotels_diskann`). Index naming: `vectorIndex_{algorithm}`.
+12. **Vector search uses k=5.** All samples return top 5 results. Do not parameterize k unless explicitly required.
diff --git a/.github/instructions/cli-examples.instructions.md b/.github/instructions/cli-examples.instructions.md
index a85c2e9..678fba3 100644
--- a/.github/instructions/cli-examples.instructions.md
+++ b/.github/instructions/cli-examples.instructions.md
@@ -5,6 +5,8 @@ applyTo: "ai/**"
 
 Environment variables are passed inline with the run command. Do NOT use `.env` files. Each example below shows the required variables for a vector-search quickstart sample.
 
+> **Note:** Examples show `ivf` but the same pattern applies to all algorithms — replace `ivf` with `hnsw` or `diskann` in file/class names.
+
 ## Go
 
 **Bash:**
diff --git a/.github/instructions/execution-patterns.instructions.md b/.github/instructions/execution-patterns.instructions.md
index 91b1b83..d97db20 100644
--- a/.github/instructions/execution-patterns.instructions.md
+++ b/.github/instructions/execution-patterns.instructions.md
@@ -45,7 +45,7 @@ All samples use the same query text: `"quintessential lodging near running trail
 All languages use the same aggregation pipeline structure:
 ```
 [
-  { "$search": { "cosmosSearch": { "vector": <queryEmbedding>, "path": <vectorField>, "k": 5 } } },
+  { "$search": { "cosmosSearch": { "vector": <queryEmbedding>, "path": "DescriptionVector", "k": 5 } } },
   { "$project": { "score": { "$meta": "searchScore" }, "document": "$$ROOT" } }
 ]
 ```
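Reviewer note (not part of the patch): the pipeline template in execution-patterns.instructions.md maps directly onto a plain Python list of dicts. The sketch below assumes a hypothetical helper name, `build_vector_search_pipeline`; the stage shapes, the `DescriptionVector` path, and `k=5` come from the template and Rule 12.

```python
from typing import Sequence


def build_vector_search_pipeline(
    query_embedding: Sequence[float],
    k: int = 5,
    path: str = "DescriptionVector",
) -> list:
    """Hypothetical helper: build the two-stage pipeline from the template."""
    return [
        {
            "$search": {
                "cosmosSearch": {
                    "vector": list(query_embedding),  # query embedding values
                    "path": path,                     # vector field in each document
                    "k": k,                           # number of results to return
                }
            }
        },
        {
            "$project": {
                "score": {"$meta": "searchScore"},
                "document": "$$ROOT",
            }
        },
    ]
```

With pymongo, the returned list would be passed to `collection.aggregate(...)`; each result then carries a `score` and the full hotel under `document`.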