@therealnb therealnb commented Jan 26, 2026

Add MCP Optimizer Implementation for Semantic Tool Discovery

This PR adds the complete MCP optimizer implementation to vMCP, enabling semantic tool discovery and reducing token usage for LLMs working with large toolsets.

Overview

The optimizer allows vMCP to expose optim.find_tool and optim.call_tool operations instead of all backend tools directly. This reduces token usage by allowing LLMs to discover relevant tools on demand via semantic search rather than receiving all tool definitions upfront.

Features

Core Optimizer Package

Semantic Tool Search (pkg/optimizer/)

  • Vector embeddings (384-dim) for semantic similarity search
  • Full-text search via SQLite FTS5 for BM25 text matching
  • Hybrid search combining semantic and BM25 results (configurable ratio)
  • Multiple embedding backends:
    • Ollama (local HTTP API)
    • OpenAI-compatible (vLLM, OpenAI, etc.)
    • Placeholder (deterministic hash-based, for testing)
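A deterministic hash-based embedder of the kind the placeholder backend describes could look roughly like the sketch below. This is not the PR's code: the function name `placeholderEmbed`, the use of SHA-256, and the unit-length normalization are all assumptions made for illustration. Only the 384-dimension size comes from the description above.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math"
)

// dim matches the 384-dimensional vectors mentioned in the PR description.
const dim = 384

// placeholderEmbed is a hypothetical stand-in for a hash-based test embedder.
// Each group of 8 components comes from SHA-256(counter || text), so the same
// text always produces the same vector. The result carries no semantic meaning.
func placeholderEmbed(text string) []float32 {
	vec := make([]float32, dim)
	var sumSq float64
	for i := 0; i < dim; i += 8 {
		var ctr [4]byte
		binary.BigEndian.PutUint32(ctr[:], uint32(i))
		h := sha256.Sum256(append(ctr[:], text...))
		for j := 0; j < 8 && i+j < dim; j++ {
			u := binary.BigEndian.Uint32(h[j*4:])
			// Map the 32-bit value into [-1, 1].
			v := float64(u)/float64(math.MaxUint32)*2 - 1
			vec[i+j] = float32(v)
			sumSq += v * v
		}
	}
	norm := math.Sqrt(sumSq)
	if norm == 0 {
		return vec
	}
	// Scale to approximately unit length so cosine and dot product agree.
	for i := range vec {
		vec[i] = float32(float64(vec[i]) / norm)
	}
	return vec
}

func main() {
	a := placeholderEmbed("list open pull requests")
	b := placeholderEmbed("list open pull requests")
	fmt.Println(len(a), a[0] == b[0])
}
```

An embedder like this is only useful for exercising storage and plumbing in tests, since similar texts do not land near each other.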

Token Counting (pkg/optimizer/tokens/)

  • LLM cost estimation based on token counts
  • Supports monitoring token usage and optimization effectiveness
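As a rough illustration of how token counts translate into a cost estimate and a savings figure, consider the sketch below. The function names and the per-million-token price are hypothetical and are not taken from pkg/optimizer/tokens/.

```go
package main

import "fmt"

// estimateCostUSD converts a token count into a cost, given a price per
// one million tokens. The price is an illustrative assumption.
func estimateCostUSD(tokens int, usdPerMillionTokens float64) float64 {
	return float64(tokens) * usdPerMillionTokens / 1e6
}

// savingsPercent reports how much smaller the optimized token count is
// relative to the baseline (all tool definitions sent upfront).
func savingsPercent(baselineTokens, optimizedTokens int) float64 {
	if baselineTokens <= 0 {
		return 0
	}
	return float64(baselineTokens-optimizedTokens) / float64(baselineTokens) * 100
}

func main() {
	baseline, optimized := 12000, 900 // made-up token counts
	fmt.Printf("baseline cost: $%.4f\n", estimateCostUSD(baseline, 3.0))
	fmt.Printf("optimized cost: $%.4f\n", estimateCostUSD(optimized, 3.0))
	fmt.Printf("savings: %.1f%%\n", savingsPercent(baseline, optimized))
}
```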

Database Layer (pkg/optimizer/db/)

  • SQLite-based storage with sqlite-vec for vector similarity search
  • FTS5 for full-text search
  • Hybrid search implementation combining both approaches
  • Persistent and in-memory storage options
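One plausible way to combine the two result sets is to give the semantic results a share of the result slots equal to the configured ratio and fill the rest from BM25, skipping duplicates. The sketch below assumes that interpretation. The PR description does not say how the ratio is applied, and `mergeHybrid` is a hypothetical name.

```go
package main

import "fmt"

// mergeHybrid merges two ranked lists of tool names. ratio is the percentage
// (0-100) of the limit reserved for semantic results; remaining slots are
// filled from the BM25 list. Tools already taken are not repeated.
func mergeHybrid(semantic, bm25 []string, ratio, limit int) []string {
	if limit <= 0 {
		return nil
	}
	if ratio < 0 {
		ratio = 0
	}
	if ratio > 100 {
		ratio = 100
	}
	semLimit := limit * ratio / 100
	seen := make(map[string]bool)
	out := make([]string, 0, limit)
	take := func(src []string, n int) {
		for _, name := range src {
			if n <= 0 || len(out) >= limit {
				return
			}
			if seen[name] {
				continue
			}
			seen[name] = true
			out = append(out, name)
			n--
		}
	}
	take(semantic, semLimit)
	take(bm25, limit-len(out))
	return out
}

func main() {
	semantic := []string{"github_list_prs", "github_get_pr", "jira_search"}
	bm25 := []string{"github_get_pr", "github_merge_pr", "slack_post"}
	fmt.Println(mergeHybrid(semantic, bm25, 50, 4))
}
```

A weighted blend of normalized scores would be an equally reasonable reading of "configurable ratio"; the slot-based split is shown here only because it needs no score normalization.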

Ingestion Service (pkg/optimizer/ingestion/)

  • Ingests tools from all backends in the group
  • Generates embeddings for tool metadata
  • Maintains searchable index of all available tools

vMCP Integration

Optimizer Endpoints (pkg/vmcp/optimizer/)

  • optim.find_tool: Semantic and string-based tool discovery
  • optim.call_tool: Tool invocation with automatic routing
  • Integration with vMCP server lifecycle
  • Comprehensive test coverage (unit, integration, semantic search, string matching)

Server Integration (pkg/vmcp/server/)

  • Optimizer initialization and lifecycle management
  • Session-based optimizer instances
  • Integration with discovery manager and backend registry

Router Updates (pkg/vmcp/router/)

  • Special handling for optim_* prefixed tools
  • Prevents routing optimizer tools to backends (handled by vMCP itself)
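The routing rule above amounts to a prefix check before backend lookup. A minimal sketch, assuming the `optim_` prefix from the bullet above and hypothetical names `isOptimizerTool` and `routeTarget`:

```go
package main

import (
	"fmt"
	"strings"
)

// optimizerPrefix follows the "optim_*" wording in the router notes.
const optimizerPrefix = "optim_"

// isOptimizerTool reports whether a tool name belongs to the optimizer
// and should therefore be handled by vMCP itself.
func isOptimizerTool(name string) bool {
	return strings.HasPrefix(name, optimizerPrefix)
}

// routeTarget is a hypothetical helper that returns where a call would go.
func routeTarget(name string) string {
	if isOptimizerTool(name) {
		return "vmcp"
	}
	return "backend"
}

func main() {
	for _, name := range []string{"optim_find_tool", "github_list_prs"} {
		fmt.Printf("%s -> %s\n", name, routeTarget(name))
	}
}
```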

Kubernetes Operator Support

Service Resolution (cmd/thv-operator/pkg/vmcpconfig/converter.go)

  • Resolves Kubernetes Service names to URLs for embedding services
  • Handles embeddingService to embeddingURL conversion
  • Supports in-cluster deployments
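Resolving a Service name to a URL typically relies on the standard in-cluster DNS form `<service>.<namespace>.svc.cluster.local`. The sketch below assumes that form, an `http` scheme, and an explicit port. The helper name `serviceURL` is hypothetical, and converter.go may build the URL differently.

```go
package main

import "fmt"

// serviceURL builds an in-cluster URL for a Kubernetes Service using the
// default cluster DNS domain. Scheme and port handling are assumptions.
func serviceURL(name, namespace string, port int) string {
	return fmt.Sprintf("http://%s.%s.svc.cluster.local:%d", name, namespace, port)
}

func main() {
	// Example values only; the service name and port are made up.
	fmt.Println(serviceURL("embedding-server", "toolhive-system", 8080))
}
```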

CRD Schema (deploy/charts/operator-crds/)

  • Complete optimizer configuration schema
  • Supports all optimizer features (embeddings, persistence, hybrid search)
  • Documentation updates

Configuration

OptimizerConfig (pkg/vmcp/config/config.go)

  • Comprehensive configuration options:
    • enabled: Enable/disable optimizer
    • embeddingBackend: Choose embedding provider
    • embeddingURL: Embedding service URL
    • embeddingModel: Model name for embeddings
    • embeddingDimension: Vector dimension
    • persistPath: Optional persistence path
    • ftsDBPath: FTS5 database path
    • hybridSearchRatio: Semantic vs BM25 mix (0-100%)
    • embeddingService: Kubernetes service name (K8s only)
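The options above map naturally onto a flat Go struct. The sketch below is an assumption about its shape, not a copy of pkg/vmcp/config/config.go: the field names and types are guessed from the key names. A later commit in this PR mentions handling HybridSearchRatio as a pointer with a default of 70, which is what `effectiveHybridRatio` (a hypothetical helper) illustrates.

```go
package main

import "fmt"

// OptimizerConfig mirrors the keys listed above.
// Field names and types are assumptions made for illustration.
type OptimizerConfig struct {
	Enabled            bool
	EmbeddingBackend   string
	EmbeddingURL       string
	EmbeddingModel     string
	EmbeddingDimension int
	PersistPath        string
	FTSDBPath          string
	HybridSearchRatio  *int // nil means "not set"
	EmbeddingService   string
}

// defaultHybridSearchRatio is the default named in a later commit message.
const defaultHybridSearchRatio = 70

// effectiveHybridRatio returns the configured ratio, or the default when the
// field is unset. A pointer lets an explicit 0 differ from "not set".
func effectiveHybridRatio(cfg *OptimizerConfig) int {
	if cfg == nil || cfg.HybridSearchRatio == nil {
		return defaultHybridSearchRatio
	}
	return *cfg.HybridSearchRatio
}

func main() {
	zero := 0
	fmt.Println(effectiveHybridRatio(&OptimizerConfig{Enabled: true}))
	fmt.Println(effectiveHybridRatio(&OptimizerConfig{HybridSearchRatio: &zero}))
}
```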

CLI Integration (cmd/vmcp/app/commands.go)

  • Optimizer configuration parsing from YAML
  • Runtime configuration and initialization
  • Logging and status reporting

Build System

Build Tags (Taskfile.yml)

  • Added -tags="fts5" build flag for SQLite FTS5 support
  • Required for optimizer functionality
  • Applied to all vmcp builds (build, install)

Test Task (Taskfile.yml)

  • Added test-optimizer task for optimizer integration tests
  • Uses sqlite-vec for vector search testing

Examples & Scripts

Example Configuration (examples/vmcp-config-optimizer.yaml)

  • Complete example showing optimizer configuration
  • Demonstrates all configuration options

Helper Scripts (scripts/)

  • test-optimizer-with-sqlite-vec.sh: Integration testing
  • inspect-optimizer-db.sh: Database inspection
  • query-optimizer-db.sh: Query testing
  • Various chromem inspection tools

Documentation

  • Optimizer package documentation in pkg/optimizer/README.md
  • Integration guide in pkg/optimizer/INTEGRATION.md
  • CRD API documentation updates
  • Example configurations

Testing

  • Comprehensive unit tests for all optimizer components
  • Integration tests for optimizer endpoints
  • Semantic search test suite
  • String matching test suite
  • E2E tests for Kubernetes deployments

Dependencies

  • chromem-go: Vector database for embeddings
  • sqlite-vec: SQLite extension for vector similarity search
  • go.uber.org/mock: Mock generation for tests

Build Requirements

  • Requires -tags="fts5" build flag for FTS5 support
  • SQLite with FTS5 extension
  • sqlite-vec extension for vector search

Related

Large PR Justification

  • This is the second part of a two-part PR.

@therealnb therealnb requested a review from jerm-dro January 26, 2026 12:05
@therealnb therealnb force-pushed the optimizer-implementation branch from fac6152 to d809cfa Compare January 26, 2026 12:13
@therealnb therealnb changed the base branch from main to optimizer-enablers January 26, 2026 12:13
@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 26, 2026

@github-actions github-actions bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 26, 2026
@therealnb therealnb force-pushed the optimizer-implementation branch from 8d707ff to 5c0713a Compare January 26, 2026 12:17
@therealnb therealnb force-pushed the optimizer-implementation branch from 5c0713a to 16bbbfc Compare January 26, 2026 12:25
@github-actions github-actions bot dismissed their stale review January 26, 2026 15:27

Large PR justification has been provided. Thank you!

@github-actions

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

@codecov

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 45.54580% with 868 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.49%. Comparing base (e00d514) to head (21f90c6).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/vmcp/optimizer/optimizer.go | 10.07% | 352 Missing and 5 partials ⚠️ |
| pkg/vmcp/optimizer/internal/ingestion/service.go | 0.00% | 157 Missing ⚠️ |
| pkg/vmcp/optimizer/internal/db/fts.go | 69.30% | 48 Missing and 14 partials ⚠️ |
| pkg/vmcp/optimizer/internal/embeddings/ollama.go | 26.22% | 42 Missing and 3 partials ⚠️ |
| pkg/vmcp/optimizer/internal/embeddings/manager.go | 47.56% | 32 Missing and 11 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/backend_tool.go | 63.30% | 20 Missing and 20 partials ⚠️ |
| pkg/vmcp/server/server.go | 10.00% | 23 Missing and 4 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/hybrid.go | 64.86% | 17 Missing and 9 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/db.go | 75.58% | 17 Missing and 4 partials ⚠️ |
| cmd/vmcp/app/commands.go | 0.00% | 18 Missing ⚠️ |

... and 10 more

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3440      +/-   ##
==========================================
- Coverage   65.15%   64.49%   -0.67%     
==========================================
  Files         398      412      +14     
  Lines       38821    40301    +1480     
==========================================
+ Hits        25295    25992     +697     
- Misses      11564    12266     +702     
- Partials     1962     2043      +81     

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 26, 2026
Eliminate the intermediate optimizer.Config type and ConfigFromVMCPConfig
conversion function to use config.OptimizerConfig directly throughout the
optimizer package. This addresses maintainability concerns by establishing
a single source of truth for optimizer configuration.

Changes:
- Delete pkg/vmcp/optimizer/config.go containing the duplicate config type
- Update optimizer.Factory and EmbeddingOptimizer to use *config.OptimizerConfig
- Flatten embedding config in ingestion.Config (individual fields vs nested)
- Add type aliases (Config, OptimizerIntegration) for test compatibility
- Add test helper methods (OnRegisterSession, RegisterTools, IngestToolsForTesting)
- Update all test files to use flattened config structure
- Handle HybridSearchRatio as pointer with default value (70)

Benefits:
- Single source of truth (no config duplication)
- No synchronization burden between config types
- Eliminates risk of translation bugs
- Clearer code flow without intermediate transformations

Closes review comment in PR #3440 requesting removal of translation layers.
@therealnb therealnb force-pushed the optimizer-implementation branch from 53b7e8d to 8d16359 Compare January 28, 2026 11:43
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
Change optimizerIntegration field type from undefined OptimizerIntegration
to optimizer.Optimizer to fix compilation errors.

Fixes:
- undefined: OptimizerIntegration (typecheck)
- E2E test failures
- Linting failures
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
Signed-off-by: nigel brown <nigel@stacklok.com>
@github-actions github-actions bot removed the size/XL Extra large PR: 1000+ lines changed label Jan 28, 2026
Add nil receiver checks to IngestInitialBackends, OnRegisterSession,
and Close methods to prevent panics when called on nil *EmbeddingOptimizer.

The tests explicitly test nil integration handling, so these methods
must safely handle nil receivers.

Fixes:
- TestClose_NilIntegration panic
- TestIngestInitialBackends_NilIntegration panic
- TestOnRegisterSession_NilIntegration panic
- All related optimizer unit test failures
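The nil receiver pattern described in this commit can be sketched as follows. The type here is a simplified stand-in: the real EmbeddingOptimizer has different fields and its method signatures are not shown in this thread, so the parameters and return values below are assumptions.

```go
package main

import "fmt"

// EmbeddingOptimizer is a simplified stand-in for the type named in the
// commit message. Fields and signatures are illustrative assumptions.
type EmbeddingOptimizer struct {
	closed bool
}

// Close is safe to call on a nil *EmbeddingOptimizer. In Go, a method with a
// pointer receiver can be invoked on a nil pointer; it only panics if the
// body dereferences the receiver.
func (o *EmbeddingOptimizer) Close() error {
	if o == nil {
		return nil
	}
	o.closed = true
	return nil
}

// IngestInitialBackends returns early on a nil receiver instead of panicking.
func (o *EmbeddingOptimizer) IngestInitialBackends(backends []string) (int, error) {
	if o == nil {
		return 0, nil
	}
	return len(backends), nil
}

func main() {
	var o *EmbeddingOptimizer // optimizer disabled, so the pointer is nil
	n, err := o.IngestInitialBackends([]string{"github", "jira"})
	fmt.Println(n, err, o.Close())
}
```

Guarding inside the methods keeps callers free of repeated nil checks when the optimizer is disabled.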
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
- Fix line length violations (lll) by wrapping long lines
- Remove unused processedSessions field from EmbeddingOptimizer
- Remove unused sync import
- Change unused receivers to _ in convertSearchResults and resolveToolTarget
- Rename unused ctx parameter to _ in NewEmbeddingOptimizer
- Remove unused deserializeServerMetadata, update, and delete functions
- Simplify createTestDatabase to return only Database (not unused embeddingFunc)
- Add nolint directive for OptimizerIntegration type alias (kept for test compatibility)

Fixes all golangci-lint errors:
- lll: 2 line length violations
- revive: 4 unused parameter/receiver issues
- unparam: 1 unused return value
- unused: 4 unused functions/fields
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026