Skip to content

Conversation

@therealnb
Copy link
Contributor

@therealnb therealnb commented Jan 21, 2026

Merge jerm/2026-01-13-optimizer-in-vmcp into main

This PR merges 17 commits that integrate the MCP optimizer into vMCP, adding semantic tool discovery, observability, Kubernetes support, and various bug fixes and improvements.

Core Optimizer Integration

Add Optimizer Package (#3253)

  • Introduced optimizer package - Go port of the mcp-optimizer Python service
  • Semantic tool search using vector embeddings (384-dim)
  • Token counting for LLM cost estimation
  • Full-text search via SQLite FTS5
  • Multiple embedding backends: Ollama, vLLM, or placeholder (testing)
  • Production-ready database with sqlite-vec for vector similarity search

Add Optimizer Integration Endpoints (#3318)

  • Added find_tool and call_tool endpoints to vMCP optimizer
  • Implemented semantic search and string matching for tool discovery
  • Updated optimizer integration documentation
  • Added test scripts for optimizer functionality

Resolve Tool Names in optim.find_tool (#3337)

  • Fixed tool name resolution to match routing table
  • Ensures consistent tool discovery and routing

Observability & Metrics

Add Token Metrics and Observability (#3347)

  • Added comprehensive token metrics to optimizer integration
  • Enables monitoring of token usage and optimization effectiveness

Add OpenTelemetry Tracing to Capability Aggregation

  • Added tracing spans to all aggregator methods for visibility in Jaeger
  • Includes spans for:
    • AggregateCapabilities (parent span)
    • QueryAllCapabilities (parallel backend queries)
    • QueryCapabilities (per-backend queries)
    • ResolveConflicts (conflict resolution)
    • MergeCapabilities (final merge)
  • All spans include relevant attributes like backend counts, tool/resource/prompt counts, and error recording

Kubernetes Integration

Add Dynamic/Static Mode Support (#3235)

  • Added dynamic/static mode support to VirtualMCPServer operator
  • Enables flexible deployment configurations

Add DeepCopy and Kubernetes Service Resolution

  • Used DeepCopy() for automatic passthrough of config fields (Optimizer, Metadata, etc.)
  • Added resolveEmbeddingService() to resolve Kubernetes Service names to URLs
  • Ensures optimizer config is properly converted from CRD to runtime config
  • Resolves embeddingService references in Kubernetes deployments

Kubernetes Optimizer Integration Fixes (#3359)

  • Added CLI fallback for embeddingService when not resolved by operator
  • Normalized localhost to 127.0.0.1 in embeddings to avoid IPv6 issues
  • Added HTTP timeout (30s) to prevent hanging connections
  • Removed WithContinuousListening() to use timeout-based approach

Testing & Reliability Improvements

Run API E2E Test Server as Standalone Process (#3356)

  • Changed test server to run as standalone process instead of in-process
  • Uses full binary to ensure realistic test scenarios

Fix Flaky E2E Tests

  • Add HTTP client timeout to health check (10s timeout)
  • Add pod readiness checks before health endpoint verification
  • Skip completed pods in checkPodsReady to prevent flaky test failures
  • Improved error messages to help diagnose connection reset issues

Fix Unrecognized Dotty Names

  • Fixed issue with unrecognized dotty names in the codebase

Infrastructure

Bump Operator CRDs Chart Version

  • Updated operator-crds chart version to 0.0.97 after rebase

Documentation Updates

  • Updated vmcp/README with optimizer integration information

Summary

This PR consolidates the complete integration of the MCP optimizer into vMCP, enabling semantic tool discovery, reducing token usage, and providing comprehensive observability. The integration includes full Kubernetes support, robust error handling, and improved test reliability.

Related PRs

Large PR Justification

  • Generated code that cannot be split
  • Large new feature that must be atomic
  • Multiple related changes that would break if separated

@therealnb therealnb force-pushed the jerm/2026-01-13-optimizer-in-vmcp branch from 22e020c to 3f3a011 Compare January 21, 2026 15:40
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 21, 2026
@therealnb therealnb mentioned this pull request Jan 21, 2026
@therealnb
Copy link
Contributor Author

Here's the demo scripts I mentioned #3375

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 21, 2026
@codecov
Copy link

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 27.52747% with 1319 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.19%. Comparing base (ffede88) to head (6d4af12).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
pkg/vmcp/optimizer/optimizer.go 6.38% 436 Missing and 4 partials ⚠️
cmd/thv-operator/pkg/optimizer/db/fts.go 10.89% 175 Missing and 5 partials ⚠️
...md/thv-operator/pkg/optimizer/ingestion/service.go 0.00% 157 Missing ⚠️
cmd/thv-operator/pkg/optimizer/db/backend_tool.go 9.80% 137 Missing and 1 partial ⚠️
...md/thv-operator/pkg/optimizer/db/backend_server.go 12.28% 99 Missing and 1 partial ⚠️
cmd/thv-operator/pkg/optimizer/db/hybrid.go 0.00% 74 Missing ⚠️
...md/thv-operator/pkg/optimizer/embeddings/ollama.go 26.22% 42 Missing and 3 partials ⚠️
...d/thv-operator/pkg/optimizer/embeddings/manager.go 47.56% 32 Missing and 11 partials ⚠️
pkg/vmcp/server/server.go 6.81% 35 Missing and 6 partials ⚠️
cmd/vmcp/app/commands.go 0.00% 28 Missing ⚠️
... and 9 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3373      +/-   ##
==========================================
- Coverage   64.83%   63.19%   -1.65%     
==========================================
  Files         382      398      +16     
  Lines       37219    39033    +1814     
==========================================
+ Hits        24132    24668     +536     
- Misses      11200    12449    +1249     
- Partials     1887     1916      +29     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 21, 2026
@github-actions github-actions bot dismissed their stale review January 21, 2026 17:19

Large PR justification has been provided. Thank you!

@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 21, 2026
@github-actions
Copy link
Contributor

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 21, 2026
jerm-dro and others added 8 commits January 21, 2026 18:02
#3253)

* feat: Add optimizer package with semantic tool discovery and ingestion

This PR introduces the optimizer package, a Go port of the mcp-optimizer Python
service that provides semantic tool discovery and ingestion for MCP servers.

- **Semantic tool search** using vector embeddings (384-dim)
- **Token counting** for LLM cost estimation
- **Full-text search** via SQLite FTS5
- **Multiple embedding backends**: Ollama, vLLM, or placeholder (testing)
- **Production-ready database** with sqlite-vec for vector similarity search
* feat: Add optimizer integration endpoints and tool discovery

- Add find_tool and call_tool endpoints to vmcp optimizer
- Add semantic search and string matching for tool discovery
- Update optimizer integration documentation
- Add test scripts for optimizer functionality
)

* fix: Resolve tool names in optim.find_tool to match routing table
* feat: Add token metrics and observability to optimizer integration
…failures

The checkPodsReady function was checking all pods with matching labels,
including old pods that had completed (Phase: Succeeded) from previous
deployments. This caused the auth discovery e2e test to fail when old
pods were still present during deployment updates.

Fix: Skip pods that are not in Running phase and ensure at least one
running pod exists after filtering.
The test was failing with 'connection reset by peer' errors when trying
to connect to the health endpoint. This can happen if pods crash or
restart between the BeforeAll setup and the actual test execution.

Fix: Add explicit pod readiness verification right before the health check
and also check pod readiness inside the Eventually loop to catch pods
that crash during health check retries. This makes the test more robust
by ensuring pods are stable before attempting HTTP connections.
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 23, 2026
@therealnb therealnb force-pushed the jerm/2026-01-13-optimizer-in-vmcp branch from d7c874e to 0aa6751 Compare January 23, 2026 17:05
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 23, 2026
- Revert server.go to cleaner version with minimal optimizer-specific code
- Create OptimizerIntegration interface that encapsulates all optimizer logic
- Add Initialize() method to handle global tool registration and backend ingestion
- Move optimizer initialization logic behind the interface
- Add per-backend ingestion spans for better observability
- Create helper function for config conversion to maintain backward compatibility

This refactoring makes the optimizer integration fully self-contained and modular,
with server.go acting as a thin orchestration layer.
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 23, 2026
The OnRegisterSession method was removed during refactoring but is still
used by tests. Added it back as a legacy method that does nothing since
ingestion is now handled by Initialize(). This maintains backward
compatibility with existing tests while the new HandleSessionRegistration
method is used in production code.
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 23, 2026
Run gofmt and goimports to fix formatting issues.
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 23, 2026
- Move OptimizerIntegration interface from server.go to optimizer package
- Rename interface to Integration for clarity within optimizer package
- Update server.go to import and use optimizer.Integration
- Fix unused parameter lint error (rename ctx to _)
- Add compile-time interface implementation check
The OptimizerIntegration interface was moved to the optimizer package,
so mockgen no longer generates it in server.go mocks.
… package

- Remove duplicate OptimizerConfig type from server.go
- Create ConfigFromVMCPConfig helper in optimizer package for conversion
- Update CLI to use optimizer.Config directly via conversion helper
- Remove createOptimizerIntegrationFromConfig helper function
- Remove unused embeddings import from server.go

This eliminates unnecessary duplication and improves separation of concerns.
The optimizer package now owns the conversion logic from config types.
…Reporter

- Move optimizer initialization from New() to Start() to match main branch structure
- Restore StatusReporter functionality (was removed to match main, but needed for operator)
- Fix optimizer_test.go to use optimizer.Config instead of removed server.OptimizerConfig
- Update test configs to use EmbeddingConfig structure

This makes server.go structure closer to main while maintaining both optimizer
and StatusReporter functionality.
@therealnb
Copy link
Contributor Author

This has been moved to two component PRs #3439 and #3440.

@therealnb therealnb closed this Jan 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants