
feat: enhance caching mechanisms to prevent memory leaks #847

Merged
MODSetter merged 2 commits into main from dev
Feb 28, 2026
Conversation

MODSetter (Owner) commented Feb 28, 2026

Description

Motivation and Context

FIX #

Screenshots

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR enhances caching mechanisms across the application to prevent memory leaks and reduce memory growth. The core changes introduce a cached singleton pattern for ChatLiteLLMRouter instances to avoid repeated initialization overhead, add size limits and eviction strategies to various caches (MCP tools cache, sandbox cache, rate limit cache), and trigger explicit garbage collection after chat stream completion to reclaim objects with circular references. The get_auto_mode_llm() function now returns cached router instances keyed by streaming mode, and the router's context profile computation is cached globally to avoid repeated model info lookups.
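As a rough illustration of the cached singleton pattern described above, here is a minimal sketch of a router cache keyed by streaming mode. The names `get_auto_mode_llm` and the streaming-mode key come from the PR summary; the `Router` class and its constructor are hypothetical stand-ins for `ChatLiteLLMRouter`, whose real initialization lives in the project's LLM service layer.

```python
from threading import Lock


class Router:
    """Hypothetical stand-in for ChatLiteLLMRouter; the real class
    performs expensive initialization (model info lookups, etc.)."""

    def __init__(self, streaming: bool):
        self.streaming = streaming


# Module-level cache: at most one router per streaming mode, so the
# cache is bounded (two entries) and instances are reused.
_router_cache: dict[bool, Router] = {}
_cache_lock = Lock()


def get_auto_mode_llm(streaming: bool = False) -> Router:
    """Return a cached router instance keyed by streaming mode,
    avoiding repeated initialization overhead on every call."""
    with _cache_lock:
        router = _router_cache.get(streaming)
        if router is None:
            router = Router(streaming)
            _router_cache[streaming] = router
        return router
```

Because the cache key space is just `{True, False}`, this cache cannot grow unboundedly, which is the property the PR relies on to stop per-request router allocations from accumulating.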

⏱️ Estimated Review Time: 30-90 minutes

💡 Review Order Suggestion
1. surfsense_backend/app/services/llm_router_service.py
2. surfsense_backend/app/services/llm_service.py
3. surfsense_backend/app/agents/new_chat/llm_config.py
4. surfsense_backend/app/agents/new_chat/tools/mcp_tool.py
5. surfsense_backend/app/agents/new_chat/sandbox.py
6. surfsense_backend/app/app.py
7. surfsense_backend/app/tasks/chat/stream_new_chat.py


- Improved in-memory rate limiting by evicting timestamps outside the current window and cleaning up empty keys.
- Updated LLM router service to cache context profiles and avoid redundant computations.
- Introduced cache eviction logic for MCP tools and sandbox instances to manage memory usage effectively.
- Added garbage collection triggers in chat streaming functions to reclaim resources promptly.

vercel bot commented Feb 28, 2026

The latest updates on your projects:

surf-sense-frontend — Building — Preview, Comment — Feb 28, 2026 1:57am (UTC)


@MODSetter MODSetter merged commit 0a28014 into main Feb 28, 2026
4 of 6 checks passed

recurseml bot left a comment


Review by RecurseML

🔍 Review performed on 4105bd0..cc0d8ad

✨ No bugs found, your code is sparkling clean

✅ Files analyzed, no issues (7)

surfsense_backend/app/agents/new_chat/llm_config.py
surfsense_backend/app/agents/new_chat/sandbox.py
surfsense_backend/app/agents/new_chat/tools/mcp_tool.py
surfsense_backend/app/app.py
surfsense_backend/app/services/llm_router_service.py
surfsense_backend/app/services/llm_service.py
surfsense_backend/app/tasks/chat/stream_new_chat.py

