CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies
Updated Mar 13, 2026 · Rust
🦞 LLM Token Compression & Reduction Tool — Cut AI agent token costs by up to 97%. 6-layer deterministic context compression for AI agent workspaces. No LLM required. Prompt compression, context window optimization & cost reduction for any LLM pipeline.
The Context Optimization Layer for LLM Applications
Working memory for Claude Code - persistent context and multi-instance coordination
An MCP server that executes Python code in isolated rootless containers with optional MCP server proxying. Implementation of Anthropic's and Cloudflare's ideas for reducing MCP tool definitions context bloat.
Production-ready modular Claude Code framework with 30+ commands, token optimization, and MCP server integration. Achieves 2-10x productivity gains through systematic command organization and hierarchical configuration.
TOON encoding for Laravel. Encode data for AI/LLMs with ~50% fewer tokens than JSON.
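To illustrate how a tabular notation can use fewer tokens than JSON for uniform data, here is a minimal TOON-style encoder sketch in Python. This is an illustration of the layout only, not the Laravel package's API; real TOON implementations also handle nesting, quoting, and non-uniform rows.

```python
def toon_encode(key, rows):
    """Encode a uniform list of dicts in a TOON-style tabular layout.

    Field names appear once in the header instead of repeating in
    every object, which is where most of the token savings come from.
    Illustrative sketch only.
    """
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header] + body)

users = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]
print(toon_encode("users", users))
```

Compare the output with `json.dumps(users)`: the JSON form repeats `"id"` and `"name"` for every element, while the tabular form names them once.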
Config-driven CLI tool that compresses command output before it reaches an LLM context
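A compressor of this kind might, for example, collapse runs of duplicate lines and cap total output length before forwarding it to the model. The sketch below shows that idea in Python; the config keys are invented for illustration and are not this tool's actual schema.

```python
def compress_output(text, config):
    """Collapse consecutive duplicate lines and truncate long command
    output before it reaches an LLM context. Config keys illustrative."""
    max_lines = config.get("max_lines", 50)
    out, prev, repeats = [], None, 0
    for line in text.splitlines():
        if line == prev:
            repeats += 1
            continue
        if repeats:
            out.append(f"  ... ({repeats} repeated lines omitted)")
            repeats = 0
        out.append(line)
        prev = line
    if repeats:
        out.append(f"  ... ({repeats} repeated lines omitted)")
    if len(out) > max_lines:
        omitted = len(out) - max_lines
        out = out[:max_lines] + [f"... ({omitted} more lines truncated)"]
    return "\n".join(out)
```

Build logs and test runners tend to emit highly repetitive output, so even this naive pass removes a large share of tokens without losing information the model needs.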
Find the ghost tokens. Audit your Claude Code context window overhead, see where tokens go, get them back.
Automatic prompt caching for Claude Code. Cuts token costs by up to 90% on repeated file reads, bug fix sessions, and long coding conversations - zero config.
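Independent of the plugin above, the general pattern behind savings on repeated file reads (send identical content once, then refer back to it) can be sketched as a content-addressed store. All names here are hypothetical; this is not the plugin's implementation.

```python
import hashlib

class ReadCache:
    """Content-addressed read cache: repeated reads of an unchanged
    file return a short reference instead of the full text, so the
    same bytes enter the context at most once per session."""

    def __init__(self):
        self._seen = {}

    def render(self, path, content):
        digest = hashlib.sha256(content.encode()).hexdigest()[:12]
        if self._seen.get(path) == digest:
            return f"[{path} unchanged, see earlier read {digest}]"
        self._seen[path] = digest
        return content

cache = ReadCache()
big = "fn main() {}\n" * 500
first = cache.render("src/main.rs", big)   # full content on first read
second = cache.render("src/main.rs", big)  # short stub on repeat read
```

Provider-side prompt caching works differently (the prefix is cached on the server), but the client-side dedup shown here captures why repeated reads of the same file are nearly free in token terms.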
TOON — Laravel AI package for compact, human-readable, token-efficient data format with JSON ⇄ TOON conversion for ChatGPT, OpenAI, and other LLM prompts.
Multi-agent orchestration for Claude Code with 15-30% token optimization, self-improving agents, and automatic verification
OCTAVE protocol - structured AI communication with 3-20x token reduction. MCP server with lenient-to-canonical pipeline and schema validation.
The central goal is the 5-line API: a complete, production-ready endpoint with auto-generated OpenAPI documentation, compiled validation, and distributed tracing should require no more boilerplate than the handler logic itself.
Security hooks and monitoring for Claude Code — quiet overrides, SSRF protection, MCP compression, OTEL tracing
🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.
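The "fitting into your token budget" part of such libraries typically reduces to dropping the oldest turns until an estimated total fits. A minimal sketch using a rough 4-characters-per-token heuristic; both the heuristic and the function name are illustrative, not this library's API.

```python
def fit_to_budget(messages, budget_tokens):
    """Keep the most recent messages whose estimated token total fits
    the budget. Uses a crude len/4 estimate; production code would use
    the model's real tokenizer."""
    est = lambda m: max(1, len(m) // 4)
    kept, total = [], 0
    for msg in reversed(messages):       # walk newest-first
        cost = est(msg)
        if total + cost > budget_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))          # restore chronological order
```

Walking newest-first means the most recent context always survives, which matters more for chat history than for RAG documents, where relevance-based ranking is the better eviction order.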
Claude Code plugin that offloads large outputs to the filesystem and retrieves them when required.
Context Limiter & Output Vetter for context bloat. A highly specialized, structure-aware JSON compressor built specifically to intercept and compress MCP responses before they annihilate your LLM's context window.
Laravel integration for TOON format: encode/decode JSON data into a token-optimized notation for LLMs.