Quick Start · Key Features · Web UI · How it Works · FAQ
🔍 Agentic Search • 🧠 Knowledge Clustering • 📊 Monte Carlo Evidence Sampling
⚡ Indexless Retrieval • 🔄 Self-Evolving Knowledge Base • 💬 Real-time Chat
Intelligence pipelines built upon vector-based retrieval can be rigid and brittle. They rely on static vector embeddings that are expensive to compute, blind to real-time changes, and detached from the raw context. We introduce Sirchmunk to usher in a more agile paradigm, where data is no longer treated as a snapshot, and insights can evolve together with the data.
Sirchmunk works directly with raw data -- bypassing the heavy overhead of squeezing your rich files into fixed-dimensional vectors.
- Instant Search: No complex pre-processing pipelines or hours-long indexing; just drop your files and search immediately.
- Full Fidelity: Zero information loss; stay true to your data without vector approximation.
Data is a stream, not a snapshot. Sirchmunk is dynamic by design, whereas a vector DB can become obsolete the moment your data changes.
- Context-Aware: Evolves in real-time with your data context.
- LLM-Powered Autonomy: Designed for Agents that perceive data as it lives, utilizing token-efficient reasoning that triggers LLM inference only when necessary to maximize intelligence while minimizing cost.
Sirchmunk bridges massive local repositories and the web with high-scale throughput and real-time awareness.
It serves as a unified intelligent hub for AI agents, delivering deep insights across vast datasets at the speed of thought.
| Dimension | Traditional RAG | ✨ Sirchmunk |
|---|---|---|
| 💰 Setup Cost | High overhead (VectorDB, GraphDB, complex document parsers...) | ✅ Zero infrastructure: direct-to-data retrieval without vector silos |
| 🕒 Data Freshness | Stale (batch re-indexing) | ✅ Instant & dynamic: self-evolving index that reflects live changes |
| 📈 Scalability | Linear cost growth | ✅ Extremely low RAM/CPU consumption: native elastic support efficiently handles large-scale datasets |
| 🎯 Accuracy | Approximate vector matches | ✅ Deterministic & contextual: hybrid logic ensuring semantic precision |
| ⚙️ Workflow | Complex ETL pipelines | ✅ Drop-and-search: zero-config integration for rapid deployment |
- 🚀 Feb 5, 2026: Release v0.0.2 — MCP Support, CLI Commands & Knowledge Persistence!
  - MCP Integration: Full Model Context Protocol support; works seamlessly with Claude Desktop and Cursor IDE.
  - CLI Commands: New `sirchmunk` CLI with `init`, `serve`, `search`, `web`, and `mcp` commands.
  - KnowledgeCluster Persistence: DuckDB-powered storage with Parquet export for efficient knowledge management.
  - Knowledge Reuse: Semantic similarity-based cluster retrieval for faster searches via embedding vectors.
- 🎉🎉 Jan 22, 2026: Introducing Sirchmunk: Initial Release v0.0.1 Now Available!
- Python 3.10+
- LLM API Key (OpenAI-compatible endpoint, local or remote)
- Node.js 18+ (Optional, for web interface)
# Create virtual environment (recommended)
conda create -n sirchmunk python=3.13 -y && conda activate sirchmunk
pip install sirchmunk
# Or via UV:
uv pip install sirchmunk
# Alternatively, install from source:
git clone https://github.com/modelscope/sirchmunk.git && cd sirchmunk
pip install -e .

import asyncio

from sirchmunk import AgenticSearch
from sirchmunk.llm import OpenAIChat

llm = OpenAIChat(
    api_key="your-api-key",
    base_url="your-base-url",  # e.g., https://api.openai.com/v1
    model="your-model-name"    # e.g., gpt-4o
)

async def main():
    searcher = AgenticSearch(llm=llm)
    result: str = await searcher.search(
        query="How does transformer attention work?",
        paths=["/path/to/documents"],
    )
    print(result)

asyncio.run(main())

- Upon initialization, `AgenticSearch` automatically checks whether `ripgrep-all` and `ripgrep` are installed. If they are missing, it will attempt to install them automatically. If the automatic installation fails, please install them manually.
- Replace `"your-api-key"`, `"your-base-url"`, `"your-model-name"`, and `/path/to/documents` with your actual values.
Sirchmunk provides a powerful CLI for server management and search operations.
pip install "sirchmunk[web]"
# or install via UV
uv pip install "sirchmunk[web]"

# Initialize Sirchmunk with default settings (Default work path: `~/.sirchmunk/`)
sirchmunk init
# Alternatively, initialize with custom work path
sirchmunk init --work-path /path/to/workspace

# Start backend API server only
sirchmunk serve
# Custom host and port
sirchmunk serve --host 0.0.0.0 --port 8000

# Search in current directory
sirchmunk search "How does authentication work?"
# Search in specific paths
sirchmunk search "find all API endpoints" ./src ./docs
# Quick filename search
sirchmunk search "config" --mode FILENAME_ONLY
# Output as JSON
sirchmunk search "database schema" --output json
# Use API server (requires running server)
sirchmunk search "query" --api --api-url http://localhost:8584

| Command | Description |
|---|---|
| `sirchmunk init` | Initialize working directory, `.env`, and MCP config |
| `sirchmunk serve` | Start the backend API server |
| `sirchmunk search` | Perform search queries |
| `sirchmunk web init` | Build WebUI frontend (requires Node.js 18+) |
| `sirchmunk web serve` | Start API + WebUI (single port) |
| `sirchmunk web serve --dev` | Start API + Next.js dev server (hot-reload) |
| `sirchmunk mcp serve` | Start the MCP server (stdio/HTTP) |
| `sirchmunk mcp version` | Show MCP version information |
| `sirchmunk version` | Show version information |
Sirchmunk provides a Model Context Protocol (MCP) server that exposes its intelligent search capabilities as MCP tools. This enables seamless integration with AI assistants like Claude Desktop and Cursor IDE.
# Install with MCP support
pip install sirchmunk[mcp]
# Initialize (generates .env and mcp_config.json)
sirchmunk init
# Edit ~/.sirchmunk/.env with your LLM API key
# Test with MCP Inspector
npx @modelcontextprotocol/inspector sirchmunk mcp serve

After running `sirchmunk init`, a `~/.sirchmunk/mcp_config.json` file is generated. Copy it to your MCP client configuration directory.
Example:
{
"mcpServers": {
"sirchmunk": {
"command": "sirchmunk",
"args": ["mcp", "serve"],
"env": {
"SIRCHMUNK_SEARCH_PATHS": "/path/to/your_docs,/another/path"
}
}
}
}

| Parameter | Description |
|---|---|
| `command` | The command to start the MCP server. Use the full path (e.g. `/path/to/venv/bin/sirchmunk`) if running in a virtual environment. |
| `args` | Command arguments. `["mcp", "serve"]` starts the MCP server in stdio mode. |
| `env.SIRCHMUNK_SEARCH_PATHS` | Default document search directories (comma-separated). Supports both the English `,` and Chinese `，` as delimiters. When set, these paths are used as defaults if no `paths` parameter is provided during tool invocation. |
Tip: MCP Inspector is a great way to test the integration before connecting to your AI assistant. In MCP Inspector: Connect → Tools → List Tools → `sirchmunk_search` → Input parameters (`query` and `paths`, e.g. `["/path/to/your_docs"]`) → Run Tool.
- Multi-Mode Search: DEEP mode for comprehensive analysis, FILENAME_ONLY for fast file discovery
- Knowledge Cluster Management: Automatic extraction, storage, and reuse of knowledge
- Standard MCP Protocol: Works with stdio and Streamable HTTP transports
📖 For detailed documentation, see Sirchmunk MCP README.
The web UI is built for fast, transparent workflows: chat, knowledge analytics, and system monitoring in one place.
Build the frontend once, then serve everything from a single port — no Node.js needed at runtime.
# Build WebUI frontend (requires Node.js 18+ at build time)
sirchmunk web init
# Start server with embedded WebUI
sirchmunk web serve

Access: http://localhost:8584 (API + WebUI on the same port)
For frontend development with hot-reload:
# Start backend + Next.js dev server
sirchmunk web serve --dev

Access:
- Frontend (hot-reload): http://localhost:8585
- Backend APIs: http://localhost:8584/docs
# Start frontend and backend via script
python scripts/start_web.py
# Stop all services
python scripts/stop_web.py

Configuration:
- Access `Settings` → `Environment Variables` to configure the LLM API and other parameters.
| Component | Description |
|---|---|
| AgenticSearch | Search orchestrator with LLM-enhanced retrieval capabilities |
| KnowledgeBase | Transforms raw results into structured knowledge clusters with supporting evidence |
| EvidenceProcessor | Evidence processing based on Monte Carlo importance sampling (see the sketch below) |
| GrepRetriever | High-performance indexless file search with parallel processing |
| OpenAIChat | Unified LLM interface supporting streaming and usage tracking |
| MonitorTracker | Real-time system and application metrics collection |
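The EvidenceProcessor row above refers to Monte Carlo importance sampling over evidence. As a rough, self-contained illustration of that general idea (not Sirchmunk's actual implementation; the function names and scorers below are hypothetical), one can sample candidate snippets from a cheap proposal distribution and re-weight each draw by a more expensive relevance score:

```python
import random

def select_evidence(candidates, proposal_score, relevance_score, n_draws=200, top_k=3):
    """Pick top_k snippets by accumulated importance weight (illustrative only)."""
    # Cheap proposal distribution over all candidates (e.g. keyword overlap).
    props = [max(proposal_score(c), 1e-9) for c in candidates]
    total = sum(props)
    q = [p / total for p in props]
    weights = {c: 0.0 for c in candidates}
    cache = {}  # avoid re-running the expensive scorer on the same snippet
    for _ in range(n_draws):
        i = random.choices(range(len(candidates)), weights=q, k=1)[0]
        c = candidates[i]
        if c not in cache:
            cache[c] = relevance_score(c)
        # Importance weight: target relevance divided by proposal probability.
        weights[c] += cache[c] / q[i]
    return sorted(weights, key=weights.get, reverse=True)[:top_k]

# Toy usage with stand-in scorers.
snippets = [
    "attention weights are softmax(QK^T / sqrt(d_k))",
    "multi-head attention concatenates per-head outputs",
    "this section covers installation prerequisites",
]
query_terms = {"attention", "softmax", "heads"}

def cheap(s):
    return len(query_terms & set(s.split())) + 0.1   # stand-in for keyword overlap

def expensive(s):
    return 1.0 if "attention" in s else 0.05         # stand-in for an LLM judgment

print(select_evidence(snippets, cheap, expensive, top_k=2))
```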
All persistent data is stored in the configured SIRCHMUNK_WORK_PATH (default: ~/.sirchmunk/):
{SIRCHMUNK_WORK_PATH}/
├── .cache/
├── history/ # Chat session history (DuckDB)
│ └── chat_history.db
├── knowledge/ # Knowledge clusters (Parquet)
│ └── knowledge_clusters.parquet
└── settings/ # User settings (DuckDB)
└── settings.db
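If you want to peek inside these files, a minimal sketch using DuckDB (assuming `pip install duckdb`; the table schemas are not documented here, so this only lists what exists):

```python
import os
import duckdb

work_path = os.path.expanduser("~/.sirchmunk")  # or your configured SIRCHMUNK_WORK_PATH
for db in ("history/chat_history.db", "settings/settings.db"):
    con = duckdb.connect(os.path.join(work_path, db), read_only=True)
    print(db, "->", con.execute("SHOW TABLES").fetchall())
    con.close()
```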
When the server is running (sirchmunk serve or sirchmunk web serve), the Search API is accessible via any HTTP client.
API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/v1/search` | Execute a search query |
| `GET` | `/api/v1/search/status` | Check server and LLM configuration status |
Interactive Docs: http://localhost:8584/docs (Swagger UI)
cURL Examples
# Basic search (DEEP mode)
curl -X POST http://localhost:8584/api/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "How does authentication work?",
"paths": ["/path/to/project"],
"mode": "DEEP"
}'
# Filename search (fast, no LLM required)
curl -X POST http://localhost:8584/api/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "config",
"paths": ["/path/to/project"],
"mode": "FILENAME_ONLY"
}'
# Full parameters
curl -X POST http://localhost:8584/api/v1/search \
-H "Content-Type: application/json" \
-d '{
"query": "database connection pooling",
"paths": ["/path/to/project/src"],
"mode": "DEEP",
"max_depth": 10,
"top_k_files": 20,
"keyword_levels": 3,
"include_patterns": ["*.py", "*.java"],
"exclude_patterns": ["*test*", "*__pycache__*"],
"return_cluster": true
}'
# Check server status
curl http://localhost:8584/api/v1/search/status

Python Client Examples
Using requests:
import requests
response = requests.post(
    "http://localhost:8584/api/v1/search",
    json={
        "query": "How does authentication work?",
        "paths": ["/path/to/project"],
        "mode": "DEEP"
    },
    timeout=300  # DEEP mode may take a while
)
data = response.json()
if data["success"]:
    print(data["data"]["result"])

Using httpx (async):
import httpx
import asyncio
async def search():
    async with httpx.AsyncClient(timeout=300) as client:
        resp = await client.post(
            "http://localhost:8584/api/v1/search",
            json={
                "query": "find all API endpoints",
                "paths": ["/path/to/project"],
                "mode": "DEEP"
            }
        )
        data = resp.json()
        print(data["data"]["result"])

asyncio.run(search())

JavaScript Client Example
const response = await fetch("http://localhost:8584/api/v1/search", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    query: "How does authentication work?",
    paths: ["/path/to/project"],
    mode: "DEEP"
  })
});
const data = await response.json();
if (data.success) {
  console.log(data.data.result);
}

Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | `string` | required | Search query or question |
| `paths` | `string[]` | required | Directories or files to search (min 1) |
| `mode` | `string` | `"DEEP"` | `DEEP` or `FILENAME_ONLY` |
| `max_depth` | `int` | `null` | Maximum directory depth |
| `top_k_files` | `int` | `null` | Number of top files to return |
| `keyword_levels` | `int` | `null` | Keyword granularity levels |
| `include_patterns` | `string[]` | `null` | File glob patterns to include |
| `exclude_patterns` | `string[]` | `null` | File glob patterns to exclude |
| `return_cluster` | `bool` | `false` | Return the full KnowledgeCluster object |
Note: `FILENAME_ONLY` mode does not require an LLM API key. `DEEP` mode requires a configured LLM.
How is this different from traditional RAG systems?
Sirchmunk takes an indexless approach:
- No pre-indexing: Direct file search without vector database setup
- Self-evolving: Knowledge clusters evolve based on search patterns
- Multi-level retrieval: Adaptive keyword granularity for better recall
- Evidence-based: Monte Carlo sampling for precise content extraction
What LLM providers are supported?
Any OpenAI-compatible API endpoint, including (but not limited to):
- OpenAI (GPT-4, GPT-4o, GPT-3.5)
- Local models served via Ollama, llama.cpp, vLLM, SGLang etc.
- Claude via API proxy
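For the local-model case, a minimal sketch (assuming a local server that exposes an OpenAI-compatible endpoint, e.g. Ollama serves one at `http://localhost:11434/v1`; the model name is a placeholder for whatever you run locally):

```python
from sirchmunk.llm import OpenAIChat

# Point the same OpenAIChat interface from the Quick Start at a local server.
llm = OpenAIChat(
    api_key="not-needed-for-local",        # most local servers ignore the key
    base_url="http://localhost:11434/v1",  # e.g., Ollama's OpenAI-compatible endpoint
    model="your-local-model-name",         # placeholder: the model you have pulled locally
)
```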
How do I add documents to search?
Simply specify the path in your search query:
result = await searcher.search(
    query="Your question",
    paths=["/path/to/folder", "/path/to/file.pdf"]
)

No pre-processing or indexing required!
Where are knowledge clusters stored?
Knowledge clusters are persisted in Parquet format at:
{SIRCHMUNK_WORK_PATH}/.cache/knowledge/knowledge_clusters.parquet
You can query them using DuckDB or the KnowledgeManager API.
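For example, a minimal DuckDB query against the path given above (adjust it to your `SIRCHMUNK_WORK_PATH`; the column layout is not documented here, so this only prints the schema and row count):

```python
import os
import duckdb

# Path as stated in this FAQ entry; adjust if your layout differs.
parquet_path = os.path.expanduser(
    "~/.sirchmunk/.cache/knowledge/knowledge_clusters.parquet"
)
con = duckdb.connect()
print(con.execute(f"DESCRIBE SELECT * FROM read_parquet('{parquet_path}')").fetchall())
print(con.execute(f"SELECT count(*) FROM read_parquet('{parquet_path}')").fetchone())
```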
How do I monitor LLM token usage?
- Web Dashboard: Visit the Monitor page for real-time statistics
- API: `GET /api/v1/monitor/llm` returns usage metrics
- Code: Access `searcher.llm_usages` after search completion (see the sketch below)
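A minimal sketch of checking usage both ways (assuming the server from the CLI section is running on the default port; the exact shape of the metrics is not documented here, so it just prints them):

```python
import requests

# 1) Via the monitor endpoint of a running server
print(requests.get("http://localhost:8584/api/v1/monitor/llm", timeout=10).json())

# 2) In code, after a search completes (`searcher` from the Quick Start example)
# print(searcher.llm_usages)
```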
- Text-retrieval from raw files
- Knowledge structuring & persistence
- Real-time chat with RAG
- Web UI support
- Web search integration
- Multi-modal support (images, videos)
- Distributed search across nodes
- Knowledge visualization and deep analytics
- More file type support
We welcome contributions!
This project is licensed under the Apache License 2.0.
ModelScope · ⭐ Star us · 🐛 Report a bug · 💬 Discussions
✨ Sirchmunk: Raw data to self-evolving intelligence, real-time.



