A step-by-step guide to install, configure, and use codebase-memory-mcp — a code knowledge graph MCP server that indexes your codebase into a queryable graph of functions, classes, modules, and their relationships.
codebase-memory-mcp parses your source code and builds a graph database containing:
- Nodes: Functions, Classes, Modules, Methods, Interfaces, Routes, Files, Packages
- Edges: CALLS, HTTP_CALLS, ASYNC_CALLS, IMPORTS, DEFINES, IMPLEMENTS, OVERRIDE, USAGE, FILE_CHANGES_WITH
This lets Claude navigate code by relationships (who calls what, what implements what, blast radius of changes) instead of reading entire files. Key capabilities:
- Architecture overview — language breakdown, hotspots, entry points, routes, cross-service boundaries
- Code search — find functions/classes by name pattern, filter by degree (fan-in/fan-out), dead code detection
- Call tracing — trace call paths inbound/outbound with hop-by-hop detail
- Code snippets — fetch individual function/class source with metadata (complexity, callers, callees)
- Change detection — map git diffs to affected graph symbols and blast radius
- Architecture Decision Records — persistent, section-based architectural summaries
- Cypher queries — arbitrary graph queries for complex relationship patterns
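To make the node-and-edge model concrete, here is a minimal Python sketch of the kind of question the graph answers. It is not the server's storage format: the triples, the function names, and the `entry_points` parameter are all hypothetical, chosen only to illustrate fan-in lookups and dead code detection over `CALLS` edges.

```python
from collections import defaultdict

# Hypothetical miniature graph: (source, edge_type, target) triples.
EDGES = [
    ("main", "CALLS", "process_order"),
    ("process_order", "CALLS", "validate"),
    ("process_order", "CALLS", "save"),
    ("OrderService", "DEFINES", "process_order"),
    ("legacy_export", "CALLS", "save"),
]


def callers_of(symbol, edges=EDGES):
    """Return the names that have a CALLS edge pointing at `symbol` (its fan-in)."""
    return sorted(src for src, kind, dst in edges if kind == "CALLS" and dst == symbol)


def dead_code_candidates(edges=EDGES, entry_points=("main",)):
    """Return functions in the call graph that nothing calls, excluding entry points."""
    fan_in = defaultdict(int)
    functions = set()
    for src, kind, dst in edges:
        if kind == "CALLS":
            functions.update((src, dst))
            fan_in[dst] += 1
    return sorted(f for f in functions if fan_in[f] == 0 and f not in entry_points)


print(callers_of("save"))      # ['legacy_export', 'process_order']
print(dead_code_candidates())  # ['legacy_export']
```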
codebase-memory-mcp is a Node.js package. Install it globally with npm:
npm install -g codebase-memory-mcp

Verify it works:
codebase-memory-mcp --help

Add to .mcp.json in your project root:
{
"mcpServers": {
"codebase-memory-mcp": {
"command": "codebase-memory-mcp",
"args": [],
"type": "stdio"
}
}
}

This makes the tools available whenever Claude Code opens this project.
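If the project already has a .mcp.json with other servers in it, pasting the snippet over the file would drop them. The following Python sketch merges the entry instead. The helper names `merge_server` and `update_mcp_json` are hypothetical; the entry itself is copied from the configuration above.

```python
import json
from pathlib import Path

# Server entry copied from the .mcp.json snippet in this guide.
SERVER_ENTRY = {"command": "codebase-memory-mcp", "args": [], "type": "stdio"}


def merge_server(config: dict, name: str = "codebase-memory-mcp") -> dict:
    """Return a copy of `config` with the server added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = dict(SERVER_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_mcp_json(project_root: str) -> Path:
    """Read .mcp.json if it exists, merge the entry, and write it back."""
    path = Path(project_root) / ".mcp.json"
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(merge_server(existing), indent=2) + "\n")
    return path
```

Calling `update_mcp_json(".")` from the project root leaves any other configured servers in place.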
Alternatively, register the server with the Claude Code CLI:

claude mcp add codebase-memory-mcp -- codebase-memory-mcp

Or add it manually to ~/.claude/settings.json under mcpServers.
Claude Code needs permission to use each MCP tool. Add these to your project's .claude/settings.local.json under allowedTools:
mcp__codebase-memory-mcp__index_repository
mcp__codebase-memory-mcp__index_status
mcp__codebase-memory-mcp__list_projects
mcp__codebase-memory-mcp__get_architecture
mcp__codebase-memory-mcp__get_graph_schema
mcp__codebase-memory-mcp__search_graph
mcp__codebase-memory-mcp__search_code
mcp__codebase-memory-mcp__query_graph
mcp__codebase-memory-mcp__get_code_snippet
mcp__codebase-memory-mcp__trace_call_path
mcp__codebase-memory-mcp__detect_changes
mcp__codebase-memory-mcp__manage_adr
mcp__codebase-memory-mcp__ingest_traces
mcp__codebase-memory-mcp__delete_project
Without these, Claude will ask for permission on every single tool call.
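The entries above all follow the pattern `mcp__<server>__<tool>`, so the list can be generated rather than typed by hand. This is a small convenience sketch; `permission_names` is a hypothetical helper, and the tool names are the fourteen listed above.

```python
import json

SERVER = "codebase-memory-mcp"
TOOLS = [
    "index_repository", "index_status", "list_projects", "get_architecture",
    "get_graph_schema", "search_graph", "search_code", "query_graph",
    "get_code_snippet", "trace_call_path", "detect_changes", "manage_adr",
    "ingest_traces", "delete_project",
]


def permission_names(server: str = SERVER, tools=TOOLS) -> list:
    """Build the fully qualified MCP tool names for the permission list."""
    return [f"mcp__{server}__{tool}" for tool in tools]


# Print a JSON array that can be pasted into the settings file.
print(json.dumps(permission_names(), indent=2))
```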
Add a SessionStart hook to ~/.claude/settings.json so the index is always fresh:
{
"hooks": {
"SessionStart": [
{
"hooks": [
{
"type": "prompt",
"prompt": "If codebase-memory-mcp tools are available (mcp__codebase-memory-mcp__*), run mcp__codebase-memory-mcp__index_repository to ensure the code graph is current. Incremental indexing skips unchanged files, so this is fast when already indexed. If the server is not available, skip silently."
}
]
}
]
}
}

How it works: Prompt-type hooks inject instructions into Claude's context at session start. The "If available" phrasing means it's a no-op in projects that don't have the MCP server. Incremental indexing via content hashing means only changed files are re-parsed.
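The idea behind incremental indexing via content hashing can be sketched as follows. This is an illustration of the technique, not the server's implementation: the choice of SHA-256, the suffix filter, and the names `file_digest`, `changed_files`, and `demo` are all assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash the file contents; identical content always yields the same digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, previous: dict, suffixes=(".py", ".go", ".ts")):
    """Return (files that need re-parsing, updated digest map)."""
    current = {}
    stale = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            key = str(path.relative_to(root))
            current[key] = file_digest(path)
            if previous.get(key) != current[key]:
                stale.append(key)
    return stale, current


def demo():
    """Index twice, edit one file, index again."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.py").write_text("print('a')\n")
        (root / "b.py").write_text("print('b')\n")
        first, digests = changed_files(root, {})
        second, digests = changed_files(root, digests)
        (root / "b.py").write_text("print('b2')\n")
        third, _ = changed_files(root, digests)
        return first, second, third


print(demo())  # (['a.py', 'b.py'], [], ['b.py'])
```

The second pass re-parses nothing, which is why running the index at every session start stays cheap.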
The hooks and MCP config make the tools available. The CLAUDE.md rules tell Claude when to prefer them. Add this to your project or global ~/.claude/CLAUDE.md:
## Code Knowledge Graph — codebase-memory-mcp (when available)
When codebase-memory-mcp tools (`mcp__codebase-memory-mcp__*`) are available, use them as the
**primary tool for code navigation and understanding**.
### Rules
- **Orientation first**: Use `get_architecture` when exploring an unfamiliar codebase or area —
it provides language breakdown, hotspots, entry points, routes, and cross-service boundaries
- **Search by name**: Use `search_graph` instead of `Grep` when looking for function/class
definitions — it returns connectivity (callers/callees) and supports regex patterns
- **Fetch specific code**: Use `get_code_snippet` to retrieve individual functions/classes with
metadata — avoids reading entire files
- **Trace relationships**: Use `trace_call_path` to understand who calls a function and what it
calls — essential before refactoring
- **Blast radius**: Use `detect_changes` before committing to see which symbols are affected by
your git changes and their risk classification
- **Text search**: Use `search_code` for string literals, error messages, TODO comments, and
config values that aren't in the graph as named symbols
- **Complex queries**: Use `query_graph` with Cypher for relationship patterns, edge property
filtering, and cross-service HTTP/async links
- **Keep index fresh**: Run `index_repository` at session start and after large batch edits.
The server auto-syncs after initial indexing
- **ADR**: Use `manage_adr` to maintain Architecture Decision Records — fetch before planning
to validate against ARCHITECTURE, PATTERNS, STACK, and PHILOSOPHY sections
### When Read is correct
- Non-code files (JSON, YAML, config, HTML templates)
- Full file context needed (imports, globals, module-level flow)
- Very small files (<50 lines)
- Files not yet indexed (newly created before next `index_repository`)
- Editing many functions in the same file (batch edit; a full Read is cheaper)

Why this matters: Without these rules, Claude defaults to `Read` for everything. The rules make the knowledge graph the default for code navigation, with `Read` as the exception.
Rules in CLAUDE.md are instructions — Claude should follow them, but sometimes doesn't. Hooks provide runtime enforcement.
This hook fires every time Claude tries to use Read on a source code file, injecting a reminder to use the graph tools instead.
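The decision the hook makes is small enough to sketch in Python first. The sketch assumes the hook receives a JSON payload on stdin with the path at `tool_input.file_path`, which is the shape the re-index script later in this guide reads; `nudge_for` is a hypothetical name.

```python
import json
import os

SOURCE_SUFFIXES = (
    ".py", ".ts", ".tsx", ".js", ".jsx", ".go",
    ".rs", ".java", ".rb", ".pl", ".pm", ".cgi",
)


def nudge_for(payload_json: str):
    """Return a reminder for source files, or None to stay silent."""
    try:
        payload = json.loads(payload_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    tool_input = payload.get("tool_input") or {}
    file_path = tool_input.get("file_path", "") if isinstance(tool_input, dict) else ""
    if not isinstance(file_path, str) or not file_path.endswith(SOURCE_SUFFIXES):
        return None
    name = os.path.basename(file_path)
    return (
        f"codebase-memory reminder: consider get_code_snippet or search_graph "
        f"for '{name}' instead of Read."
    )


print(nudge_for('{"tool_input": {"file_path": "/src/app/orders.py"}}'))
print(nudge_for('{"tool_input": {"file_path": "config.yaml"}}'))  # None
```

Returning nothing for non-source files keeps the hook silent for JSON, YAML, and other files where Read is the right tool.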
Create ~/.claude/hooks/codebase-memory-nudge.sh:
#!/bin/bash
# PreToolUse hook: nudge toward codebase-memory-mcp when Read is used on code files
INPUT=$(cat)
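# The path is nested under tool_input in the hook payload (same field the re-index hook reads)
FILE_PATH=$(echo "$INPUT" | python3 -c "import sys,json; d=json.load(sys.stdin); print(d.get('tool_input',{}).get('file_path',''))" 2>/dev/null)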
# Only nudge for common source code files
case "$FILE_PATH" in
*.py|*.ts|*.tsx|*.js|*.jsx|*.go|*.rs|*.java|*.rb|*.pl|*.pm|*.cgi)
BASENAME=$(basename "$FILE_PATH")
echo "codebase-memory reminder: Consider using get_code_snippet or search_graph for '$BASENAME' instead of Read. Use Read only if you need full file context."
;;
esac

chmod +x ~/.claude/hooks/codebase-memory-nudge.sh

Register in ~/.claude/settings.json:
{
"hooks": {
"PreToolUse": [
{
"matcher": "Read",
"hooks": [
{
"type": "command",
"command": "bash \"$HOME/.claude/hooks/codebase-memory-nudge.sh\""
}
]
}
]
}
}

Create ~/.claude/hooks/reindex-after-edit.sh:
#!/bin/bash
# PostToolUse:Write|Edit — remind Claude to re-index after code changes
INPUT=$(cat)
FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // .tool_input.path // empty')
[ -z "$FILE" ] && exit 0
# Only trigger for source code file types
case "$FILE" in
*.py|*.ts|*.tsx|*.js|*.jsx|*.go|*.rs|*.java|*.rb|*.pl|*.pm|*.cgi) ;;
*) exit 0 ;;
esac
# Debounce: skip if we re-indexed within the last 60 seconds
STAMP="/tmp/cbm-reindex-stamp-$(id -u)"
if [ -f "$STAMP" ]; then
# Try GNU stat first: on Linux, `stat -f` prints filesystem info to stdout and would corrupt LAST
LAST=$(stat -c %Y "$STAMP" 2>/dev/null || stat -f %m "$STAMP" 2>/dev/null || echo 0)
NOW=$(date +%s)
[ $((NOW - LAST)) -lt 60 ] && exit 0
fi
touch "$STAMP"
echo "Source file modified. Consider running index_repository to keep the code graph fresh."

chmod +x ~/.claude/hooks/reindex-after-edit.sh

Register in ~/.claude/settings.json:
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "bash \"$HOME/.claude/hooks/reindex-after-edit.sh\""
}
]
}
]
}
}

| Tool | Purpose |
|---|---|
| `index_repository` | Parse source files and build/refresh the code graph. Supports mode='fast' for large repos (>50K files). Incremental via content hashing. |
| `index_status` | Check if project is indexed, currently indexing, or not found. Shows node/edge counts. |
| `list_projects` | List all indexed projects with timestamps and counts. |
| `delete_project` | Remove a project's graph data. Irreversible. |

| Tool | Purpose |
|---|---|
| `get_architecture` | Structural overview: languages, packages, entry points, routes, hotspots, boundaries, clusters, layers, file tree, ADR. Call first on unfamiliar codebases. |
| `search_graph` | Find functions/classes/modules by name pattern. Filter by label, degree, relationship type. Case-insensitive regex. Paginated (10/page). |
| `search_code` | Grep-like text search scoped to indexed project. For string literals, TODOs, config values. Paginated. |
| `get_code_snippet` | Fetch source code for a specific function/class by name. Returns signature, complexity, decorators, docstring, caller/callee counts. |
| `trace_call_path` | BFS traversal of call graph. Who calls it (inbound), what it calls (outbound), or both. Hop-by-hop with edge types. |
| `query_graph` | Cypher queries for complex patterns. Edge property filtering, cross-service links, change coupling. 200-row cap. |
| `get_graph_schema` | Node labels, edge types, relationship patterns, sample names. Understand graph structure before querying. |

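The breadth-first, hop-by-hop traversal described for `trace_call_path` can be illustrated with a short Python sketch. The call graph, the `max_depth` default, and the function names `trace` and `invert` are hypothetical; the real tool's parameters and output format may differ.

```python
from collections import deque

# Hypothetical outbound call graph: caller -> callees.
CALLS = {
    "handle_request": ["process_order"],
    "process_order": ["validate", "save"],
    "save": ["write_row"],
}


def trace(start, graph, max_depth=3):
    """Breadth-first walk that records each edge as (hop, from_node, to_node)."""
    seen = {start}
    queue = deque([(start, 0)])
    hops = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for nxt in graph.get(node, []):
            hops.append((depth + 1, node, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return hops


def invert(graph):
    """Flip the graph so the same walk answers 'who calls this?'."""
    inbound = {}
    for caller, callees in graph.items():
        for callee in callees:
            inbound.setdefault(callee, []).append(caller)
    return inbound


print(trace("process_order", CALLS))   # outbound: what it calls
print(trace("save", invert(CALLS)))    # inbound: who calls it
```

Running the same walk over the inverted graph is one simple way to get the inbound direction without a second algorithm.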
| Tool | Purpose |
|---|---|
| `detect_changes` | Map git diffs to affected graph symbols + blast radius. Risk classification: CRITICAL (hop 1) → LOW (hop 4+). |
| `manage_adr` | CRUD for Architecture Decision Records. 6 fixed sections: PURPOSE, STACK, ARCHITECTURE, PATTERNS, TRADEOFFS, PHILOSOPHY. |
| `ingest_traces` | Validate HTTP_CALLS edges with OpenTelemetry traces. Boosts confidence scores on matched edges. |

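The hop-based risk classification from `detect_changes` can be pictured as a simple mapping from hop distance to a label. Only the endpoints (CRITICAL at hop 1, LOW at hop 4 and beyond) come from this guide; placing HIGH at hop 2 and MEDIUM at hop 3 is an assumption made for this sketch, as are the function names.

```python
# Hop 1 and hop 4+ follow the guide; HIGH at 2 and MEDIUM at 3 are assumed.
RISK_BY_HOP = {1: "CRITICAL", 2: "HIGH", 3: "MEDIUM"}


def risk_for_hop(hop: int) -> str:
    """Map a hop distance from a changed symbol to a risk label."""
    if hop < 1:
        raise ValueError("hop distance starts at 1")
    return RISK_BY_HOP.get(hop, "LOW")


def group_by_risk(hops_by_symbol: dict) -> dict:
    """Group affected symbols under their risk label, sorted by name."""
    grouped = {}
    for symbol, hop in sorted(hops_by_symbol.items()):
        grouped.setdefault(risk_for_hop(hop), []).append(symbol)
    return grouped


# Hypothetical blast radius for a change to one function.
affected = {"process_order": 1, "handle_request": 2, "cli_main": 5}
print(group_by_risk(affected))
```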
index_repository → get_architecture(aspects=['all']) → search_graph for key areas
search_graph(name_pattern='.*Order.*') → trace_call_path('processOrder') → get_code_snippet('myapp.services.order.processOrder')
detect_changes(scope='staged', depth=3) → review CRITICAL/HIGH risk symbols
search_graph(relationship='CALLS', direction='inbound', max_degree=0, exclude_entry_points=true)
query_graph("MATCH (a)-[r:HTTP_CALLS]->(b) RETURN a.name, b.name, r.url_path, r.confidence_band LIMIT 20")
Here's the full picture of all hooks, where they live, and what they do:
| Event | Matcher | Script | Type | Effect |
|---|---|---|---|---|
| SessionStart | (none) | (prompt) | prompt | Checks index status and runs index_repository if needed |
| PreToolUse | Read | codebase-memory-nudge.sh | command | Non-blocking reminder for source code files |
| PostToolUse | Write\|Edit | reindex-after-edit.sh | command | Prompts re-index after source file changes (debounced 60s) |

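The 60-second debounce used by the re-index hook relies on the modification time of a stamp file. The same technique in Python looks roughly like this; `should_notify` and `demo` are hypothetical names, and the `now` parameter exists only so the behaviour can be checked without waiting.

```python
import os
import tempfile
import time


def should_notify(stamp_path: str, window_seconds: int = 60, now=None) -> bool:
    """Return True and refresh the stamp unless it was touched within the window."""
    now = time.time() if now is None else now
    try:
        last = os.path.getmtime(stamp_path)
    except OSError:
        last = 0.0  # no stamp yet: treat as never notified
    if now - last < window_seconds:
        return False
    with open(stamp_path, "a"):
        pass  # create the stamp if it is missing
    os.utime(stamp_path, (now, now))
    return True


def demo():
    """Three edits at t=1000s, t=1030s, and t=1061s."""
    with tempfile.TemporaryDirectory() as tmp:
        stamp = os.path.join(tmp, "cbm-reindex-stamp")
        return [should_notify(stamp, now=t) for t in (1000.0, 1030.0, 1061.0)]


print(demo())  # [True, False, True]
```

The second edit falls inside the window and is suppressed, so a burst of edits produces a single reminder.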
The project-level hooks provide stronger enforcement and agent-aware initialization.
Copy them from hooks/project/ using setup.sh or manually.
| Event | Matcher | Script | Type | Effect |
|---|---|---|---|---|
| SessionStart | (none) | cmm-session-start.sh | command | Resets CMM sentinel; injects rich init prompt for spawned agents, minimal prompt for human sessions |
| PreToolUse | * | cmm-session-gate.sh | command | Blocks all tools until CMM sentinel exists; allows indexing tools, ToolSearch, and SendMessage through |
| PreToolUse | Agent | agent-cmm-gate.sh | command | Blocks Agent tool calls that don't reference CMM keywords; exempts VBW agents with a context note |

When Claude spawns a sub-agent, the sub-agent starts a new session. Here is what happens:
- `cmm-session-start.sh` fires (SessionStart): it detects `$CLAUDE_AGENT_ID` or `$CLAUDE_PARENT_SESSION_ID`, deletes the stale sentinel, and injects the rich agent prompt.
- The agent's first tool call is blocked by `cmm-session-gate.sh` unless it is one of the allow-listed tools: `index_repository`, `index_status`, `delete_project`, `ToolSearch`, `SendMessage`.
- The agent runs `index_status` (or `index_repository`), and `cmm-sentinel-writer.sh` writes the sentinel on success.
- `cmm-session-gate.sh` passes all subsequent tool calls through.
- The agent reads `.vbw-planning/STATE.md` to find its active phase/plan and proceeds with its task using CMM tools.

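The gating sequence above can be modelled in a few lines of Python. This is a behavioural sketch only, not the contents of `cmm-session-gate.sh`: the function names are hypothetical, and stripping the `mcp__<server>__` prefix before checking the allow-list is an assumption about how tool names are matched.

```python
ALLOW_LISTED = {"index_repository", "index_status", "delete_project", "ToolSearch", "SendMessage"}
SENTINEL_WRITERS = {"index_repository", "index_status"}


def short_name(tool_name: str) -> str:
    """Strip an 'mcp__<server>__' prefix if present."""
    return tool_name.rsplit("__", 1)[-1]


def gate(tool_name: str, sentinel_exists: bool) -> str:
    """Allow everything once the sentinel exists; before that, only the allow-list."""
    if sentinel_exists or short_name(tool_name) in ALLOW_LISTED:
        return "allow"
    return "block"


def run_session(calls):
    """Replay a sequence of tool calls against the gate."""
    sentinel = False
    decisions = []
    for tool in calls:
        decision = gate(tool, sentinel)
        decisions.append((tool, decision))
        if decision == "allow" and short_name(tool) in SENTINEL_WRITERS:
            sentinel = True  # stands in for cmm-sentinel-writer.sh
    return decisions


print(run_session(["Read", "mcp__codebase-memory-mcp__index_status", "Read"]))
```

The first `Read` is blocked, the status check passes and sets the sentinel, and the second `Read` goes through.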
If the CMM server is unavailable, the agent (or user) can create the sentinel manually:
touch "/tmp/cmm-session-ready-$(echo "$PROJECT_ROOT" | md5 -q 2>/dev/null || echo "$PROJECT_ROOT" | md5sum | cut -d' ' -f1)"

Session starts
→ SessionStart prompt checks index_status
→ Runs index_repository if needed (incremental — only changed files)
Claude needs a function
→ Tries Read on .py file
→ codebase-memory-nudge.sh fires: "Use get_code_snippet or search_graph instead"
→ Claude uses search_graph → get_code_snippet instead
→ Gets source code + metadata without reading the entire file
Claude needs to understand impact
→ detect_changes maps git diff to graph symbols
→ Returns blast radius with risk classification per hop
Claude edits a file
→ reindex-after-edit.sh fires (debounced 60s)
→ Prompts Claude to re-run index_repository
"codebase-memory-mcp: command not found"
- Ensure the package is installed globally: `npm install -g codebase-memory-mcp`
- Verify `$(npm prefix -g)/bin` is in your PATH

Index status shows "not found"
- Run `index_repository` with the repo path: `index_repository(repo_path='/path/to/project')`

search_graph returns no results
- Check `index_status` to confirm indexing completed
- Use `get_graph_schema` to see what node labels and edge types exist
- Try broader regex patterns with alternatives: `'handler|hdlr|ctrl'`

query_graph undercounts with COUNT
- The 200-row cap applies BEFORE aggregation. Use `search_graph` with `min_degree`/`max_degree` for accurate counting.

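A toy Python model shows why a cap applied before aggregation undercounts. Only the 200-row figure comes from this guide; the 750 matching nodes and the helper name `capped_count` are made up for illustration.

```python
ROW_CAP = 200  # cap stated in this guide


def capped_count(matches, cap=ROW_CAP):
    """Count rows after truncation, which is what cap-before-aggregation yields."""
    return len(matches[:cap])


# Hypothetical: 750 nodes match the pattern.
matches = [f"handler_{i}" for i in range(750)]
print(capped_count(matches), len(matches))  # 200 750
```

Any count that comes back as exactly 200 is therefore worth double-checking.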
detect_changes shows no affected symbols
- Ensure git is in PATH and the project has been indexed
- Check that changed files contain supported source code (not just config/docs)