Commit e58aff1

docs: add sizing guide for Sourcebot deployments (#923)
* docs: add sizing guide for Sourcebot deployments

  Add comprehensive sizing recommendations based on real-world customer deployments. Includes guidance on CPU, memory, and disk allocation across deployment tiers, disk usage calculation, and monitoring strategies. Also updates the overview to link to the new guide.

  Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* Update CHANGELOG for sizing guide PR
* Remove Database & Redis rows from sizing table
* Revert CHANGELOG entry for docs-only change and update CLAUDE.md

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
1 parent fbf3984 commit e58aff1

4 files changed: +59 -2 lines changed

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -199,3 +199,4 @@ PR description:
 After the PR is created:
 - Update CHANGELOG.md with an entry under `[Unreleased]` linking to the new PR. New entries should be placed at the bottom of their section.
 - If the change touches `packages/mcp`, update `packages/mcp/CHANGELOG.md` instead
+- Do NOT add a CHANGELOG entry for documentation-only changes (e.g., changes only in `docs/`)

docs/docs.json

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,8 @@
 "group": "Deployment",
 "pages": [
 "docs/deployment/docker-compose",
-"docs/deployment/k8s"
+"docs/deployment/k8s",
+"docs/deployment/sizing-guide"
 ]
 }
 ]
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+---
+title: "Sizing Guide"
+---
+
+Sourcebot runs as a single container (vertical scaling). This guide helps you choose the right CPU, memory, and disk allocation based on the number of repositories you plan to index.
+
+<Info>
+These recommendations are based on real-world deployments. Your results may vary depending on repository sizes, search patterns, and whether you use features like [multi-branch indexing](/docs/features/search/multi-branch-indexing) or [Ask Sourcebot](/docs/features/ask/overview).
+</Info>
+
+## Recommendations
+
+| | Small | Medium | Large | Extra Large |
+|---|---|---|---|---|
+| **Repos** | Up to 100 | 100 – 500 | 500 – 2,000 | 2,000+ |
+| **CPU** | 2 cores | 4 cores | 8 cores | 16+ cores |
+| **Memory** | 4 GB | 8 GB | 32 GB | 64+ GB |
+| **Disk** | 50 GB | 100 GB | 250 GB | 500+ GB |
+
+We recommend using external managed Postgres and Redis instances rather than the ones embedded in the Sourcebot container, as this adds stability to your deployment. You can configure these with the `DATABASE_URL` and `REDIS_URL` [environment variables](/docs/configuration/environment-variables).
+
+Of all resources, **memory has the most direct impact on search performance**. Sourcebot uses [Zoekt](https://github.com/sourcegraph/zoekt) for search indexing, and the OS page cache keeps frequently accessed index data in memory. More memory means more of the index stays cached, which translates directly to faster searches and less disk I/O.
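The tiers in the Recommendations table can be read as a simple lookup on repository count. The sketch below is illustrative only: `Tier`, `TIERS`, and `recommend_tier` are hypothetical names, not part of Sourcebot. Because the table's ranges share their endpoints (100, 500, 2,000), it assumes a boundary value belongs to the smaller tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_repos: float  # inclusive upper bound (assumption: boundaries go to the smaller tier)
    cpu_cores: int
    memory_gb: int
    disk_gb: int


# Values copied from the Recommendations table. The Extra Large figures are
# minimums ("16+", "64+", "500+"), not caps.
TIERS = [
    Tier("Small", 100, 2, 4, 50),
    Tier("Medium", 500, 4, 8, 100),
    Tier("Large", 2000, 8, 32, 250),
    Tier("Extra Large", float("inf"), 16, 64, 500),
]


def recommend_tier(repo_count: int) -> Tier:
    """Return the first tier whose repository bound covers repo_count."""
    if repo_count < 0:
        raise ValueError("repo_count must be non-negative")
    for tier in TIERS:
        if repo_count <= tier.max_repos:
            return tier
    raise AssertionError("unreachable: the last tier has no upper bound")
```

Treat the result as a starting point; the guide notes that repository sizes and features such as multi-branch indexing can shift the real requirements.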
+
+## Disk usage
+
+Disk is consumed by two things:
+
+1. **Cloned repositories** stored in the `.sourcebot/` cache directory
+2. **Zoekt search indexes** built from those repositories
+
+As a rule of thumb, plan for **2 – 3x the total size of the source code** you intend to index. For example, if your repositories total 50 GB, allocate at least 100 – 150 GB of disk.
+
+<Warning>
+[Multi-branch indexing](/docs/features/search/multi-branch-indexing) significantly increases disk usage since each indexed branch produces its own search index. In testing, enabling branch indexing across all branches can **triple** storage requirements. Start with a subset of branches (e.g., release branches) and monitor disk usage before expanding.
+</Warning>
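The disk rule of thumb is plain arithmetic, sketched here as a small helper. `estimate_disk_gb` is a hypothetical name. Applying a flat 3x to the whole estimate when all branches are indexed is an assumption based on the warning above, and should be read as a rough upper bound rather than a measured figure.

```python
from typing import Tuple


def estimate_disk_gb(source_gb: float, all_branches: bool = False) -> Tuple[float, float]:
    """Return a (low, high) disk estimate in GB for source_gb of source code.

    Applies the guide's 2x to 3x rule of thumb. With all_branches=True the
    range is tripled (assumption: the "can triple" warning applies to the
    whole estimate).
    """
    if source_gb < 0:
        raise ValueError("source_gb must be non-negative")
    low, high = 2.0 * source_gb, 3.0 * source_gb
    if all_branches:
        low, high = low * 3, high * 3
    return low, high
```

For the guide's own example, `estimate_disk_gb(50)` gives a range of 100 to 150 GB.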
+
+## Tuning concurrency
+
+If your instance is resource-constrained, you can reduce the concurrency of background jobs to lower CPU and memory pressure during indexing. These are configured in your [config file](/docs/configuration/config-file#settings):
+
+| Setting | Default | Description |
+|---|---|---|
+| `maxRepoIndexingJobConcurrency` | 8 | Number of repos indexed in parallel |
+| `maxConnectionSyncJobConcurrency` | 8 | Number of connections synced in parallel |
+
+Lowering these values reduces peak resource usage at the cost of slower initial indexing.
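As an illustration of lowering both values together, the sketch below scales the documented defaults and prints a config fragment. `reduced_settings` is a hypothetical helper, and the top-level `settings` key is an assumption inferred from the `#settings` anchor in the config file link; check the config file reference before copying the output.

```python
import json

# Defaults taken from the table above.
DEFAULTS = {
    "maxRepoIndexingJobConcurrency": 8,
    "maxConnectionSyncJobConcurrency": 8,
}


def reduced_settings(factor: float) -> dict:
    """Scale both defaults by factor (0 < factor <= 1), never going below 1."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    scaled = {key: max(1, int(value * factor)) for key, value in DEFAULTS.items()}
    # Assumption: these keys live under a top-level "settings" object.
    return {"settings": scaled}


if __name__ == "__main__":
    # Halve both concurrency limits and print the fragment as JSON.
    print(json.dumps(reduced_settings(0.5), indent=2))
```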
+
+## Monitoring
+
+We recommend monitoring the following metrics after deployment to validate your sizing:
+
+- **Memory utilization**: sustained usage near the limit suggests you should scale up memory. High memory usage is expected and healthy since the OS page cache will use available memory.
+- **CPU utilization**: sustained high CPU during searches (not just during indexing) indicates you may need more cores.
+- **Disk usage**: monitor disk consumption as you add repositories. Running out of disk will cause indexing failures.
+- **Search response times**: if searches are consistently slow, try increasing memory first, then CPU.
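Of the metrics above, disk usage is the simplest to check from a script. The following is a minimal sketch using only the Python standard library. `disk_status` is a hypothetical helper and the 80% warning threshold is an arbitrary assumption, not a Sourcebot recommendation.

```python
import shutil


def disk_status(used_bytes: int, total_bytes: int, warn_at: float = 0.80) -> str:
    """Return "warn" once the used fraction reaches warn_at, otherwise "ok"."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    fraction = used_bytes / total_bytes
    return "warn" if fraction >= warn_at else "ok"


if __name__ == "__main__":
    # Assumption: "." is on the volume that holds the `.sourcebot/` cache
    # directory; point this at the real mount path in your deployment.
    usage = shutil.disk_usage(".")
    print(disk_status(usage.used, usage.total))
```

A check like this could run on a schedule so that you are alerted before indexing starts failing for lack of disk.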

docs/docs/overview.mdx

Lines changed: 1 addition & 1 deletion
@@ -185,7 +185,7 @@ You can use managed Redis / Postgres services that run outside of the Sourcebot
 ## Scalability
 ---
 
-One of our design philosophies for Sourcebot is to keep our infrastructure [radically simple](https://www.radicalsimpli.city/) while balancing scalability concerns. Depending on the number of repositories you have indexed and the instance you are running Sourcebot on, you may experience slow search times or other performance degradations. Our recommendation is to vertically scale your instance by increasing the number of CPU cores and memory.
+One of our design philosophies for Sourcebot is to keep our infrastructure [radically simple](https://www.radicalsimpli.city/) while balancing scalability concerns. Depending on the number of repositories you have indexed and the instance you are running Sourcebot on, you may experience slow search times or other performance degradations. Our recommendation is to vertically scale your instance by increasing the number of CPU cores and memory. See the [sizing guide](/docs/deployment/sizing-guide) for detailed recommendations.
 
 Sourcebot does not support horizontal scaling at this time, but it is on our roadmap. If this is something your team would be interested in, please contact us at [team@sourcebot.dev](mailto:team@sourcebot.dev).