Commit d83451b

ijac13 and claude committed
docs: Broader introduction with problem/solution/strengths
Restructure the landing page to better communicate:

1. Problem/Value
   - Data pipelines succeed but data is quietly wrong
   - Validation knowledge stays in people's heads
   - Jr/Sr engineers apply different standards

2. Solution
   - Core: validation engine + visualization
   - Access via: Cloud, OSS, CLI, Agent, Plugin, MCP

3. What Makes Recce Different
   - Column-level impact radius: validate affected columns, not entire models
   - Agent as reviewer zero: surfaces unknown unknowns, users focus on judgment
   - Collaborate and standardize: checks → preset checks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 86b631c commit d83451b

2 files changed

Lines changed: 120 additions & 56 deletions

docs/1-whats-recce/cloud-vs-oss.md

Lines changed: 45 additions & 11 deletions
@@ -28,11 +28,13 @@ flowchart LR

 | | Cloud | Open Source |
 |--|-------|-------------|
-| **Experience** | The agent works alongside you | You run validation manually |
-| **PR validation** | Agent validates automatically, posts summary | You run checks, copy results to PR |
 | **During development** | CLI + Agent assistance | CLI tools only |
+| **PR validation** | Agent validates automatically, posts summary | You run checks, copy results to PR |
+| **Collaboration** | Preset checks, shared standards, persistent history | Local only |
+| **Experience** | The agent works alongside you | You run validation manually |
 | **Learning curve** | Agent guides you through validation | Learn the tools, run them yourself |

+
 ## Cloud

 Recce Cloud connects to your Git repository and data warehouse so the Recce Agent can validate your data changes automatically. When you open a PR, the agent analyzes your changes, runs validation checks, and posts findings directly to your PR. No manual work required.
@@ -90,20 +92,52 @@ You get:

 ## Feature Comparison

+### Validation Engine
+
+Both Cloud and OSS include the same validation engine.
+
 | Feature | Cloud | OSS |
 |---------|-------|-----|
 | Lineage Diff | Yes | Yes |
-| Data diff<br> (row count, schema, profile, value, top-k, histogram diff) | Yes | Yes |
-| Query diff | Yes | Yes |
-| Checklist | Yes | Yes |
-| Agent on PRs | Yes | No |
-| Agent CLI assistance (MCP) | Yes | Yes |
-| Preset checks across PRs | Yes | Manual |
-| Shared validation standards | Yes | Manual |
-| Developer-reviewer collaboration | Yes | Manual |
-| PR comments & summaries | Yes | No |
+| Schema Diff | Yes | Yes |
+| Row Count Diff | Yes | Yes |
+| Profile Diff | Yes | Yes |
+| Value Diff | Yes | Yes |
+| Top-K Diff | Yes | Yes |
+| Histogram Diff | Yes | Yes |
+| Query Diff | Yes | Yes |
+| Checklist (local) | Yes | Yes |
+
+### Data Review Agent
+
+The Data Review Agent automatically validates PRs. Cloud only.
+
+| Feature | Cloud | OSS |
+|---------|-------|-----|
+| Auto-validates when PR opens | Yes | No |
+| Posts summary to PR | Yes | No |
+| Updates on new commits | Yes | No |
 | LLM-powered insights | Yes | No |

+### Collaboration
+
+Team features for sharing validation standards. Cloud only.
+
+| Feature | Cloud | OSS |
+|---------|-------|-----|
+| Preset checks across PRs | Yes | No |
+| Shared validation standards | Yes | No |
+| Developer-reviewer collaboration | Yes | No |
+| Persistent validation history | Yes | No |
+
+### Access Methods
+
+| Feature | Cloud | OSS |
+|---------|-------|-----|
+| CLI | `recce-cloud` | `recce` |
+| Web UI | Yes | Local only |
+| MCP (AI agents) | Yes | Yes |
+
 ## FAQ

 **Can I start with OSS and upgrade to Cloud later?**

docs/index.md

Lines changed: 75 additions & 45 deletions
@@ -1,80 +1,110 @@
 ---
-title: "Recce: Data Review Agent for dbt Pull Requests"
+title: "Recce: Data Validation for dbt Pull Requests"
 description: >-
-  Recce automates data validation for dbt pull requests. Compare schema changes,
-  row counts, and data diffs between environments to catch data quality issues
-  before they reach production.
+  Recce helps data teams catch data changes and downstream impacts before production.
+  Validate with column-level precision, automate with agents, and standardize across your team.
 ---

-# What is Recce (Data Review Agent)
+# What is Recce

-No more merging PRs where the pipeline succeeded but the data is quietly wrong.
+Recce helps data teams catch data changes and their downstream impacts before they reach production.

-Recce is a Data Review Agent that automates data validation for pull requests. When you open a PR, it compares your dev environment against production and surfaces schema changes, data diffs, row counts, and downstream impacts. You see what changed, what it affects, and what passed, all before you merge.
+**The problem:** Data pipelines succeed but data is quietly wrong. PRs merge without anyone checking what actually changed in the data. Junior and senior engineers apply different standards. Validation knowledge stays in people's heads instead of becoming team practice.

-Recce is the product. The agent automates validation on your PRs. You can run Recce through Cloud (hosted, automated) or open source (local, manual).
+**The solution:** Recce provides a validation engine plus an AI agent that reviews your PRs automatically. The engine compares environments and visualizes impact. The agent runs validation, surfaces what changed, and explains why it matters—before you even look at the PR.

-[**Get Started with Cloud**](2-getting-started/start-free-with-cloud.md){ .md-button .md-button--primary }
-[**Set Up Open Source**](2-getting-started/oss-setup.md){ .md-button }
+[**Get Started with Cloud**](getting-started/start-free-with-cloud.md){ .md-button .md-button--primary }
+[**Set Up Open Source**](getting-started/oss-setup.md){ .md-button }
+
+---

 ## How Recce Works

-When you open a PR with data changes, Recce automatically:
+1. **Validation Engine:** Compares base (production) vs. current (development) environments and visualizes differences
+2. **Data Review Agent:** Automatically validates PRs, runs data diffs, and posts a summary explaining changes and their impact
+
+**Access via:**
+
+| Method | Description |
+|--------|-------------|
+| **Cloud** | Full product: Validation engine + Data Review Agent + Collaboration. Includes `recce-cloud` CLI. |
+| **OSS** | Validation engine only. No Agent. No collaboration. Includes `recce` CLI. |
+| **MCP** | Use Recce OSS via AI agents (Claude Code, Cursor, Windsurf) with natural language. |
+
+![How Recce Works](assets/images/whats-recce/how-recce-work.png)
+
+---

-1. **Runs data diffing:** The best practice to validate data changes
-2. **Analyzes impact:** Identifies what changed down to the column level using Column-Level Lineage (CLL)
-3. **Reviews first:** The agent provides a data review summary explaining the change and its impact
-4. **Surfaces what matters:** Shows only impacted items, not every downstream table
-5. **Opens exploration:** Spins up a Recce instance where you can run additional diffs, explore lineage, and investigate deeper
+## What Makes Recce Different

-You review the agent's findings, add notes, and approve with confidence, not blind trust.
+### Column-level impact radius

-![How Recce Works](assets/images/1-whats-recce/how-recce-work.png)
+Validate only the columns affected by your change, not entire models.

-1. PR Created
-2. Recce Triggered
-3. Agent Analyzes Production vs. Development Data
-4. Agent Generates Review Summary
-5. Human Explore in Recce Instance
-6. Human Reviews Approves
-7. PR Merges
+When you modify a column, Recce traces its downstream dependencies using Column-Level Lineage (CLL). You see exactly which columns in which models are impacted. This means:

+- Targeted validation instead of full-table comparisons
+- Faster reviews with less noise
+- Clear understanding of change propagation

-Example of Recce agent summary in a GitHub PR comment:
-![How Recce Works](assets/images/1-whats-recce/agent-data-review-example.png)
+### Agent as reviewer zero

-## Automate Agent Data Review with CI/CD
+The agent validates first so you can focus on judgment.

-Recce delivers value through CI/CD integration. Without it, you waste time triaging false alerts from source data updates and manually comparing environments hoping you caught everything.
+Instead of manually checking what changed, the agent:

-With CI/CD:
+- Runs data diffs automatically when PRs open
+- Surfaces schema changes, row count differences, and data anomalies
+- Identifies unknown unknowns you might miss
+- Provides a summary explaining what changed and why it matters

-- Every PR gets automatic validation
-- Base and current environments are set up automatically
-- Agent reviews before you do
-- Checks accumulate as organizational knowledge (preset checks)
+You review the agent's findings and decide what needs attention.
+
+![Agent summary in PR](assets/images/whats-recce/agent-data-review-example.png)
+
+### Collaborate and standardize
+
+Turn individual checks into team standards.
+
+**Checks:** Save validation results to a checklist. Add descriptions explaining what reviewers should verify. Share with your team.
+
+**Preset checks:** Promote recurring checks to run automatically on every PR. New team members apply the same validation standards as senior engineers.
+
+---

 ## When to Use Recce

-- **Business-critical data:** Data that's customer-facing or revenue-impacting
-- **Team collaboration:** When reviewers need to understand impact, not just see code changes
-- **Standardized validation:** When you need consistent pull request review across senior and junior team members
+- **Business-critical data:** Customer-facing or revenue-impacting pipelines where errors cost money
+- **Team collaboration:** When reviewers need to understand data impact, not just code changes
+- **Consistent standards:** When junior and senior engineers should apply the same validation rigor
 - **Unknown unknowns:** When you can't predict what might break from a change

 ## When Not to Use

-- Teams that accept errors on production and fix later
-- Exploratory analysis that won't go to production
+- Teams that accept production errors and fix later
+- Exploratory analysis that won't reach production
+
+---

 ## FAQ

+**What data platforms does Recce support?**
+
+Recce works with Snowflake, BigQuery, Redshift, Databricks, and other dbt-supported warehouses. See [Connect to Warehouse](setup-guides/connect-to-warehouse.md).
+
 **Does Recce work without CI/CD?**
-Yes, you can run Recce locally for dev sessions. But CI/CD unlocks the full value: automatic validation on every PR without manual setup.

-**What data platforms does Recce support?**
-Recce works with data warehouses like Snowflake, BigQuery, Redshift, and Databricks. See [Connect to Warehouse](2-getting-started/connect-to-warehouse.md) for setup.
+Yes. Run Recce locally during development or in review sessions. CI/CD unlocks automated validation on every PR.
+
+**What's the difference between Cloud and OSS?**
+
+Cloud provides hosted infrastructure, automated PR integration, and the AI agent. OSS gives you the core validation engine to run yourself. See [Cloud vs OSS](whats-recce/cloud-vs-oss.md).
+
+---

 ## Next Steps
-- Interactive Demo: [Try the Data Review Agent](https://reccehq.com/demo/)
-- Tutorial: [Get Started with Recce Cloud](2-getting-started/start-free-with-cloud.md)
-- Blog: [The Problem with Data PR Reviews: Where Do You Even Start?](https://blog.reccehq.com/guided-data-review)
+
+- [Interactive Demo](https://reccehq.com/demo/) - Try the Data Review Agent
+- [Get Started with Cloud](getting-started/start-free-with-cloud.md) - Automated PR validation
+- [OSS Setup](getting-started/oss-setup.md) - Self-hosted validation
+- [Blog: The Problem with Data PR Reviews](https://blog.reccehq.com/guided-data-review) - Why data validation matters
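
Reviewer's illustration (not part of commit d83451b): the "Column-level impact radius" section added in docs/index.md describes tracing a changed column's downstream dependencies through column-level lineage. The sketch below shows that idea as a plain breadth-first traversal over a hand-written graph. It is not Recce's implementation or API; the `LINEAGE` mapping, the model and column names, and both helper functions are hypothetical and exist only for this example.

```python
from collections import deque

# Hypothetical column-level lineage: each key is "model.column" and each value
# lists the columns derived directly from it. All names here are made up.
LINEAGE = {
    "stg_orders.amount": ["orders.total_amount"],
    "stg_orders.status": ["orders.is_completed"],
    "orders.total_amount": ["customers.lifetime_value", "finance.revenue"],
    "orders.is_completed": [],
    "customers.lifetime_value": [],
    "finance.revenue": [],
}


def impact_radius(lineage, changed_column):
    """Return every column reachable downstream of changed_column, sorted."""
    seen = set()
    queue = deque([changed_column])
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, []):
            if child not in seen:  # visit each column once, even with cycles
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def impacted_models(columns):
    """Collapse "model.column" names to the distinct models they belong to."""
    return sorted({name.split(".", 1)[0] for name in columns})


if __name__ == "__main__":
    radius = impact_radius(LINEAGE, "stg_orders.amount")
    print(radius)
    print(impacted_models(radius))
```

In this toy graph, changing `stg_orders.amount` leaves `orders.is_completed` outside the radius, which is the point of column-level rather than model-level validation: the `orders` model is touched, but only one of its columns needs checking.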
