Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -46,3 +46,9 @@ Thumbs.db
.idea/
.vscode/
*.swp

# Sample database files (downloaded at runtime by setup scripts)
docker/northwind/northwind.sql

# User-created memory entries (runtime data, not committed)
memory/*.md
33 changes: 33 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# Open Data Agent — Project Rules

This project is a **data query tool**, not a typical software project.
When a user asks a question, it is almost always a question **about the data in the database**,
not about the source code.

## CRITICAL — Handling user questions

When the user asks anything that sounds like a data question (e.g. "Which country has the most
customers?", "Show me top products by revenue", "How many orders last month?"):

1. **Do NOT scan the codebase.** This is not a coding task.
2. **Do NOT read `src/`, `tests/`, or any source files.**
3. Read `docs/data-catalog/_index.md` to identify the relevant tables.
4. Read the relevant per-table doc under `docs/data-catalog/` for column names and types.
5. Run `uv run oda query "<SQL>"` to execute the query and return results to the user.

The active database connection and full command reference are in `.opencode/rules/data-agent.md`.
Read that file if you need to know which database is active or how to use `oda` commands.
That file is generated by running `uv run oda connect <name>` from the project root.

## What this project does

`oda` (Open Data Agent) is a local CLI that connects to databases and runs read-only SQL queries.
All queries go through `uv run oda query "..."` — never connect to the database directly.

## When to touch the source code

Only read or edit files in `src/`, `tests/`, or config files when the user explicitly asks to:
- fix a bug
- add a feature
- run tests
- change configuration
9 changes: 9 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,15 @@ strict_mode: false # if true: block queries when docs are stale
# equivalent to passing --strict on every oda query call
```

### Testing with Sample Data

A ready-to-run guide using the Northwind database is available at [`docs/testing-with-northwind.md`](docs/testing-with-northwind.md). It covers spinning up a local PostgreSQL container, registering the connection, generating schema docs, and running sample queries that exercise every major `oda` feature.

```bash
# One command to get started:
bash scripts/setup-northwind.sh
```

### Development

```bash
Expand Down
20 changes: 20 additions & 0 deletions docker-compose-northwind.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
services:
northwind:
image: postgres:16-alpine
environment:
POSTGRES_DB: northwind
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
volumes:
- ./docker/northwind/northwind.sql:/docker-entrypoint-initdb.d/northwind.sql:ro
- northwind_data:/var/lib/postgresql/data
ports:
- "5433:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres -d northwind"]
interval: 5s
timeout: 5s
retries: 10

volumes:
northwind_data:
Binary file added docs/opencode-northwind-example.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
314 changes: 314 additions & 0 deletions docs/testing-with-northwind.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,314 @@
# Testing oda with the Northwind Sample Database

This guide walks you through spinning up the Northwind PostgreSQL database with Docker and using it to test every major `oda` feature end-to-end.

## About Northwind

Northwind is a classic sample database originally created by Microsoft. It models a fictional food trading company with 14 tables covering customers, orders, products, employees, suppliers, shippers, and more — rich enough to write interesting queries across multiple joined tables.

**Tables:** `categories`, `customers`, `employees`, `employee_territories`, `order_details`, `orders`, `products`, `region`, `shippers`, `suppliers`, `territories`, `us_states`, `customer_customer_demo`, `customer_demographics`

---

## Step 1 — Start the Northwind container

Run the setup script from the project root. It downloads `northwind.sql` (if not already present) and starts a PostgreSQL container with the database pre-loaded:

```bash
bash scripts/setup-northwind.sh
```

The script will print connection details and confirm the container is healthy before exiting. The database is exposed on **port 5433** (to avoid clashing with the test container on port 5432).

To stop the container later:

```bash
docker compose -f docker-compose-northwind.yml down # stop, keep data
docker compose -f docker-compose-northwind.yml down -v # stop, wipe data
```

---

## Step 2 — Initialise oda

If you haven't run `oda init` yet:

```bash
uv run oda init
```

---

## Step 3 — Add the Northwind connection

```bash
uv run oda connections add
```

Enter the following when prompted:

| Field | Value |
|----------|---------------|
| Name | `northwind` |
| Dialect | `postgresql` |
| Host | `localhost` |
| Port | `5433` |
| Database | `northwind` |
| User | `postgres` |
| Password | `postgres` |

Verify the connection is live:

```bash
uv run oda connections test northwind
```

---

## Step 4 — Activate the connection

```bash
uv run oda connect northwind
```

This sets `northwind` as the active connection and writes `.opencode/rules/data-agent.md` for OpenCode.

---

## Step 5 — Explore the schema

List all schemas:

```bash
uv run oda schemas
```

List all tables:

```bash
uv run oda tables
```

Inspect a specific table:

```bash
uv run oda describe orders
uv run oda describe order_details
uv run oda describe customers
```

Preview sample rows:

```bash
uv run oda sample orders
uv run oda sample products --n 10
```

Profile a table (null counts, distinct values, min/max):

```bash
uv run oda profile orders
uv run oda profile products
```

---

## Step 6 — Generate schema docs

```bash
uv run oda docs generate
```

This writes a markdown file per table under `docs/data-catalog/`. Add `--enrich` to include column statistics:

```bash
uv run oda docs generate --enrich
```

Check doc freshness at any time:

```bash
uv run oda docs status
```

---

## Step 7 — Run sample queries

### Basic selects

```bash
uv run oda query "SELECT company_name, country FROM customers ORDER BY country LIMIT 10"

uv run oda query "SELECT product_name, unit_price, units_in_stock FROM products ORDER BY unit_price DESC LIMIT 10"

uv run oda query "SELECT first_name, last_name, title FROM employees ORDER BY last_name"
```

### Aggregations

```bash
uv run oda query "SELECT country, COUNT(*) AS customer_count FROM customers GROUP BY country ORDER BY customer_count DESC"

uv run oda query "SELECT category_id, COUNT(*) AS product_count, AVG(unit_price) AS avg_price FROM products GROUP BY category_id ORDER BY category_id"
```

### Joins

```bash
uv run oda query "
SELECT o.order_id, c.company_name, o.order_date, o.freight
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
ORDER BY o.order_date DESC
LIMIT 10
"
```

```bash
uv run oda query "
SELECT p.product_name, c.category_name, p.unit_price
FROM products p
JOIN categories c ON p.category_id = c.category_id
ORDER BY c.category_name, p.product_name
LIMIT 20
"
```

### Revenue analysis

```bash
uv run oda query "
SELECT
c.company_name,
COUNT(DISTINCT o.order_id) AS order_count,
ROUND(SUM(od.unit_price * od.quantity * (1 - od.discount))::numeric, 2) AS total_revenue
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_details od ON o.order_id = od.order_id
GROUP BY c.company_name
ORDER BY total_revenue DESC
LIMIT 10
"
```

### Employees and territories

```bash
uv run oda query "
SELECT e.first_name, e.last_name, t.territory_description, r.region_description
FROM employees e
JOIN employee_territories et ON e.employee_id = et.employee_id
JOIN territories t ON et.territory_id = t.territory_id
JOIN region r ON t.region_id = r.region_id
ORDER BY e.last_name
"
```

### Output formats

```bash
# JSON output
uv run oda query "SELECT * FROM shippers" --format json

# CSV output
uv run oda query "SELECT product_name, unit_price FROM products ORDER BY unit_price DESC LIMIT 20" --format csv
```

---

## Step 8 — Test safety enforcement

`oda` hard-blocks all write operations. These should all fail with a `SafetyError`:

```bash
# Write operations are blocked
uv run oda query "INSERT INTO customers (customer_id, company_name) VALUES ('TEST', 'Test Co')"
uv run oda query "UPDATE products SET unit_price = 0 WHERE product_id = 1"
uv run oda query "DELETE FROM orders WHERE order_id = 10248"
uv run oda query "DROP TABLE customers"
```

---

## Step 9 — Test query history

After running a few queries, inspect the history:

```bash
uv run oda history list
uv run oda history list --n 5
uv run oda history search "customers"
uv run oda history stats
```

---

## Step 10 — Add a memory entry

Record a data quirk about the Northwind schema for future sessions:

```bash
uv run oda memory add \
--title "Order revenue calculation" \
--category data_quality \
--content "Use unit_price * quantity * (1 - discount) from order_details for accurate revenue. The discount column is a float between 0 and 1."
```

Verify it was saved:

```bash
uv run oda memory list
uv run oda memory search "revenue"
```

---

## Step 11 — Use with OpenCode (natural language queries)

With the connection active and docs generated, open OpenCode from the project root:

```bash
opencode
```

Then ask questions in plain English:

```
Which country has the most customers?
```

```
Show me the top 5 products by total revenue across all orders.
```

```
Which employees have the most orders assigned to them?
```

```
List all customers from Germany with their total order count.
```

OpenCode reads `docs/data-catalog/` for schema context, constructs the SQL, and calls `oda query` to execute it. If a query returns zero rows, the diagnostic output is fed back to OpenCode automatically so it can self-correct and retry.

### Example

The screenshot below shows OpenCode answering "Show me the top 5 products by total revenue across all orders." — it reads the schema catalog, constructs the correct three-table join with discount calculation, executes it via `uv run oda query`, and presents the ranked results:

![OpenCode answering a natural language revenue question against the Northwind database](opencode-northwind-example.png)

---

## Teardown

```bash
# Stop the container, keep the data volume
docker compose -f docker-compose-northwind.yml down

# Stop and wipe all data (forces fresh reload of northwind.sql on next start)
docker compose -f docker-compose-northwind.yml down -v

# Remove the downloaded SQL file
rm docker/northwind/northwind.sql
```
Loading