Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 3 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,7 +175,7 @@ sqlrite> DELETE FROM users WHERE age < 30;
| `CREATE TABLE` | `PRIMARY KEY`, `UNIQUE`, `NOT NULL`; `IF NOT EXISTS` (idempotent re-create); duplicate-column detection; types `INTEGER`/`INT`/`BIGINT`/`SMALLINT`, `TEXT`/`VARCHAR`, `REAL`/`FLOAT`/`DOUBLE`/`DECIMAL`, `BOOLEAN`. Auto-creates `sqlrite_autoindex_<table>_<col>` for every PK + UNIQUE column |
| `CREATE [UNIQUE] INDEX` | Single-column, named indexes; `IF NOT EXISTS`; persists as a dedicated cell-based B-Tree. INTEGER + TEXT columns only |
| `INSERT INTO` | Explicit column list required; auto-ROWID for `INTEGER PRIMARY KEY`; multi-row `VALUES (…), (…)`; UNIQUE enforcement; clean type errors (no panics); NULL padding for omitted columns |
| `SELECT` | `*` or column list with optional `AS alias`; `WHERE`; `DISTINCT`; `GROUP BY col[, col …]`; aggregate projections `COUNT(*)` / `COUNT([DISTINCT] col)` / `SUM` / `AVG` / `MIN` / `MAX`; `[INNER\|LEFT OUTER\|RIGHT OUTER\|FULL OUTER] JOIN ... ON ...` with table aliases and qualified `t.col` references; single-column `ORDER BY [ASC\|DESC]` (also resolves alias and aggregate display names); `LIMIT n`. `WHERE col = literal` probes an index when one exists. Catalog introspection via `SELECT … FROM sqlrite_master` |
| `SELECT` | `*` or column list with optional `AS alias`; `WHERE`; `DISTINCT`; `GROUP BY col[, col …]`; aggregate projections `COUNT(*)` / `COUNT([DISTINCT] col)` / `SUM` / `AVG` / `MIN` / `MAX`; `[INNER\|LEFT OUTER\|RIGHT OUTER\|FULL OUTER\|CROSS] JOIN` with `ON ...` / `USING (...)` / `NATURAL` constraints, table aliases and qualified `t.col` references; single-column `ORDER BY [ASC\|DESC]` (also resolves alias and aggregate display names); `LIMIT n`. `WHERE col = literal` probes an index when one exists. Catalog introspection via `SELECT … FROM sqlrite_master` |
| `UPDATE` | Multi-column `SET`; `WHERE`; UNIQUE + type enforcement; arithmetic in assignments (`SET age = age + 1`) |
| `DELETE` | `WHERE` predicate or full-table delete |
| `BEGIN` / `COMMIT` / `ROLLBACK` | Real transactions, snapshot-based; WAL-backed commit; single-level (no savepoints); auto-rollback if `COMMIT`'s disk write fails |
Expand All @@ -193,7 +193,7 @@ Expressions in `WHERE` and `UPDATE`'s `SET` RHS:
- String concat — `||`
- Literals — integer + real numbers, `'single-quoted strings'`, `TRUE` / `FALSE`, `NULL`; parentheses for grouping

**Not yet supported** (common ones): subqueries, CTEs, `HAVING`, `LIKE … ESCAPE '<char>'`, `IN (subquery)`, `DISTINCT` on `SUM`/`AVG`/`MIN`/`MAX`, GROUP BY on expressions, expressions in the projection list, `OFFSET`, multi-column `ORDER BY`, savepoints, `JOIN ... USING`, `NATURAL JOIN`, `CROSS JOIN`, comma joins, aggregates / DISTINCT / GROUP BY *over* JOIN results. The [full list with context](docs/supported-sql.md#not-yet-supported) lives in the reference.
**Not yet supported** (common ones): subqueries, CTEs, `HAVING`, `LIKE … ESCAPE '<char>'`, `IN (subquery)`, `DISTINCT` on `SUM`/`AVG`/`MIN`/`MAX`, GROUP BY on expressions, expressions in the projection list, `OFFSET`, multi-column `ORDER BY`, savepoints, comma joins (`FROM a, b`), aggregates / DISTINCT / GROUP BY *over* JOIN results. The [full list with context](docs/supported-sql.md#not-yet-supported) lives in the reference.

#### Meta commands

Expand Down Expand Up @@ -250,7 +250,7 @@ The project is staged in phases, each independently shippable. A finished phase
- [x] `CREATE TABLE` with `PRIMARY KEY`, `UNIQUE`, `NOT NULL`; duplicate-column detection; in-memory `BTreeMap` indexes on PK/UNIQUE columns
- [x] `INSERT` with auto-ROWID for `INTEGER PRIMARY KEY`, UNIQUE enforcement, NULL padding for missing columns
- [x] `SELECT` — projection, `WHERE`, `ORDER BY`, `LIMIT`
- [x] `JOIN` — `INNER`, `LEFT OUTER`, `RIGHT OUTER`, `FULL OUTER` with `ON` (SQLR-5)
- [x] `JOIN` — `INNER`, `LEFT OUTER`, `RIGHT OUTER`, `FULL OUTER`, `CROSS` with `ON` / `USING (...)` / `NATURAL` (SQLR-5)
- [x] `UPDATE ... SET ... WHERE ...` with type + UNIQUE enforcement at write time
- [x] `DELETE ... WHERE ...`
- [x] Expression evaluator: `=`/`<>`/`<`/`<=`/`>`/`>=`, `AND`/`OR`/`NOT`, arithmetic `+`/`-`/`*`/`/`/`%`, string concat `||`, NULL-as-false in `WHERE`
Expand Down Expand Up @@ -332,7 +332,6 @@ Lockstep versioning — one dispatch bumps every product to the same `vX.Y.Z`. T
- [ ] *(deferred to Phase 8)* Full-text search with BM25 + hybrid retrieval

**Possible extras** *(no committed phase)*
- Joins (`INNER`, `LEFT OUTER`, `CROSS` — SQLite does not support `RIGHT`/`FULL OUTER`)
- `HAVING`, `IN (subquery)`, `BETWEEN`, `GLOB` / `REGEXP`, `GROUP_CONCAT`, window functions
- Composite and expression indexes (with cost analysis)
- Alternate storage engines — LSM/SSTable for write-heavy workloads alongside the B-Tree
Expand Down
2 changes: 1 addition & 1 deletion docs/roadmap.md
Original file line number Diff line number Diff line change
Expand Up @@ -561,7 +561,7 @@ The biggest single SQL-surface jump in the project's history.
- Self-joins require an alias on at least one side.
- `WHERE` runs after joins (the standard `LEFT JOIN ... WHERE right.col IS NULL` anti-join idiom works).

Not yet supported: `CROSS JOIN`, comma-separated FROMs, `NATURAL JOIN`, `JOIN ... USING (col)`, aggregates / `GROUP BY` / `DISTINCT` *over* a join, `fts_match` / `bm25_score` inside a join expression. Algorithm: plain nested-loop, O(N×M) per level — hash / merge joins are a future optimization.
`ON`, `USING (...)`, `NATURAL`, and `CROSS JOIN` are all supported. Not yet supported: comma-separated FROMs (`FROM a, b`), aggregates / `GROUP BY` / `DISTINCT` *over* a join, `fts_match` / `bm25_score` inside a join expression. Algorithm: plain nested-loop, O(N×M) per level — hash / merge joins are a future optimization.

### ✅ Phase 9g — Prepared statements + parameter binding *(v0.9.0, SQLR-23)*

Expand Down
2 changes: 1 addition & 1 deletion docs/sql-engine.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ The `sqlparser` AST is designed to cover every SQL dialect, so its types are hug

`SelectQuery::joins` (SQLR-5) is a `Vec<JoinClause>` evaluated left-to-right by `execute_select_rows_joined`. Each clause carries a `JoinType` (`Inner` / `LeftOuter` / `RightOuter` / `FullOuter`), the right-table name + optional alias, and a required `ON` expression. Empty = single-table SELECT, the existing fast path with HNSW / FTS / bounded-heap optimizations.

Each parser module still rejects features we don't implement with `SQLRiteError::NotImplemented` — `JOIN ... USING`, `NATURAL JOIN`, `CROSS JOIN`, comma joins, aggregates / GROUP BY / DISTINCT over JOINs, `HAVING`, `DISTINCT ON (...)`, `GROUP BY` on expressions, `LIKE … ESCAPE '<char>'`, `IN (subquery)`, `OFFSET`, multi-table DELETE, tuple assignment targets, etc. These errors carry the feature name in the message so the user knows what isn't there.
Each parser module still rejects features we don't implement with `SQLRiteError::NotImplemented` — comma joins (`FROM a, b`), aggregates / GROUP BY / DISTINCT over JOINs, `HAVING`, `DISTINCT ON (...)`, `GROUP BY` on expressions, `LIKE … ESCAPE '<char>'`, `IN (subquery)`, `OFFSET`, multi-table DELETE, tuple assignment targets, etc. These errors carry the feature name in the message so the user knows what isn't there. (`JOIN ... USING`, `NATURAL JOIN`, and `CROSS JOIN` are now supported — see [`supported-sql.md`](supported-sql.md#join-semantics-sqlr-5).)

## Statement dispatch

Expand Down
21 changes: 14 additions & 7 deletions docs/supported-sql.md
Original file line number Diff line number Diff line change
Expand Up @@ -210,19 +210,27 @@ COUNT([DISTINCT] <column>) -- counts non-NULL values, option

### `JOIN` semantics (SQLR-5)

Four flavors are supported, all with explicit `ON` conditions:
Four flavors are supported, with `ON`, `USING (...)`, or `NATURAL` match
conditions, plus `CROSS JOIN`:

| Flavor | Keeps unmatched rows from… |
|---|---|
| `INNER JOIN` | …neither side. Only ON-matched pairs survive. |
| `INNER JOIN` | …neither side. Only matched pairs survive. |
| `LEFT [OUTER] JOIN` | …the left side; right-side columns become `NULL` for unmatched left rows. |
| `RIGHT [OUTER] JOIN` | …the right side; left-side columns become `NULL` for unmatched right rows. |
| `FULL [OUTER] JOIN` | …both sides, NULL-padded on the unmatched side. |
| `CROSS JOIN` | …both sides (cross product — every left row paired with every right row). |

- **Engine choice:** SQLite ships only `INNER` and `LEFT OUTER`. SQLRite implements all four because the per-flavor differences boil down to NULL-padding policy on top of one shared nested-loop driver — adding `RIGHT` / `FULL` was effectively free once the executor had a multi-table scope. See [`docs/design-decisions.md`](design-decisions.md) for the rationale.
- **Match conditions:**
- **`ON <expr>`** — any boolean expression over the in-scope tables.
- **`USING (col[, col…])`** — shorthand for `left.col = right.col` AND-chained over each named column. The column must exist on the right side and on some left-side table; in a chain (`A JOIN B USING(x) JOIN C USING(x)`) each `x` resolves against the first left table that has it.
- **`NATURAL`** — equivalent to `USING (<every column name the two sides share>)`, discovered automatically from the schemas. If the sides share no column names, a `NATURAL JOIN` degrades to a cross product (matching SQLite). Combines with a flavor: `NATURAL LEFT JOIN`.
- **`CROSS JOIN`** — the cross product; the engine treats it as `INNER JOIN ... ON true`.
- **`SELECT *` with `USING` / `NATURAL`:** each joined-on column appears **once** (SQLite convention), taking the left side's value; the right side's duplicate is omitted. Plain `ON` joins keep both copies.
- **Aliases:** `FROM customers AS c INNER JOIN orders AS o ON c.id = o.customer_id`. When an alias is supplied the original table name leaves scope (SQL standard) — qualifier resolution uses the alias.
- **Qualified column references:** `<table>.<col>` and `<alias>.<col>` resolve to that specific side. Bare `<col>` references must resolve to exactly one in-scope table; ambiguous references error with a "qualify it as `<table>.col`" hint.
- **Output of `SELECT *`** over a join is every column of every in-scope table, in source order. Duplicate header names are permitted (SQLite-style). Disambiguate with explicit `SELECT t.col AS t_col, u.col AS u_col`.
- **Output of `SELECT *`** over a join is every column of every in-scope table, in source order (minus `USING` / `NATURAL` duplicates, see above). Duplicate header names are otherwise permitted (SQLite-style). Disambiguate with explicit `SELECT t.col AS t_col, u.col AS u_col`.
- **Multi-join** chains left-fold: `A JOIN B ON ... JOIN C ON ...` evaluates as `(A ⨝ B) ⨝ C`. Each new clause sees every prior alias / table in its `ON` expression.
- **Self-joins** require an alias on at least one side: `FROM nodes AS p INNER JOIN nodes AS c ON p.id = c.parent_id`. Without one, you get a `duplicate table reference` error so qualifiers stay unambiguous.
- **`WHERE` runs after joins.** A `WHERE right.col IS NULL` filter on a `LEFT JOIN` correctly returns left rows with no match (the standard "anti-join via outer-join" idiom).
Expand All @@ -231,8 +239,7 @@ Four flavors are supported, all with explicit `ON` conditions:

#### What's not supported in JOINs

- `JOIN ... USING (col)` and `NATURAL JOIN` — explicit `ON` only. (Both are deferred — `USING` is straightforward but adds a column-resolution rule we haven't needed yet.)
- `CROSS JOIN` (write `INNER JOIN ... ON true` instead) and comma-separated FROM lists.
- Comma-separated FROM lists (`FROM a, b`) — use an explicit `JOIN` / `CROSS JOIN` instead.
- Aggregates / `GROUP BY` / `DISTINCT` *over* a join. The single-table aggregator is wired against one rowid stream; rewiring it for joined rows is a separate increment. Surfaces as a clean `NotImplemented` at parse time.
- `fts_match` / `bm25_score` inside a JOIN expression. They need to look up an FTS index by column, which is single-table-bound today. Use them on a single-table SELECT first, or fold the FTS lookup into the FROM side.

Expand All @@ -255,7 +262,7 @@ The executor includes a tiny optimizer: if the `WHERE` is exactly `<indexed_col>

### What doesn't work

- **`CROSS JOIN`**, **comma-separated FROM lists**, **`NATURAL JOIN`**, **`JOIN ... USING (col)`** — explicit `INNER` / `LEFT` / `RIGHT` / `FULL OUTER JOIN ... ON ...` only (see [JOIN semantics](#join-semantics-sqlr-5))
- **Comma-separated FROM lists** (`FROM a, b`) — use an explicit `JOIN` / `CROSS JOIN`. `INNER` / `LEFT` / `RIGHT` / `FULL OUTER` / `CROSS` with `ON` / `USING` / `NATURAL` are all supported (see [JOIN semantics](#join-semantics-sqlr-5))
- **Aggregates** / **`GROUP BY`** / **`DISTINCT`** over a JOIN — pipe through a subquery once subqueries land
- **Subqueries**, CTEs (`WITH`), views
- **`HAVING`** — pre-aggregation `WHERE` works; post-aggregation filtering does not yet
Expand Down Expand Up @@ -700,7 +707,7 @@ A REPL launched with `sqlrite --readonly foo.sqlrite` (or `sqlrite::open_databas
For context when you hit `NotImplemented`. See [Roadmap](roadmap.md) for when these land:

### Joins & composition
- `CROSS JOIN`, comma joins, `NATURAL JOIN`, `JOIN ... USING` — explicit `INNER` / `LEFT` / `RIGHT` / `FULL OUTER JOIN ... ON ...` works (SQLR-5); the others don't
- `INNER` / `LEFT` / `RIGHT` / `FULL OUTER` / `CROSS JOIN` with `ON` / `USING (...)` / `NATURAL` all work (SQLR-5). Comma-separated FROM joins (`FROM a, b`) don't — use an explicit `JOIN` / `CROSS JOIN`
- Aggregates / `GROUP BY` / `DISTINCT` *over* a JOIN — pipe through a subquery once subqueries land
- `fts_match` / `bm25_score` inside a JOIN expression — single-table-bound today
- Subqueries (scalar, `IN (SELECT ...)`, correlated)
Expand Down
Loading
Loading