feat(codegen): parse DDL with sqlparser, drop hand-rolled scanner (#38) #111

Merged: hyperpolymath merged 2 commits into main from feat/vsm38-sqlparser on May 18, 2026

@hyperpolymath
Owner

Resolves #38 (V-L2-A1) — replace the hand-rolled SQL DDL scanner with sqlparser.

What changed

  • sqlparser = "0.50" dependency added.
  • src/codegen/parser.rs::parse_sql_schema now parses with sqlparser, trying PostgreSQL → SQLite → generic dialects (covers SERIAL, AUTOINCREMENT, generated columns), and walks Statement::CreateTable into the IR.
  • Public IR (ParsedSchema / TableDef / ColumnDef) and the parse_sql_schema / parse_schema_file signatures are unchanged — overlay and query codegen consume the same shape, no ripple.
  • Deleted split_respecting_parens, parse_create_table, parse_column_def.
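
The ordered fallback described above (try one dialect, fall through to the next, fail only if none accepts the input) can be sketched without the crate. This is a minimal sketch, not the code in src/codegen/parser.rs: `DialectParser`, `strict`, `lenient`, `default_chain`, and `parse_with_fallback` are hypothetical names, and the two toy parsers merely stand in for the real sqlparser dialects.

```rust
/// Stand-in for one dialect-specific parser (hypothetical; the real code
/// calls sqlparser). Returns the table names found, or an error message.
type DialectParser = fn(&str) -> Result<Vec<String>, String>;

/// Toy "strict" dialect: rejects SQLite's AUTOINCREMENT keyword.
fn strict(sql: &str) -> Result<Vec<String>, String> {
    if sql.contains("AUTOINCREMENT") {
        Err("strict: unexpected AUTOINCREMENT".to_string())
    } else {
        lenient(sql)
    }
}

/// Toy "lenient" dialect: accepts anything shaped like CREATE TABLE <name>.
fn lenient(sql: &str) -> Result<Vec<String>, String> {
    let mut words = sql.split_whitespace();
    match (words.next(), words.next(), words.next()) {
        (Some(c), Some(t), Some(name))
            if c.eq_ignore_ascii_case("create") && t.eq_ignore_ascii_case("table") =>
        {
            Ok(vec![name.to_string()])
        }
        _ => Err("lenient: expected CREATE TABLE <name>".to_string()),
    }
}

/// The order in which the stand-in dialects are tried.
fn default_chain() -> [DialectParser; 2] {
    [strict, lenient]
}

/// Try each parser in order and return the first success. If every parser
/// fails, return all collected errors instead of an empty result.
fn parse_with_fallback(
    sql: &str,
    chain: &[DialectParser],
) -> Result<Vec<String>, Vec<String>> {
    let mut errors = Vec::new();
    for parse in chain {
        match parse(sql) {
            Ok(tables) => return Ok(tables),
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}
```

Calling `parse_with_fallback(sql, &default_chain())` on a statement that only the second stand-in accepts still succeeds, while input that no stand-in accepts yields an `Err` carrying one message per attempt, which mirrors the "hard error, not a silent empty schema" behaviour described in this PR.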

Defects fixed (all were silent corruption before)

  • Schema-qualified names (analytics.events) → bare table identifier.
  • Quoted identifiers containing whitespace ("audit log", "user name").
  • Column- and table-level CHECK (...) — ignored cleanly instead of comma-split into bogus columns.
  • GENERATED ALWAYS AS (...) STORED columns.
  • Semicolons inside -- comments no longer break statement boundaries.
  • Invalid SQL is now a hard error, not a silent empty schema.
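
To make the semicolon-in-comment defect concrete, here is a small illustration of that class of bug. It is an assumption-laden sketch: `split_naive` is a hypothetical reconstruction of what a text-level scanner might do, not the deleted code, and `split_comment_aware` is a simplified stand-in for what a real tokenizer does (it ignores string literals, quoted identifiers, and `/* */` comments).

```rust
/// Naive splitting: treats every ';' as a statement boundary,
/// including ones that appear inside "--" line comments.
fn split_naive(sql: &str) -> Vec<String> {
    sql.split(';')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Comment-aware splitting: drops everything from "--" to end of line
/// before looking for ';'. Simplified on purpose (see lead-in).
fn split_comment_aware(sql: &str) -> Vec<String> {
    let mut statements = Vec::new();
    let mut current = String::new();
    for line in sql.lines() {
        // Keep only the part of the line that precedes a "--" comment.
        let code = match line.find("--") {
            Some(idx) => &line[..idx],
            None => line,
        };
        for ch in code.chars() {
            if ch == ';' {
                let stmt = current.trim().to_string();
                if !stmt.is_empty() {
                    statements.push(stmt);
                }
                current.clear();
            } else {
                current.push(ch);
            }
        }
        // Preserve the line break so multi-line statements stay readable.
        current.push('\n');
    }
    let tail = current.trim().to_string();
    if !tail.is_empty() {
        statements.push(tail);
    }
    statements
}
```

Given two `CREATE TABLE` statements separated by a comment that itself contains a semicolon, the naive version reports three fragments while the comment-aware version reports the two real statements.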

Acceptance

  • sqlparser dependency added
  • Existing tests pass (all 6 original parser tests retained, unchanged)
  • New tests: schema-qualified, quoted-with-spaces, CHECK, GENERATED (+ semicolon-in-comment, invalid-SQL)

Suite: 113 lib + 9 integration tests green. The single red check is the pre-existing, failing-by-design provenance_fork_test (#104, fixed by open PR #109); it is already present on main and unrelated to this branch.

🤖 Generated with Claude Code

Replace src/codegen/parser.rs's uppercase-and-split scanner with the
sqlparser crate (0.50). parse_sql_schema now walks
Statement::CreateTable, trying PostgreSQL → SQLite → generic dialects.
Public IR (ParsedSchema/TableDef/ColumnDef) and the parse_sql_schema /
parse_schema_file signatures are unchanged, so overlay/query consumers
are unaffected. split_respecting_parens, parse_create_table and
parse_column_def are deleted.

Fixes the documented misclassifications: schema-qualified names
(bare table identifier extracted), quoted identifiers with whitespace,
column- and table-level CHECK clauses (ignored, not split into bogus
columns), GENERATED columns, and semicolons inside comments. Invalid
SQL is now a hard error instead of a silent empty schema.

6 original parser tests retained + 6 new acceptance tests
(schema-qualified, quoted-whitespace, CHECK, GENERATED,
semicolon-in-comment, invalid-SQL). Suite: 113 lib + 9 integration
green.

Closes #38.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@github-actions

🔍 Hypatia Security Scan

Findings: 20 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 1 |
| 🟠 High | 8 |
| 🟡 Medium | 11 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Required file missing",
    "type": "missing",
    "file": "SECURITY.md",
    "action": "create",
    "rule_module": "root_hygiene",
    "severity": "high"
  },
  {
    "reason": "Issue in quality.yml",
    "type": "missing_workflow",
    "file": "quality.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Issue in security-policy.yml",
    "type": "missing_workflow",
    "file": "security-policy.yml",
    "action": "create",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Action actions/checkout@v4 needs attention",
    "type": "unpinned_action",
    "file": "rust-ci.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action Swatinem/rust-cache@v2 needs attention",
    "type": "unpinned_action",
    "file": "rust-ci.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action actions/checkout@v4 needs attention",
    "type": "unpinned_action",
    "file": "rust-ci.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Action dtolnay/rust-toolchain@master needs attention",
    "type": "unpinned_action",
    "file": "rust-ci.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Action Swatinem/rust-cache@v2 needs attention",
    "type": "unpinned_action",
    "file": "rust-ci.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Required file missing (condition: public_repo)",
    "type": "missing_requirement",
    "file": "SECURITY.md",
    "action": "create",
    "rule_module": "cicd_rules",
    "severity": "high"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

@hyperpolymath hyperpolymath merged commit 038ae49 into main May 18, 2026
11 of 12 checks passed

Development

Successfully merging this pull request may close these issues.

V-L2-A1: replace hand-rolled SQL DDL parser with sqlparser crate
