
Add CSV output format for SQL query results#4728

Open
simonfaltum wants to merge 4 commits into main from simonfaltum/sql-csv-output


simonfaltum (Member) commented Mar 12, 2026

Why

The SQL query command supports JSON and table output but not CSV. CSV is the most common format for data export and piping into tools like Excel, pandas, and database imports.

Changes

Before: databricks sql query only supports JSON and table output formats.
Now: A --format csv flag writes results as RFC 4180 CSV with column headers as the first row.

Uses Go's encoding/csv package for proper escaping and quoting. The flag bypasses the normal output mode selection (text/json), so it works regardless of terminal interactivity.
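
As a rough illustration of the approach described above (not the PR's actual code), header-first CSV rendering with `encoding/csv` might look like the sketch below. The names `writeCSV` and `renderCSV` are hypothetical, and padding short rows with empty fields is an assumption about how that case could be handled.

```go
package main

import (
	"encoding/csv"
	"io"
	"strings"
)

// writeCSV is a hypothetical helper: it writes the column names as the
// first record, then one record per result row. Quoting and escaping are
// left entirely to encoding/csv.
func writeCSV(w io.Writer, columns []string, rows [][]string) error {
	cw := csv.NewWriter(w)
	// cw.UseCRLF = true would emit \r\n line endings; the default is \n.
	if err := cw.Write(columns); err != nil {
		return err
	}
	for _, row := range rows {
		record := row
		// Assumption: rows shorter than the header are padded with
		// empty fields so every record has the same width.
		if len(row) < len(columns) {
			record = make([]string, len(columns))
			copy(record, row)
		}
		if err := cw.Write(record); err != nil {
			return err
		}
	}
	// Flush buffered data and surface any write error.
	cw.Flush()
	return cw.Error()
}

// renderCSV returns the CSV text as a string, which is convenient in tests.
func renderCSV(columns []string, rows [][]string) (string, error) {
	var sb strings.Builder
	if err := writeCSV(&sb, columns, rows); err != nil {
		return "", err
	}
	return sb.String(), nil
}
```

With columns `id,name`, a value such as `a,b` is emitted as `"a,b"`, an embedded double quote is doubled, and an empty result set produces only the header line.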

Test plan

  • Unit tests for CSV rendering (basic, special characters, empty results, short rows)
  • Full aitools test suite passes
  • make checks passes
  • Binary builds successfully

Add a --format csv flag to the query command for exporting results as CSV.
Uses Go's encoding/csv for proper escaping and quoting. Column headers
are included as the first row.

Co-authored-by: Isaac
simonfaltum marked this pull request as ready for review March 13, 2026 10:33
simonfaltum requested review from a team and lennartkats-db as code owners March 13, 2026 10:33
shreyas-goenka (Contributor) left a comment

Note: This review was posted by Claude (AI assistant). Shreyas will do a separate, more thorough review pass.

Priority: MEDIUM — Dual-flag design concern

MEDIUM: --format vs --output dual-flag confusion

The PR introduces a --format flag for SQL statement execution that is separate from the existing --output flag used everywhere else in the CLI. This creates two different ways to control output format, which may confuse users. Consider whether --output csv could be extended to handle this case instead, or at minimum document the distinction clearly.

Other Observations

  • CSV output implementation is clean and correct
  • Good handling of SQL result pagination
  • Proper escaping of CSV values with special characters
  • Missing test for interaction between --format and --output flags when both are set

The main thing to discuss is whether this warrants a new flag or should extend the existing --output flag.

simonfaltum (Member, Author) commented

Re: --format vs --output flag concern:

We intentionally use a separate --format flag here rather than extending --output. The global --output flag is a PersistentFlag on the root command with a hard-coded Set() validator that only accepts json and text. There is no mechanism in Cobra/pflag to extend a parent's persistent flag with additional values on a per-command basis.
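
To make the constraint concrete, here is a minimal sketch of what a flag value with a hard-coded `Set()` validator can look like. The type name `outputFlag` and the error text are hypothetical and are not copied from `libs/flags/output.go`; the method set (`String`, `Set`, `Type`) is the one pflag's `Value` interface expects.

```go
package main

import "fmt"

// outputFlag is a hypothetical stand-in for the global output flag type.
type outputFlag string

// String returns the current value of the flag.
func (f *outputFlag) String() string { return string(*f) }

// Set validates the value against a fixed list. Because the list lives
// inside the type, a subcommand that inherits the flag cannot add "csv"
// without changing this method for every command.
func (f *outputFlag) Set(s string) error {
	switch s {
	case "json", "text":
		*f = outputFlag(s)
		return nil
	default:
		return fmt.Errorf("unsupported output %q: accepted values are text, json", s)
	}
}

// Type names the flag's value type for help output.
func (f *outputFlag) Type() string { return "type" }
```

Since validation happens inside the value itself, any command that inherits the persistent flag sees the same accepted set, which is why a command-local flag is the less invasive option here.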

Adding csv to the global --output would require changes to libs/flags/output.go, the cmdio render pipeline, and every command would need to handle or reject csv. That is a large, invasive change for a feature that only applies to the SQL query command.

The PR already includes a mutual-exclusion guard that rejects using both flags together with a clear error message.
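
For readers following along, a guard of this kind could be sketched as below. The function name `checkOutputFlags` and the error wording are hypothetical, not taken from the PR; in a Cobra command the two booleans would typically come from `cmd.Flags().Changed("format")` and `cmd.Flags().Changed("output")`.

```go
package main

import "errors"

// errConflictingFlags uses illustrative wording, not the PR's message.
var errConflictingFlags = errors.New("--format and --output cannot be used together")

// checkOutputFlags is a hypothetical guard: it rejects the invocation
// only when the user explicitly set both flags.
func checkOutputFlags(formatSet, outputSet bool) error {
	if formatSet && outputSet {
		return errConflictingFlags
	}
	return nil
}
```

Checking whether each flag was explicitly set, rather than comparing values, keeps the default `--output` value from tripping the guard.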

2 participants