[SPARK-56489][SQL] Add CURRENT_PATH() builtin expression and keywords #55354
srielau wants to merge 2 commits into apache:master
…ated grammar keywords Add the CURRENT_PATH() builtin function that returns the current SQL resolution search path as a comma-separated string of qualified schema names (e.g. "system.builtin,system.session,spark_catalog.default"). Also register the grammar keywords needed by the upcoming SQL PATH feature (CURRENT_PATH, CURRENT_SCHEMA, CURRENT_DATABASE, DEFAULT_PATH, SYSTEM_PATH, PATH). CURRENT_PATH and CURRENT_SCHEMA are reserved in ANSI mode per SQL:2023; the others are non-reserved. In non-ANSI mode, CURRENT_PATH, CURRENT_DATABASE, and CURRENT_SCHEMA always resolve to their respective expressions (not UnresolvedAttribute), matching the behavior of CURRENT_CATALOG. This is part 1 of the SQL PATH feature (SPARK-54810), split out to keep the review scope manageable.
4f162ca to
1e1b838
dtenedor
left a comment
Thanks for working on this!
|DEFAULT|non-reserved|non-reserved|non-reserved|
|DEFINED|non-reserved|non-reserved|non-reserved|
|DEFINER|non-reserved|non-reserved|non-reserved|
|DEFAULT_PATH|non-reserved|non-reserved|not a keyword|
From an initial look at the PR, some of the new keywords including this one are defined in the parser as keywords but not used elsewhere yet. Is this intended? I see CURRENT_PATH used in here so far.
Also, just curious why we are implementing CURRENT_PATH() using built-in keywords instead of regular SQL functions in the FunctionRegistry?
The parens are optional. It's in the class of "currentLike". And yes, these keywords will become relevant when I add the SET PATH statement, which will follow next.
Yes, these keywords (DEFAULT_PATH, PATH, SYSTEM_PATH) are forward-declared for the upcoming SET PATH statement (next PR in the series). CURRENT_PATH is used in this PR. As for why CURRENT_PATH uses the currentLike keyword pattern instead of regular FunctionRegistry: it follows SQL:2023 where CURRENT_PATH is a special value like CURRENT_USER — parentheses are optional and it is reserved in ANSI mode.
Makes sense, thanks for explaining!
/spark-dev:review
cloud-fan
left a comment
Summary
Prior state and problem. Spark's currentLike grammar rule handles SQL expressions that can appear without parentheses: CURRENT_DATE, CURRENT_TIMESTAMP, CURRENT_TIME, CURRENT_USER, USER, SESSION_USER. Other similar functions — current_database(), current_catalog() — require parentheses because they're resolved through FunctionRegistry as regular function calls, not as parser keywords. SQL:2023 defines CURRENT_PATH as a standard function returning the resolution search path, which Spark doesn't yet support.
Design approach. The PR extends the existing currentLike pattern: adds CurrentPath as an Unevaluable leaf expression (matching CurrentCatalog, CurrentDatabase), registers it in FunctionRegistry, adds it to the currentLike grammar rule, and replaces it with a literal string in the ReplaceCurrentLike optimizer rule. It also promotes CURRENT_DATABASE and CURRENT_SCHEMA from function-only identifiers to parser keywords in the currentLike rule, and adds forward-declared keywords (DEFAULT_PATH, PATH, SYSTEM_PATH) for the upcoming SET PATH command.
Key design decisions.
- In non-ANSI mode, CURRENT_PATH, CURRENT_DATABASE, and CURRENT_SCHEMA always resolve to their builtin expressions, while CURRENT_DATE/CURRENT_TIMESTAMP/CURRENT_TIME still produce UnresolvedAttribute. This asymmetry means CURRENT_DATABASE and CURRENT_SCHEMA can no longer be used as bare column references in non-ANSI mode, which is a breaking behavioral change.
- CURRENT_PATH and CURRENT_SCHEMA are ANSI-reserved per SQL:2023; CURRENT_DATABASE is non-reserved in all modes.
- The path string is formatted as system.builtin,system.session,spark_catalog.default (unquoted dot-separated, comma-delimited) with ordering determined by sessionFunctionResolutionOrder.
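The asymmetry in the first bullet can be illustrated with a small Python model. This is only a sketch of the behavior described above, not Spark's AstBuilder code: the names `visit_current_like`, `BuiltinExpression`, and `ALWAYS_BUILTIN` are hypothetical, and a single `ansi_mode` flag stands in for whatever configuration the real parser consults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuiltinExpression:
    """Stand-in for a resolved builtin such as CurrentPath (hypothetical)."""
    name: str


@dataclass(frozen=True)
class UnresolvedAttribute:
    """Stand-in for a name that may later bind to a column (hypothetical)."""
    name: str


# Keywords that, per the review summary, always resolve to a builtin
# expression in this revision of the PR, even outside ANSI mode.
ALWAYS_BUILTIN = {"CURRENT_PATH", "CURRENT_DATABASE", "CURRENT_SCHEMA"}


def visit_current_like(keyword: str, ansi_mode: bool):
    """Model how a bare (no parentheses) currentLike keyword is parsed."""
    keyword = keyword.upper()
    if ansi_mode or keyword in ALWAYS_BUILTIN:
        return BuiltinExpression(keyword)
    # Non-ANSI mode: keep the name unresolved so that a column that happens
    # to be called CURRENT_DATE (for example) can still be referenced.
    return UnresolvedAttribute(keyword)
```

Under this model, a table with a column named current_schema would lose access to that column through the bare name in non-ANSI mode, which is the breaking change the bullet points out.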
Implementation sketch. Four components are changed: (1) CurrentPath expression class (misc.scala) — the value node; (2) FunctionRegistry — enables current_path() with parens; (3) SqlBaseParser.g4 + AstBuilder.visitCurrentLike — enables CURRENT_PATH without parens plus CURRENT_DATABASE/CURRENT_SCHEMA; (4) ReplaceCurrentLike (finishAnalysis.scala) — replaces the expression with a string literal using SQLConf.resolutionSearchPath. All other changes are golden file / test assertions for the new keywords.
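Component (4) is a tree rewrite: a placeholder leaf that carries no value is swapped for a string literal before execution. The following Python sketch models that idea with a toy expression tree. The classes `Literal`, `CurrentPath`, and `Concat` and the function `replace_current_like` are hypothetical illustrations, not the Catalyst classes, and the search path is passed in as a plain argument rather than read from a configuration object.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class CurrentPath:
    """Placeholder leaf: it holds no value and is never evaluated directly."""


@dataclass(frozen=True)
class Concat:
    children: Tuple["Expr", ...]


Expr = Union[Literal, CurrentPath, Concat]


def replace_current_like(expr: Expr, search_path: str) -> Expr:
    """Return a copy of the tree with every CurrentPath leaf made literal."""
    if isinstance(expr, CurrentPath):
        return Literal(search_path)
    if isinstance(expr, Concat):
        # Rewrite children recursively; other node types pass through as-is.
        rewritten = tuple(replace_current_like(c, search_path) for c in expr.children)
        return Concat(rewritten)
    return expr
```

Because the placeholder is gone after the rewrite, later stages only ever see a constant string, which is why the expression itself does not need an evaluation method.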
Existing review context. @dtenedor asked why the PR defines keywords that aren't used in grammar rules yet, and why CURRENT_PATH() uses keywords instead of regular function resolution. @srielau replied that parens are optional (it's currentLike-class) and the unused keywords will become relevant with the upcoming SET PATH statement.
Missing DataFrame API function. current_catalog(), current_database(), current_schema(), and current_user() all have DataFrame API entries in functions.scala. current_path() does not, so users of the DataFrame/Dataset API cannot call it programmatically. Is this planned for a follow-up?
Thanks for the thorough review @cloud-fan! Addressed all comments in the upcoming push:
- Remove CURRENT_DATABASE/CURRENT_SCHEMA from currentLike grammar rule to avoid breaking change in non-ANSI mode (they remain available via FunctionRegistry with parentheses)
- Use .quoted for proper identifier quoting in CURRENT_PATH() output
- Fix CurrentPath doc comment: remove forward reference to PATH feature
- Add current_path() to DataFrame API (functions.scala)
- Expand test coverage: without-parens ANSI syntax, USE DATABASE context
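To show why quoting matters for the path string, here is a minimal Python sketch of quote-if-needed formatting. The rule used (leave a part bare when it consists only of letters, digits, and underscores and is not all digits; otherwise wrap it in backticks and double any embedded backtick) is an assumption about how such quoting typically works, not a statement of what `.quoted` does in Spark. The names `quote_if_needed` and `format_path` are hypothetical.

```python
import re
from typing import Iterable, Sequence

# Assumed rule: parts made only of these characters need no quoting.
_PLAIN = re.compile(r"[A-Za-z0-9_]+")


def quote_if_needed(part: str) -> str:
    """Backtick-quote one name part unless it is a plain identifier."""
    if _PLAIN.fullmatch(part) and not part.isdigit():
        return part
    return "`" + part.replace("`", "``") + "`"


def format_path(entries: Iterable[Sequence[str]]) -> str:
    """Join multi-part schema names with dots, and entries with commas."""
    return ",".join(
        ".".join(quote_if_needed(p) for p in entry) for entry in entries
    )


print(format_path([("system", "builtin"), ("spark_catalog", "my.db")]))
# prints: system.builtin,spark_catalog.`my.db`
```

Without quoting, a schema named my.db would render as spark_catalog.my.db and be indistinguishable from a three-part name, so the unquoted format could not be parsed back reliably.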
dtenedor
left a comment
Thanks for also adding the DataFrame API support!
LGTM, merging to master.
@srielau Update: It looks like this test is now broken in CI: We should fix it ASAP or revert this PR to unblock CI.
The CI for this PR failed https://github.com/srielau/spark/actions/runs/24513039822/job/71706695434 and it's pretty obvious that it's related. We should be extra careful when we merge PRs that have a failed CI.

What changes were proposed in this pull request?
Add the CURRENT_PATH() builtin function that returns the current SQL resolution search path as a comma-separated string of qualified schema names (e.g. system.builtin,system.session,spark_catalog.default).
Also register the grammar keywords needed by the upcoming SQL PATH feature: CURRENT_PATH, CURRENT_SCHEMA, CURRENT_DATABASE, DEFAULT_PATH, SYSTEM_PATH, PATH. CURRENT_PATH and CURRENT_SCHEMA are reserved in ANSI mode per SQL:2023; the others are non-reserved.
In non-ANSI mode, CURRENT_PATH, CURRENT_DATABASE, and CURRENT_SCHEMA always resolve to their respective expressions (not UnresolvedAttribute), matching the behavior of CURRENT_CATALOG.
This is part 1 of the SQL PATH feature (SPARK-54810), split out to keep the review scope manageable.
Why are the changes needed?
CURRENT_PATH() is a SQL-standard function (SQL:2023) that exposes the resolution search path to users. The grammar keywords are prerequisites for the SET PATH command and path-based resolution coming in follow-up PRs.
Does this PR introduce any user-facing change?
Yes. New builtin function CURRENT_PATH() and new reserved/non-reserved keywords.
How was this patch tested?
- FunctionQualificationSuite verifying current_path() returns a non-empty qualified path string.
- Golden files (keywords.sql.out, keywords-enforced.sql.out, nonansi/keywords.sql.out).
- sql-expression-schema.md and SparkConnectDatabaseMetaDataSuite keyword assertions.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.6