feat(snowflake)!: Transpilation support for TRY_CAST #7355
fivetran-ashashankar wants to merge 1 commit into main
SQLGlot Integration Test Results
Overall:
main: 4004 total, 4003 passed (pass rate: 100.0%)
sqlglot:RD-1069322-transpile_try_cast: 4004 total, 4003 passed (pass rate: 100.0%)
# Common date format patterns for TRY_CAST transpilation to TRY_STRPTIME
# Maps compiled regex patterns to DuckDB strptime format strings
_DATE_FORMAT_PATTERNS = [
    (re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4}$"), "%d-%b-%Y"),  # DD-Mon-YYYY
    (re.compile(r"^[A-Za-z]{3}-\d{2}-\d{4}$"), "%b-%d-%Y"),  # Mon-DD-YYYY
    (re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$"), "%m/%d/%Y"),  # MM/DD/YYYY
    (re.compile(r"^[A-Za-z]+\s+\d{1,2},\s+\d{4}$"), "%B %d, %Y"),  # Month DD, YYYY
]
Why are we doing this? I don't think this is an established pattern. Can you please elaborate on the difficulties you encountered while trying to transpile this function?
On DuckDB: SELECT TRY_CAST('05-Mar-2016' AS DATE)
On Snowflake: TRY_CAST('05-Mar-2016' AS DATE) returns a DATE (2016-03-05)
For a DATE target, the DuckDB query becomes:
SELECT CAST(TRY_STRPTIME('05-Mar-2016', '%d-%b-%Y') AS DATE)
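To make the intended semantics of the rewritten query concrete, here is a minimal Python sketch of "parse with an explicit format, return NULL on failure". It uses Python's `datetime.strptime` as a stand-in for DuckDB's TRY_STRPTIME; `try_strptime_date` is a hypothetical helper name, not part of sqlglot or DuckDB, and the sketch assumes an English locale so that `%b` recognizes `Mar`.

```python
from datetime import date, datetime
from typing import Optional


def try_strptime_date(text: str, fmt: str) -> Optional[date]:
    """Hypothetical helper: parse `text` with `fmt`, or return None on failure.

    Mirrors the shape of CAST(TRY_STRPTIME(text, fmt) AS DATE): a failed
    parse yields None (NULL) instead of raising an error.
    """
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


print(try_strptime_date("05-Mar-2016", "%d-%b-%Y"))  # 2016-03-05
print(try_strptime_date("not a date", "%d-%b-%Y"))   # None
```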
My thought was that DuckDB's TRY_STRPTIME needs an explicit format string for parsing:
-- Different formats need different format strings
TRY_STRPTIME('05-Mar-2016', '%d-%b-%Y') -- DD-Mon-YYYY
TRY_STRPTIME('Mar-05-2016', '%b-%d-%Y') -- Mon-DD-YYYY
TRY_STRPTIME('03/05/2016', '%m/%d/%Y') -- MM/DD/YYYY
TRY_STRPTIME('March 5, 2016', '%B %d, %Y') -- Month DD, YYYY
_DATE_FORMAT_PATTERNS maps a regex for each date layout to the matching strptime format string.
The pattern matching detects which layout the string literal uses, so the generator knows which format string to pass to TRY_STRPTIME.
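A minimal sketch of the detection step described here, reusing the patterns from the diff. The function names `detect_strptime_format` and `rewrite_try_cast_to_date` are hypothetical and exist only for illustration. Falling back to a plain TRY_CAST when no pattern matches is an assumption, not necessarily what the PR does.

```python
import re
from typing import Optional

# Patterns copied from the diff under review.
_DATE_FORMAT_PATTERNS = [
    (re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4}$"), "%d-%b-%Y"),  # DD-Mon-YYYY
    (re.compile(r"^[A-Za-z]{3}-\d{2}-\d{4}$"), "%b-%d-%Y"),  # Mon-DD-YYYY
    (re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$"), "%m/%d/%Y"),  # MM/DD/YYYY
    (re.compile(r"^[A-Za-z]+\s+\d{1,2},\s+\d{4}$"), "%B %d, %Y"),  # Month DD, YYYY
]


def detect_strptime_format(literal: str) -> Optional[str]:
    """Hypothetical helper: return the first format whose regex matches."""
    for pattern, fmt in _DATE_FORMAT_PATTERNS:
        if pattern.match(literal):
            return fmt
    return None


def rewrite_try_cast_to_date(literal: str) -> str:
    """Hypothetical helper: build the DuckDB SQL text for a string literal.

    Assumption: literals with no matching pattern keep a plain TRY_CAST.
    """
    fmt = detect_strptime_format(literal)
    if fmt is None:
        return f"TRY_CAST('{literal}' AS DATE)"
    return f"CAST(TRY_STRPTIME('{literal}', '{fmt}') AS DATE)"


print(rewrite_try_cast_to_date("05-Mar-2016"))
print(rewrite_try_cast_to_date("2016-03-05"))
```

Note that this kind of matching only works when the argument is a string literal visible at transpile time; it cannot inspect values coming from a column.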
There's an existing pattern in the codebase for this, though: calling build_formatted_time. Can't it be applied in this case as well? Doing manual regex matching doesn't seem like the right approach.
I will close this for now; it's complicated and the solution isn't trivial.
Transpilation support for try_cast snowflake function