
[SIG-94234] fix tokenizing <- #45

Closed
ayman-sigma wants to merge 1 commit into main from ayman/lt-negative
@ayman-sigma ayman-sigma commented Mar 16, 2026

In dialects that support geometric types (e.g. Redshift), the tokenizer handled < followed by - too eagerly and consumed the wrong operator, for example when tokenizing something like <-4000. I don't think there is a <- operator, so the correct tokenization there should be LT MINUS 4000.
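To illustrate the intended behavior, here is a minimal, self-contained sketch of the lookahead idea. It is not the sqlparser-rs tokenizer: the `Token` enum and the `tokenize` function are hypothetical names made up for this example, and the only point is that `<->` is recognized when all three characters are present, while `<-4000` falls back to LT MINUS 4000.

```rust
// Hypothetical toy tokenizer, for illustration only (not the sqlparser-rs API).
#[derive(Debug, PartialEq, Clone)]
enum Token {
    Lt,             // `<`
    LtEq,           // `<=`
    LtDashGt,       // `<->` (geometric distance operator)
    Minus,          // `-`
    Gt,             // `>`
    Number(String), // run of ASCII digits
    Word(String),   // identifier-like run
}

fn tokenize(input: &str) -> Vec<Token> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        match c {
            c if c.is_whitespace() => i += 1,
            '<' => {
                // Commit to `<->` only when the whole operator is present.
                // A bare `<-` is not an operator, so it becomes `<` then `-`.
                if chars.get(i + 1) == Some(&'-') && chars.get(i + 2) == Some(&'>') {
                    tokens.push(Token::LtDashGt);
                    i += 3;
                } else if chars.get(i + 1) == Some(&'=') {
                    tokens.push(Token::LtEq);
                    i += 2;
                } else {
                    tokens.push(Token::Lt);
                    i += 1;
                }
            }
            '-' => {
                tokens.push(Token::Minus);
                i += 1;
            }
            '>' => {
                tokens.push(Token::Gt);
                i += 1;
            }
            c if c.is_ascii_digit() => {
                let start = i;
                while i < chars.len() && chars[i].is_ascii_digit() {
                    i += 1;
                }
                tokens.push(Token::Number(chars[start..i].iter().collect()));
            }
            c if c.is_alphabetic() || c == '_' => {
                let start = i;
                while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                    i += 1;
                }
                tokens.push(Token::Word(chars[start..i].iter().collect()));
            }
            // Any other character is skipped in this sketch.
            _ => i += 1,
        }
    }
    tokens
}
```

With this sketch, `tokenize("a <-4000")` yields a word, `Lt`, `Minus`, and a number, while `tokenize("a <-> b")` still yields the single `LtDashGt` token between the two words.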

Should I create this fix in upstream first?

@ayman-sigma ayman-sigma requested a review from jmhain March 16, 2026 19:36
Comment on lines +4177 to +4178
let tokens = Tokenizer::new(&dialect, "SELECT a <-> b")
.tokenize()
Collaborator

Can you use all_dialects_where(|d| d.supports_geometric_types()) here, and (I assume) all_dialects() for the case above?

Author

Aargh, PostgreSQL still seems to have issues with the <-4000 and <=-4000 cases. It seems broken in general. I will see if there is a better solution.

ayman-sigma commented Mar 17, 2026

Opened an external PR, as this is a general bug upstream: https://github.com/apache/datafusion-sqlparser-rs/pull/2280
