-
Notifications
You must be signed in to change notification settings - Fork 56
chore: add xtask to download regression tests #482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👷 Deploy request for squawkhq pending review.Visit the deploys page to approve it
|
|
Ah I really like this idea!! Checking against postgres is a solid plan I was also thinking in the future, a similar thing where we check against some open source projects to make sure we're parsing things properly! I wonder if there's a way to make it not save the 1.7 million lines in the repo? Maybe it doesn't matter though -- if postgres is doing it it's probably fine? |
|
Yeah putting it in the repo is good, easier that way I think doing it in two PRs would be easier to review, one for the rust stuff to download and run the tests and then one that actually saves all the SQL & snapshots! |
|
Are |
It's supported in psql and used oftentimes when executing sql files. Reference here: https://www.postgresql.org/docs/current/app-psql.html |
Yeah will clean it up! Was thinking maybe we don't want to store snapshots of successful attempts for now? This would reduce the loc count significantly. The failing ones should also not fail the test (for now). I would use their output as a reference when fixing compatibility issues. wdyt? I believe the snapshots tests are not of much value since we are only checking whether the parser can parse valid sql and not whether the output is actually correct? |
|
I think we should support template parameters in different variations too: supabase-community/postgres-language-server#396 |
|
Yeah skipping snapshots for the postgres' SQL makes sense, we can always add to the existing test suite if necessary! And yeah we don't want to fail CI with the current incompatibility against the postgres test suite covering those params sounds good, in the future, I want to support parsing SQL inside string literals in other languages, like python, so we'd need to support things like:
We might not even need to update the lexer for all of them, can do it piecemeal |
|
Sounds good! Will open another pr with just the script then. If you add support for one of these params, I can also take over the rest. Would then also go through the remaining failing tests and fix what needs to be fixed. |
| .collect(); | ||
|
|
||
| if urls.is_empty() { | ||
| return Err(anyhow::anyhow!("No SQL files found")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return Err(anyhow::anyhow!("No SQL files found")); | |
| bail!("No SQL files found"); |
| return Err(anyhow::anyhow!( | ||
| "Failed to fetch SQL files: {}", | ||
| String::from_utf8_lossy(&output.stderr) | ||
| )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return Err(anyhow::anyhow!( | |
| "Failed to fetch SQL files: {}", | |
| String::from_utf8_lossy(&output.stderr) | |
| )); | |
| bail!( | |
| "Failed to fetch SQL files: {}", | |
| String::from_utf8_lossy(&output.stderr) | |
| ); |
| return Err(anyhow::anyhow!( | ||
| "Failed to download '{}': {}", | ||
| url, | ||
| error_msg | ||
| )); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return Err(anyhow::anyhow!( | |
| "Failed to download '{}': {}", | |
| url, | |
| error_msg | |
| )); | |
| bail!( | |
| "Failed to download '{}': {}", | |
| url, | |
| error_msg | |
| ); |
| use anyhow::Result; | ||
| use std::fs::{create_dir_all, remove_dir_all, File}; | ||
| use std::io::{BufRead, Cursor, Write}; | ||
| use std::path::PathBuf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you use use camino::Utf8PathBuf; instead of PathBuf you'll avoid some of the utf8 nonsense but doesn't matter that much
| for line in source.lines() { | ||
| let mut line = line?; | ||
|
|
||
| // Detect the start of the COPY block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does a copy block look like? curious what it has that we don't support parsing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the regression test is executed via psql, and they use COPY to load data into test tables FROM STDIN
COPY bitwise_test FROM STDIN NULL 'null';
1 1 1 1 1 B0101
3 3 3 null 2 B0100
7 7 7 3 4 B1100
\.|
Looks good, left some minor style nits! |
|
Hey @sbdchd, I am not able to merge or add the automerge label, but this is ready |
sbdchd
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank you!!

EDIT: Removed the tests - this PR is just the
xtasknow :) can't tag you as a reviewer for some reason, but this is ready @sbdchdadds the postgress regression test suite to stress-test the parser:
xtaskthat downloads and preprocesses the sql files from the postgres reposquawk_parsercrate that runs all of them through the parserIn the preprocessing step, we remove everything that we do not want to support (yet), mainly plpgsql commands such as
\gset.Right now, all test fail. I peeked through some of the tests and it's mainly due to
:namevars. I think it makes sense to lex them asIDENTtokens? Happy to add support for this.Overall I think this test suite can help stabilise the parser. My hope is that it becomes stable enough to replace
libpg_queryin the postgres language server.Also: not sure where to put the tests since currently they all are part of the crate itself. For this kind of tests, I usually integrate them as integration tests. Let me know what you prefer! :)
also happy to not add them at all, or maybe not do snapshot testing, or maybe download the files and run the tests on-demand to not pollute the codebase! maybe a separate xtask that can be executed on-demand in CI / locally makes sense too - like https://areweturboyet.com