-
Notifications
You must be signed in to change notification settings - Fork 1.9k
CI: build and run sqllogictests binary directly in extended workflow #20282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Set incremental to false for release-nonlto profile - Upgrade optimization level to 3 - Retain debug information for flamegraphs by setting strip to false
fe8af19 to
c5da7ca
Compare
990cf0e to
c5da7ca
Compare
c5da7ca to
8d9bcb8
Compare
This reverts commit 8d9bcb8.
d6fa2ec to
cd967ba
Compare
.github/workflows/extended.yml
Outdated
| rust-version: stable | ||
| - name: Build sqllogictest binary | ||
| run: | | ||
| TEST_BIN=$(cargo test --features backtrace,parquet_encryption --profile release-nonlto --test sqllogictests --no-run --message-format=json | sed -n 's/.*"executable":"\([^"]*\)".*/\1/p' | head -n 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than cargo test another idea would be to use cargo build 🤔
Also, what is the reason to use --message-format=json?
Finally, what is the head command for? I didn't see any obvious reason in
https://github.com/apache/datafusion/actions/runs/21890300850/job/63194519053
Perhaps you could add a comment explaining what those commands are for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good suggestions. I agree cargo build is clearer here since this step only compiles the sqllogictests target and does not run it.
I used --message-format=json to capture the exact emitted test binary path because Cargo places test executables under target/.../deps with a hash suffix. The head -n 1 was a simple way to select the first extracted executable path from the stream.
I’ll simplify and document this more clearly in the workflow so the intent is explicit.
.github/workflows/extended.yml
Outdated
| - name: Run sqllogictest | ||
| run: | | ||
| cargo test --features backtrace,parquet_encryption --profile release-nonlto --test sqllogictests -- --include-sqlite | ||
| ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the ( and )?
Also, why does it need to do cd when the current version doesn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cd is needed because sqllogictests resolves test data via relative paths (for example test_files/ and ../../datafusion-testing/data/ in datafusion/sqllogictest/bin/sqllogictests.rs), so running from repo root can point it at the wrong locations. I added it after an earlier version without cd errored with test_files/ not found.
The subshell (...) was only used to scope cd so it would not affect the subsequent cargo clean. It is not strictly necessary.
I’ll remove the subshell and switch to a cleaner step-level working-directory: datafusion/sqllogictest for the test execution step.
|
thanks for looking into this @kosiew |
014a28d to
882251a
Compare
Which issue does this PR close?
Rationale for this change
Extended CI runs have been taking 2+ hours, with
sqllogictestsidentified as a major contributor to the regression (thanks to @nuno-faria for identifying the contributor). This PR adjusts the extended workflow to separate building thesqllogicteststest binary from executing it, allowing the run step to execute the already-built test executable directly.This helps reduce workflow overhead from re-invoking
cargo testfor execution (and makes it easier to distinguish compilation time from test runtime when investigating performance regressions).What changes are included in this PR?
In the
sqllogictestsjob:sqllogicteststest binary once usingcargo test buildand capture the produced executable path.datafusion/sqllogictestwith--include-sqlite.Are these changes tested?
Yes.
Before fix
source: without current PR
After fix
source: current PR
(These changes are CI-only; functional behavior of DataFusion is unchanged.)
Are there any user-facing changes?
No. This PR only updates CI workflow behavior.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.