ci: Add GHA workflow executing benchmarks #422
Conversation
cargo bench --package resolc --bench compile -- --save-baseline main_resolc
cargo bench --package revive-yul --bench parse -- --save-baseline main_yul_parse
cargo bench --package revive-yul --bench lower -- --save-baseline main_yul_lower
timeout-minutes: 20
We could also remove the timeout-minutes and add it only if we notice unusually long runs. (This took about 6-7min on M4 Pro.)
I think we should narrow down the workflow trigger. Thinking about it, this will generate a lot of CI hours, and I don't think it's necessary to run this workflow on every PR. Is there a simple way to configure it to be manually triggered by the PR author?
I agree, also thought about that once I was off. I've rewritten the workflow to instead be manually triggered via the Actions UI. (See updated PR description.)
kvpanch left a comment:
Is it known whether the CI machine is suitable for this kind of perf testing? For example, can the CI machine run several tasks in parallel? If so, the perf results might not be stable, and such perf testing might need to run on a dedicated machine.
.github/workflows/benchmark.yml
Outdated
git fetch origin main
git checkout -f main
IMO, it makes sense to parameterize this by allowing a branch or tag to test against to be specified. That would help to nicely measure improvements/regressions over some release.
Obviously, that can be done later, not in this PR.
That could be a neat and valuable feature, thanks. Can add it in a follow-up 👍
Regarding your other comment and concurrency: jobs and workflows (not steps within jobs) can run concurrently by default. Which, thanks for the reminder, made me add a concurrency group so that this workflow only runs one at a time for a given PR branch (and cancels in-progress runs if it's triggered again while one is in progress).
I'd have to look up whether another CI machine is more optimal for perf testing, though.
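For illustration, something along these lines (the group name and ref expression are just a sketch, not necessarily the exact values used in the workflow):

concurrency:
  # Sketch: one benchmark run at a time per ref; a new trigger cancels an in-progress run.
  group: benchmark-${{ github.ref }}
  cancel-in-progress: true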
Update: Regarding parameterization, I've incorporated this into the rewrite of the workflow, which is now triggered manually on demand rather than automatically on each PR, where it may not be necessary.
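Roughly, the checkout can then use the dispatch input instead of the hard-coded main; a sketch, assuming an input named base (the actual input name in the workflow may differ):

- name: Checkout base ref
  # Sketch: "base" is an assumed workflow_dispatch input name.
  run: |
    git fetch origin "${{ inputs.base }}"
    git checkout -f "${{ inputs.base }}"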
xermicus left a comment:
lgtm overall
.github/workflows/benchmark.yml
Outdated
jobs:
  benchmark:
    runs-on: ubuntu-24.04
I'd like to use the same host we benchmark the Polkadot SDK with.
Seems like different ones are being used for benches in the Polkadot SDK, but two common ones are e.g. parity-large and parity-benchmark (both enabled in our revive repo). The differences can be seen here.
Update: Opted for parity-large in our case.
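For context, that boils down to changing the runs-on label; a sketch, assuming parity-large is exposed as a plain runner label in this repo (see the ci_cd wiki linked in the description):

jobs:
  benchmark:
    # Assumed label for the Parity self-hosted runner.
    runs-on: parity-large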
## Parse Yul

<details>
<summary>Results</summary>
Hiding the results is definitely what we want here, so the comment doesn't get too noisy. Can we also have a summary or something at the top that quickly tells PR authors whether the benchmarks regressed, improved, or showed no significant change?
I will make a note of this for a follow-up 👍 It will likely require some parsing of critcmp's output, but it would definitely improve readability.
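A rough sketch of what such a follow-up could build on, assuming baselines like main_resolc and pr_resolc were saved via --save-baseline as in the diff above (baseline names and the step itself are illustrative, not part of this PR):

- name: Compare baselines
  # Sketch: baseline names are assumptions; critcmp prints a side-by-side comparison
  # that a later step could parse into a regressed/improved/no-change summary.
  run: |
    cargo install critcmp
    critcmp main_resolc pr_resolc | tee compile-results.txt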
Rather than having the long-running workflow be automatically triggered on each PR and PR update, we now manually trigger the workflow via the Actions UI. This also allows the user to compare benchmarks between any branches, tags, or SHAs by providing these values as input.
It seems that parity-large and parity-benchmark are among the runners used in the Polkadot SDK. parity-large is chosen here based on the differences described at: https://github.com/paritytech/ci_cd/wiki/GitHub#paritytech-self-hosted-runners
I've slightly rewritten the workflow to instead be manually triggered, rather than running this on each PR and each of its updates. We can now also compare any branch name, SHA, or tag. (See updated PR description.)
Description
Adds a workflow executing the benchmarks in resolc and revive-yul.

The workflow takes two required inputs: the base ref to compare against (e.g. main) and the ref to benchmark (e.g. lj/ci-benchmarks).

This PR originally implemented the workflow to automatically run on each PR once its tests passed successfully. As we likely do not need to run this workflow for each PR, the workflow has been rewritten to be triggered manually.
Providing user input also allows for comparisons between any commits, such as releases.
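As a rough illustration of the dispatch trigger and its inputs (the input names and descriptions below are assumptions, not necessarily the exact ones in the workflow):

on:
  workflow_dispatch:
    inputs:
      base:
        description: Branch, tag, or SHA to compare against (e.g. main)
        required: true
      compare:
        description: Branch, tag, or SHA to benchmark (e.g. lj/ci-benchmarks)
        required: true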
Notes
File must exist on the default branch
GitHub triggers a workflow via the workflow_dispatch event only if the workflow file exists on the default branch. Therefore, the workflow cannot be run for this PR. (See the section below on how it has still been tested.)

Testing the Workflow
I've tested a similar enough workflow in a personal repo, in order to push it to main and trigger it. Will verify again once this PR is merged.

Example PR Comment
Expand to see an example PR comment
Benchmarks
main vs. lj/ci-benchmarks

Compile E2E
Results
Parse Yul
Results
Lower Yul
Results
Resolved Issues
Closes #405