Support shallow clones with Git #772
Conversation
e4b6b56 to bc5cc0f
7cba9a9 to d58fa3c
edeb769 to 5387210
d609819 to 1848a5e
50f6468 to 3a746a0
dde7615 to 3f72a12
src/taskgraph/run-task/run-task
Outdated
    # If we have a shallow clone and specific commit, we need to fetch it too.
    if shallow and head_rev and head_rev != head_ref:
        git_fetch(
nit: ideally we'd call git fetch just once against the head repo, i.e. combine this with the head_ref fetch
    if not targets or shallow:
        # If head_ref wasn't provided, we fallback to head_rev. If we have a
        # shallow clone, head_rev needs to be fetched independently regardless.
        targets.append(head_rev)
Should we assert somewhere that if shallow is True then we have a head_rev?
Huh, good point. I guess a head_rev isn't necessary for shallow clones either though.. I'll fix this up.
…head_rev

This makes the naming consistent with what we use in .taskcluster.yml and the rest of Taskgraph. Previously, I always had to look up where "ref" and "commit" / "revision" were coming from to double check they were the values I was expecting. This rename makes that much more obvious.
If the condition in the if statement is true, then we've already fetched ref from head_repo. There's no need to do so again.
BREAKING CHANGE: `base_ref` will no longer be fetched or checked out by run-task

Taskgraph uses base_rev anyway for computing files changed, so there's no need to additionally fetch base_ref. Some tasks may need to be updated to not rely on base_ref being present in the local clone.
BREAKING CHANGE: omitting `head_ref` no longer fetches all heads

Previously we were fetching all heads in this case so that we could then run `git checkout <head_rev>` successfully. But it's much faster to just explicitly fetch `<head_rev>` in the first place. This also refactors `git_fetch` to be able to fetch multiple targets at once.
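To illustrate the multi-target fetch described above, here is a minimal sketch. It is not the code from this PR: `select_targets` and `build_fetch_command` are hypothetical helpers, and using `--depth=1` for the shallow case is an assumption. Fetching a bare revision also assumes the remote allows requesting commits by SHA.

```python
from typing import List, Optional


def select_targets(
    head_ref: Optional[str], head_rev: Optional[str], shallow: bool
) -> List[str]:
    """Pick what to fetch from the head repo (hypothetical helper)."""
    targets: List[str] = []
    if head_ref:
        targets.append(head_ref)
    # Fall back to head_rev when no ref was given. In a shallow clone the
    # revision is fetched explicitly as well, since the ref tip may differ.
    if head_rev and (not targets or shallow):
        targets.append(head_rev)
    return targets


def build_fetch_command(
    remote: str, targets: List[str], shallow: bool = False
) -> List[str]:
    """Build a single `git fetch` invocation that covers every target."""
    if not targets:
        raise ValueError("at least one ref or revision is required")
    cmd = ["git", "fetch"]
    if shallow:
        cmd.append("--depth=1")  # assumption: depth 1 for shallow clones
    cmd.append(remote)
    cmd.extend(targets)  # all targets go into one network round trip
    return cmd
```

Returning the argument list rather than running it keeps the sketch easy to check; a caller could pass the result to `subprocess.run(cmd, check=True)`.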
This fixes the case where head_ref is passed in with a `refs/heads` prefix.
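The commit does not show how the prefix is handled, so the following is only one possible approach, with a hypothetical helper name: normalize the ref to its short branch name before using it in places that expect a bare branch.

```python
def short_branch_name(ref: str) -> str:
    """Strip a leading `refs/heads/` so `refs/heads/main` and `main` match.

    Hypothetical helper; other ref namespaces (e.g. tags) are left untouched.
    """
    prefix = "refs/heads/"
    if ref.startswith(prefix):
        return ref[len(prefix):]
    return ref
```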
Shallow clones yield a massive improvement to clone performance, at the expense of making it tricky to determine the files that were modified.
`git log BASE..HEAD` says: show me commits reachable from HEAD, but not reachable from BASE. In a shallow clone where we only fetch BASE and HEAD (which is what run-task does), this means the command will only return `HEAD`. In other words, we're only returning files changed by the tip commit of the push and ignoring everything else.

By switching to `git diff BASE HEAD`, we're instead comparing the snapshots of both revisions. Sometimes this is what we want, e.g. for force pushes, it'll be the interdiff of files modified between the two pushes (though some developers might expect it to contain the files modified since the merge base). Sometimes it's not what we want, e.g. for PRs, it'll be the files changed between the PR and the latest commit on `main`.

Either way, this behaviour is at least somewhat more accurate than `git log` when we don't have full history. Likely we'll need to fetch the proper changed files using the GitHub API in the future, but for now this is better than nothing.
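The two approaches compared above can be sketched as follows. The function names are hypothetical and the exact flags Taskgraph passes may differ; this only shows the shape of each command.

```python
import subprocess
from typing import List


def log_files_command(base: str, head: str) -> List[str]:
    """History-based: files touched by commits in BASE..HEAD.

    Needs the commits between BASE and HEAD to exist locally, so it
    under-reports in a shallow clone.
    """
    return ["git", "log", "--name-only", "--format=", f"{base}..{head}"]


def diff_files_command(base: str, head: str) -> List[str]:
    """Snapshot-based: files that differ between the BASE and HEAD trees.

    Only needs the two commits themselves, so it works in a shallow clone.
    """
    return ["git", "diff", "--name-only", base, head]


def changed_files(base: str, head: str, cwd: str = ".") -> List[str]:
    """Run the snapshot comparison and return a sorted, de-duplicated list."""
    result = subprocess.run(
        diff_files_command(base, head),
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return sorted({line for line in result.stdout.splitlines() if line})
```

The trade-off is the one the commit message names: the snapshot comparison also picks up changes that landed on the base branch, which may not be what a PR author expects.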
No description provided.