fix: Scrub GIT_* env vars in hooks that shell to git clone#993
fix: Scrub GIT_* env vars in hooks that shell to git clone#993yuriipolishchuk wants to merge 3 commits into
Conversation
Hooks that invoke `tofu init` / `terraform init` (terraform_validate,
terraform_tflint, terraform_docs) end up running `git clone <module>`
under the hood for each registry/git-source module. When the
operator's `git commit` runs from a linked worktree, the parent git
sets GIT_INDEX_FILE (pointing at `.git/worktrees/<name>/index`) in the
hook subprocess env. The child `git clone` inherits this and writes the
cloned module's blob OIDs into the parent worktree's index. The next
`git commit` then fails at tree-build:
error: invalid object 100644 <oid> for '<path>'
error: Error building trees
Reproduction:
git init && git commit --allow-empty -m init
mkdir stack
cat > stack/main.tf <<'TF'
module "test" {
source = "terraform-aws-modules/iam/aws"
version = "5.x"
}
TF
cat > .pre-commit-config.yaml <<'YAML'
repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
rev: v1.106.0
hooks:
- id: terraform_validate
args:
- --hook-config=--retry-once-with-cleanup=true
- --tf-init-args=-backend=false
YAML
git add -A && git commit -m baseline
git worktree add ../wt && cd ../wt
pre-commit install
echo "" >> stack/main.tf
git add stack/main.tf
git commit -m test # fails with 'Error building trees'
git fsck --no-progress # invalid sha1 pointer in cache-tree
Other GIT_* vars (GIT_DIR, GIT_WORK_TREE, GIT_OBJECT_DIRECTORY) are
overridden by `git clone`'s own target-dir setup; only GIT_INDEX_FILE
slips through. Scrubbing all four matches the four documented offenders
in pre-commit framework's own `no_git_env` helper
(`pre_commit/git.py:20-38`), which explicitly states:
# GIT_DIR: Causes git clone to clone wrong thing
# GIT_INDEX_FILE: Causes 'error invalid object ...' during commit
pre-commit framework deliberately does NOT scrub GIT_* from user hook
subprocesses (only its own internal git calls) per the maintainer in
pre-commit/pre-commit#1849: "they need the same code as in our
no_git_env helper if they are dealing with doing git writes". This
patch applies that recommendation in pre-commit-terraform's hooks
that perform git writes via `tofu init` / `terraform init`.
Verified by reproducing the failure on v1.106.0 from a worktree with
a module-using stack, then confirming the fix removes both the index
pollution and the tree-build error. SSH/TLS/config-related GIT_* env
vars are preserved (only the four documented "dangerous" ones are
unset) so SSH-source modules authenticate normally.
Net change: one new helper `common::scrub_git_env` in _common.sh,
called once from `main()` of each affected hook. Zero behavior change
for non-worktree users — their GIT_INDEX_FILE is empty for these hooks
so the unset is a no-op.
Refs: antonbabenko#992
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a ChangesGit environment scrubbing across hooks
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@hooks/_common.sh`:
- Around line 119-149: The comment above function common::scrub_git_env is
misleading because the function only unsets a targeted subset of GIT_* vars
(GIT_INDEX_FILE, GIT_DIR, GIT_WORK_TREE, GIT_OBJECT_DIRECTORY) rather than
reproducing pre-commit's no_git_env behavior; update the docblock to explicitly
state that this function intentionally clears only those specific offending
variables (and why each is included) and clarify that pre-commit's no_git_env
removes all GIT_* except a specific allowlist (e.g., GIT_CONFIG_KEY_*,
GIT_CONFIG_VALUE_*, GIT_EXEC_PATH, GIT_SSH, GIT_SSH_COMMAND, GIT_SSL_CAINFO,
GIT_SSL_NO_VERIFY, GIT_CONFIG_COUNT, GIT_HTTP_PROXY_AUTHMETHOD,
GIT_ALLOW_PROTOCOL, GIT_ASKPASS), so the comment does not claim parity with
no_git_env and instead documents the targeted subset behavior of
common::scrub_git_env.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ee9f744b-e72b-42de-9342-5072872f64a8
📒 Files selected for processing (4)
hooks/_common.shhooks/terraform_docs.shhooks/terraform_tflint.shhooks/terraform_validate.sh
Address CodeRabbit feedback on PR antonbabenko#993: the previous docstring read as though common::scrub_git_env mirrored pre-commit's no_git_env helper. It does not. no_git_env is allowlist-based: it scrubs all GIT_* except a whitelist of ~11 known-safe vars (GIT_EXEC_PATH, GIT_SSH, GIT_SSH_COMMAND, GIT_ASKPASS, GIT_SSL_*, GIT_CONFIG_KEY_*, GIT_CONFIG_VALUE_*, GIT_CONFIG_COUNT, GIT_HTTP_PROXY_AUTHMETHOD, GIT_ALLOW_PROTOCOL) and runs only on pre-commit's internal git calls. common::scrub_git_env is denylist-based: it unsets only four specific vars that pre_commit/git.py:20-38 documents as dangerous when leaked into child git (GIT_INDEX_FILE, GIT_DIR, GIT_WORK_TREE, GIT_OBJECT_DIRECTORY), and runs in user hook subprocesses. New docstring spells this out, explains why each of the four is included, and notes why we chose a denylist over an allowlist mirror (bash portability, observed bug fix scope, lower blast radius for hook authors with custom registry/proxy env vars).
| # rely on inheritance of vars an aggressive allowlist might drop. | ||
| ####################################################################### | ||
| function common::scrub_git_env { | ||
| unset GIT_INDEX_FILE GIT_DIR GIT_WORK_TREE GIT_OBJECT_DIRECTORY |
There was a problem hiding this comment.
- Please put vars in alphabetical order.
- Might it worth of handling
unsetfailure? E.g. if by any reason the var was declared readonly outside of the hook? 🤔 - I'd also suggest to unset Git vars only when the hook is run from inside the Git worktree so that we don't mess up non-worktree envs.
- Here's how I identify this locally:
local _git_repo_git_file="$(git rev-parse --show-cdup).git" [[ -f $_git_repo_git_file ]] && $(grep -qE ^gitdir: "$_git_repo_git_file") && local _is_git_worktree="true"
- Would we want to let user know what's happening as it's sort of intrusion to user-controlled area? 🤔 Maybe not unconditionally, but e.g. when
PCT_LOGis set to e.g.debug… @MaxymVlasov WDYT - Would it make sense to apply
common::scrub_git_env()to all hooks e.g. by making it part ofcommon::initialize()(or being called fromcommon::initialize())? 🤔 This way we would not need to care of, for example, adding it to any new hooks (if any). @MaxymVlasov WDYT?
| common::parse_cmdline "$@" | ||
| common::export_provided_env_vars "${ENV_VARS[@]}" | ||
| common::parse_and_export_env_vars | ||
| common::scrub_git_env |
There was a problem hiding this comment.
I can't reproduce the worktree bug in this hook.
| common::scrub_git_env |
| common::parse_cmdline "$@" | ||
| common::export_provided_env_vars "${ENV_VARS[@]}" | ||
| common::parse_and_export_env_vars | ||
| common::scrub_git_env |
There was a problem hiding this comment.
I can't reproduce the worktree bug in this hook.
| common::scrub_git_env |
There was a problem hiding this comment.
There another thing that shows, both for terraform_docs and terraform_fmt hooks.
But it's not an issue at all.
18:57 repro-hoo git:(repro-hoo +)
➜ gc
Terraform docs...........................................................Passed
Terraform validate with tflint...........................................Passed
Terraform fmt............................................................Failed
- hook id: terraform_fmt
- files were modified by this hook
main.tf
18:57 repro-hoo git:(repro-hoo !+)
✘1 ➜ gc
Terraform docs...........................................................Passed
Terraform validate with tflint...........................................Passed
Terraform fmt............................................................Passed
Terraform docs.......................................(no files to check)Skipped
Terraform validate with tflint.......................(no files to check)Skipped
Terraform fmt........................................(no files to check)Skipped
[repro-hoo 44cd287] fmt
2 files changed, 2 insertions(+), 6 deletions(-)
18:58 repro-hoo git:(repro-hoo) took 9s
➜ git fsck --no-progress | head
dangling tree bba236455e65937967031c5d25b02e11f5c252f5
18:58 repro-hoo git:(repro-hoo)
➜ git fsck --no-progress | head
dangling tree bba236455e65937967031c5d25b02e11f5c252f5
18:58 repro-hoo git:(repro-hoo)
➜ gc
Terraform docs...........................................................Failed
- hook id: terraform_docs
- files were modified by this hook
Terraform validate with tflint...........................................Passed
Terraform fmt............................................................Passed
18:58 repro-hoo git:(repro-hoo !+)
✘1 ➜ gc
Terraform docs...........................................................Passed
Terraform validate with tflint...........................................Passed
Terraform fmt............................................................Passed
Terraform docs.......................................(no files to check)Skipped
Terraform validate with tflint.......................(no files to check)Skipped
Terraform fmt........................................(no files to check)Skipped
[repro-hoo c115ce4] docs
2 files changed, 2 insertions(+), 2 deletions(-)
18:58 repro-hoo git:(repro-hoo) took 6s
➜ git fsck --no-progress | head
dangling tree a13f3782d35b8ce91a9d45f5b57acd390ea671a2
dangling tree bba236455e65937967031c5d25b02e11f5c252f5Per review on PR antonbabenko#993: - MaxymVlasov could not reproduce the worktree bug in terraform_tflint or terraform_docs (terraform-docs does not run init), so the common::scrub_git_env calls are removed from both. The confirmed reproduction path is terraform_validate -> tofu/terraform init -> git clone, which keeps its call. - Docstring reworded per yermulnik: drop 'dangerous' (the vars are incompatible with pre-commit's hook env, not dangerous), drop the var count, capitalize Git as a proper name, use full clickable URLs instead of repo-relative paths and issue shorthand, drop line-range references that rot as upstream moves. - GIT_OBJECT_DIRECTORY provenance corrected: it is not named in pre-commit's no_git_env docstring; it comes from Git's own repository-location environment variables (man 1 git). The docstring now cites that source for all four vars. - unset arguments alphabetized.
|
Thanks both for the thorough review - pushed 63ab2bc addressing it. Scope (@MaxymVlasov)Accepted both suggestions: The dangling trees you observed with Wording / references (@yermulnik)All applied in 63ab2bc:
Open design questions - deferring to you two
|
Description
Fixes #992.
Hooks that invoke
tofu init/terraform init(terraform_validate,terraform_tflint,terraform_docs) end up runninggit clone <module>under the hood for each registry/git-source module. When the operator'sgit commitruns from a linked git worktree, the parent git setsGIT_INDEX_FILE(pointing at.git/worktrees/<name>/index) in the hook subprocess env. The childgit cloneinherits this and writes the cloned module's blob OIDs into the parent worktree's index. The nextgit committhen fails at tree-build:Other
GIT_*vars (GIT_DIR,GIT_WORK_TREE,GIT_OBJECT_DIRECTORY) are overridden bygit clone's own target-dir setup; onlyGIT_INDEX_FILEslips through. Scrubbing all four matches the four documented offenders in pre-commit framework's ownno_git_envhelper.Why hook-side scrub (not framework-side)
pre-commit framework deliberately does NOT scrub
GIT_*from user hook subprocesses (only its own internal git calls) per the maintainer's stance in pre-commit/pre-commit#1849:asottile reaffirmed this in pre-commit/pre-commit#3492 — framework-level scrubbing breaks valid use cases (e.g., gitleaks-docker wants
GIT_INDEX_FILEpassed through). So per-hook-author opt-in is the design intent.This patch applies that recommendation in pre-commit-terraform's hooks that perform git writes via
tofu init/terraform init.A/B reproduction (same worktree, same
.tfedit, onlyrepo:URL changes)Same
.pre-commit-config.yamlshape, same staged.tfchange, samegit commitinvocation. Switchrepo:betweenantonbabenko/pre-commit-terraform@v1.106.0andyuriipolishchuk/pre-commit-terraform@fix/scrub-git-env:rev: v1.106.0Error building trees. 5 phantom blobs in worktree index.git fsckreports invalid sha1 in cache-tree.rev: fix/scrub-git-envgit fsckclean.Terraform validate: Passed.Full reproduction script:
Changes
hooks/_common.shcommon::scrub_git_envhelper. Bare 4-varunset(the four offenders documented in pre-commit'sno_git_env). Detailed docstring covering bug class, mechanism, and references.hooks/terraform_validate.shcommon::scrub_git_envcall inmain()after env setup, beforecommon::per_dir_hook.hooks/terraform_tflint.shtflint --initmay shell to git for plugin sources.hooks/terraform_docs.shterraform-docsmay invoketofu initto resolve module schemas.Net diff: +35 / -0 across 4 files. Zero behavior change for non-worktree users (their
GIT_INDEX_FILEis empty for these hooks →unsetis a no-op).Test plan
master(v1.106.0) from a worktree with a module-using stack → commit fails withError building trees, index has phantom blobsunsetis a no-op when env var absentGIT_SSH,GIT_ASKPASS,GIT_SSL_*etc. are NOT touched (only the four documented dangerous vars)shellcheckclean (no new warnings)Other notes
no_git_env(~15 linescaseblock in bash) but went with the minimal denylist (4 vars) since onlyGIT_INDEX_FILEactually causes the observed failure; the others are documented defensive offenders. Happy to expand to allowlist if you'd prefer a stricter mirror.GIT_*from the parentgit commit; this means the unset only takes effect for the duration of the hook (and its child processes). The parentgit commitenv is unaffected.terraform_fmt,terraform_trivy,terraform_checkov, etc.) are not touched — they don't have this failure mode.References
pre_commit/git.py:20-38—no_git_envsource + docstring documentingGIT_INDEX_FILE: Causes 'error invalid object ...' during commit