Skip to content

unpack-trees: skip lstats for deleted VFS entries in checkout#865

Open
tyrielv wants to merge 1 commit intomicrosoft:vfs-2.53.0from
tyrielv:tyrielv/checkout-hydration-fix
Open

unpack-trees: skip lstats for deleted VFS entries in checkout#865
tyrielv wants to merge 1 commit intomicrosoft:vfs-2.53.0from
tyrielv:tyrielv/checkout-hydration-fix

Conversation

@tyrielv
Copy link

@tyrielv tyrielv commented Mar 6, 2026

When core_virtualfilesystem is set and a branch switch deletes entries (present in old tree, absent in new tree), deleted_entry() calls verify_absent_if_directory() with 'ce' pointing to a tree entry from traverse_trees(). This tree entry lacks CE_NEW_SKIP_WORKTREE because that flag is only set on src_index entries by mark_new_skip_worktree().

The missing flag causes verify_absent_if_directory()'s fast-path to fail, falling through to verify_absent_1() which lstats every such path. In a VFS repo each lstat may trigger callbacks, creating placeholders. On a large repo switching between LTS releases this produces tens of thousands of placeholders that the VFS must then clean up when they are deleted as part of the checkout.

Fix this by propagating CE_NEW_SKIP_WORKTREE from the index entry (old) to the tree entry (ce) when core_virtualfilesystem is set. This allows the existing fast-path to work, eliminating the unnecessary lstats entirely.

Measured on a 2.8M file VFS repo (0% hydrated):
Before: ~135s checkout, ~23k folder placeholders created
After: ~25s checkout, 0 folder placeholders created

  • This change only applies to the virtualization hook and VFS for Git.

When core_virtualfilesystem is set and a branch switch deletes entries
(present in old tree, absent in new tree), deleted_entry() calls
verify_absent_if_directory() with 'ce' pointing to a tree entry from
traverse_trees(). This tree entry lacks CE_NEW_SKIP_WORKTREE because
that flag is only set on src_index entries by mark_new_skip_worktree().

The missing flag causes verify_absent_if_directory()'s fast-path to
fail, falling through to verify_absent_1() which lstats every such
path. In a VFS repo each lstat may trigger callbacks, creating
placeholders. On a large repo switching between LTS releases this
produces tens of thousands of placeholders that the VFS must then
clean up when they are deleted as part of the checkout.

Fix this by propagating CE_NEW_SKIP_WORKTREE from the index entry
(old) to the tree entry (ce) when core_virtualfilesystem is set.
This allows the existing fast-path to work, eliminating the
unnecessary lstats entirely.

This is safe in VFS mode because the virtual filesystem is responsible
for tracking which files are hydrated and cleaning up placeholders
when entries are removed from the index. Additionally, when
GVFS_NO_DELETE_OUTSIDE_SPARSECHECKOUT is set (always the case in VFS
repos), deleted_entry() preserves CE_SKIP_WORKTREE on the CE_REMOVE
entry and git does not unlink skip-worktree files from disk, so the
lstat result would not be acted upon anyway.

Measured on a 2.8M file VFS repo (0% hydrated):
  Before: ~135s checkout, ~23k folder placeholders created
  After:  ~25s checkout, 0 folder placeholders created

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tyrielv tyrielv marked this pull request as ready for review March 6, 2026 21:10
Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find! I think your code is correct. I have a suggested restructuring that I think will be equivalent, but I'm guessing you wrote it that way on purpose. I'm curious to hear what you think.

return 0;
} else if (verify_absent_if_directory(ce, ERROR_WOULD_LOSE_UNTRACKED_REMOVED, o)) {
return -1;
} else {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an experiment, I wanted to see what changed if we wrote it like this:

} else if (core_virtualfilesystem &&
           old->ce_flags & CE_NEW_SKIP_WORKTREE) {
        ((struct cache_entry *)ce)->ce_flags |= CE_NEW_SKIP_WORKTREE;
} else if (verify_absent_if_directory(ce, ERROR_WOULD_LOSE_UNTRACKED_REMOVED, o)) {
        return -1;
}

This takes the guesswork out of it in terms of whether this is effective. But the key difference lies inside the implementation of verify_absent_if_directory() which only short-circuits on CE_NEW_SKIP_WORKTREE if o->skip_sparse_checkout is unset. I wonder if there is a world where core_virtualfilesystem and o->skip_sparse_checkout could ever be enabled at the same time.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right - it looks like the only current scenario where skip_sparse_checkout could be false when core_virtualfilesystem is true is if validation of the sparse checkout patterns fails, and we're already in trouble in that case.
I'm not sure whether it's necessary to still propagate the flag if we skip verify_absent_if_directory entirely. I'll look into it, but I'd also be fine if this merged in the meantime. (I don't have permissions to merge).
@dscho what do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants