Flaky tests: repo-manager.test.ts times out on Windows (EBUSY file locks) #774

@christso

Summary

Two tests in packages/core/test/evaluation/workspace/repo-manager.test.ts fail intermittently on Windows due to EBUSY file locks and temp directory cleanup race conditions.

Failing tests

  1. RepoManager > materialize > walks ancestor commits (line 102)
  2. RepoManager > materialize > supports shallow clone with depth (line 122)

Reproduction steps

# On Windows (tested on Windows 11 Enterprise 10.0.26100)
cd packages/core
bun test test/evaluation/workspace/repo-manager.test.ts

The tests pass individually but fail when run as part of the full suite. The failure pattern:

  1. A preceding test creates a temp repo at C:\Users\...\AppData\Local\Temp\repo-manager-test-XXXXXX
  2. afterEach cleanup tries to rm the temp directory
  3. rm fails with EBUSY: resource busy or locked because git processes still hold file handles
  4. The next test (walks ancestor commits) starts with the stale temp dir, times out at 5000ms
  5. The following test (shallow clone with depth) tries git clone --depth 2 file://... against a temp repo that was already partially cleaned up, gets fatal: does not appear to be a git repository

Error output

EBUSY: resource busy or locked, rm 'C:\Users\...\AppData\Local\Temp\repo-manager-test-uprPeE'
    path: "C:\Users\...\AppData\Local\Temp\repo-manager-test-uprPeE",
    syscall: "rm",
    errno: -16,
    code: "EBUSY"

(fail) RepoManager > materialize > walks ancestor commits [5063.00ms]
  ^ this test timed out after 5000ms.

(fail) RepoManager > materialize > supports shallow clone with depth [12641.00ms]
  ^ this test timed out after 5000ms.

Root cause

Windows holds file locks longer than Unix does after a git subprocess exits, so the afterEach cleanup races with git process shutdown. This is a known Windows behavior with file:// protocol git operations on temp directories.
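
To separate the two halves of that race, it can help to confirm that the test harness really waits for each git child before cleanup starts. The sketch below is a hypothetical helper (runToCompletion is not a name from the repo, and how RepoManager actually spawns git is not shown in this issue). It resolves only on the child's "close" event, which Node emits after the process has ended and its stdio streams are closed.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper, not part of the repo: runs a command and resolves
// only after the child's "close" event has fired.
export function runToCompletion(
  command: string,
  args: string[],
  cwd?: string,
): Promise<number> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { cwd, stdio: "ignore" });
    // "error" fires if the process could not be spawned (e.g. executable not found).
    child.once("error", reject);
    // code is null when the child was terminated by a signal; report -1 in that case.
    child.once("close", (code) => resolve(code ?? -1));
  });
}

// Possible use in a test (arguments are illustrative):
//   await runToCompletion("git", ["clone", "--depth", "2", sourceUrl, targetDir]);
```

If every git invocation is already awaited this way and EBUSY still appears, the lock is being held after process exit, which matches the root cause described above, and awaiting alone will not be enough.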

Possible fixes

  1. Retry cleanup with backoff: retry rm in afterEach up to 3 times with a 500ms delay
  2. Unique temp dirs per test: use mkdtemp per test instead of sharing tmpDir across the describe block, so cleanup failures don't cascade
  3. Increase timeout: raise from 5000ms to 15000ms for these tests (masks the issue but prevents CI failures)
  4. Wait for (or kill) git processes before cleanup: make sure git subprocesses have exited before removing temp dirs
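
A minimal sketch of option 1, under stated assumptions: removeWithRetry is a hypothetical helper, the set of retryable error codes beyond EBUSY is a guess, and the delay grows linearly from the 500ms suggested above. The remove parameter exists only so the retry logic can be exercised without touching the file system.

```typescript
import { rm } from "node:fs/promises";

// Assumption: besides EBUSY (seen in this issue), EPERM and ENOTEMPTY are
// treated as transient lock-related failures on Windows.
const RETRYABLE = new Set(["EBUSY", "EPERM", "ENOTEMPTY"]);

// Hypothetical helper, not part of the repo.
export async function removeWithRetry(
  dir: string,
  attempts = 3,
  delayMs = 500,
  remove: (path: string) => Promise<void> = (p) =>
    rm(p, { recursive: true, force: true }),
): Promise<void> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await remove(dir);
      return;
    } catch (err) {
      const code = (err as { code?: string }).code;
      // Give up on the last attempt or on errors that retrying will not fix.
      if (attempt === attempts || !code || !RETRYABLE.has(code)) throw err;
      // Linear backoff: 500ms, then 1000ms, and so on.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}

// Possible use in the test file:
//   afterEach(async () => { await removeWithRetry(tmpDir); });
```

Node's fs.rm also documents maxRetries and retryDelay options that retry on EBUSY and similar errors when recursive is true. Whether bun 1.3.5 honors them on Windows is not confirmed here, which is why the loop is written out by hand.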

Option 2 is the cleanest fix — it isolates tests so one cleanup failure doesn't cascade.
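
Option 2 could look roughly like the sketch below. makeTestDir is a hypothetical helper, and the existing test file's setup code is not shown in this issue, so the hook wiring in the trailing comment is an assumption about how it would slot in. The prefix matches the temp directory names in the error output.

```typescript
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper, not part of the repo: each call creates a fresh
// directory, so a failed cleanup in one test cannot leave stale state
// for the next test.
export async function makeTestDir(
  prefix = "repo-manager-test-",
): Promise<{ dir: string; cleanup: () => Promise<void> }> {
  // mkdtemp appends six random characters to the prefix.
  const dir = await mkdtemp(join(tmpdir(), prefix));
  const cleanup = async (): Promise<void> => {
    try {
      await rm(dir, { recursive: true, force: true, maxRetries: 3, retryDelay: 500 });
    } catch (err) {
      // Design choice (assumption): a leftover directory in the OS temp
      // folder is logged rather than allowed to fail the test run.
      console.warn(`could not remove ${dir}:`, err);
    }
  };
  return { dir, cleanup };
}

// Possible wiring with bun:test hooks:
//   let tmp: Awaited<ReturnType<typeof makeTestDir>>;
//   beforeEach(async () => { tmp = await makeTestDir(); });
//   afterEach(async () => { await tmp.cleanup(); });
```

Swallowing the cleanup error trades a tidy temp folder for test isolation. If leftover directories become a problem on CI, the catch block is the place to rethrow or to record the failure.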

Environment

  • Windows 11 Enterprise 10.0.26100
  • bun 1.3.5
  • git 2.x
  • 1158 tests pass, 2 fail (both in this file)
