_apple_silicon_vram_gb() in test/predicates.py estimates usable GPU memory using a static formula:
min(total * 0.75, total - 16)
On a 32 GB machine this gives 16 GB, but psutil.virtual_memory().available at runtime shows ~18.5 GB actually free. More importantly, the formula is evaluated at pytest collection time with no knowledge of current system load, and
it doesn't align with how CUDA reports free VRAM (which the audit-markers skill's formula was written against).
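For concreteness, the static heuristic can be sketched as follows (function name and standalone form are illustrative; the real helper lives in test/predicates.py):

```python
def apple_silicon_vram_gb_static(total_gb):
    # Current static heuristic (sketch): assume 75% of unified memory is
    # usable, capped at total minus 16 GB. Evaluated once, ignoring load.
    return min(total_gb * 0.75, total_gb - 16)

print(apple_silicon_vram_gb_static(32))  # → 16
```

On a 32 GB machine the cap branch wins: min(24, 16) = 16 GB, no matter how much memory is actually free.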
Observed effect: test/stdlib/components/intrinsic/test_core.py has require_gpu(min_vram_gb=12), which the skill correctly computed using the adapter accumulation formula. However, on a 32 GB Apple Silicon machine the
predicate reports 16 GB available, so the gate passes. The model then runs under memory pressure and produces incoherent output (garbled JSON with raw token IDs), failing the test.
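A minimal sketch of why the gate passes (helper name hypothetical; the real gate is require_gpu in test/predicates.py):

```python
def gpu_gate_passes(min_vram_gb, reported_vram_gb):
    # Sketch of the collection-time gate: it trusts whatever the predicate
    # reports and has no view of memory actually free at run time.
    return reported_vram_gb >= min_vram_gb

# Static estimate on a 32 GB machine is 16 GB, so a 12 GB requirement passes.
print(gpu_gate_passes(12, 16))   # → True
# If only ~10 GB were actually free under load, the gate should have failed.
print(gpu_gate_passes(12, 10))   # → False
```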
Proposed fix
Replace the static heuristic in _apple_silicon_vram_gb() with psutil.virtual_memory().available, which reflects actual free memory at collection time and is consistent with how CUDA reports free VRAM. The audit-markers skill
formula would then work correctly on both platforms without changes.
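A sketch of the proposed replacement (assuming the existing helper returns gigabytes; the conversion factor is an assumption about the current code's units):

```python
import psutil

def apple_silicon_vram_gb():
    # Proposed behavior (sketch): report unified memory actually free at
    # collection time, in GB, mirroring CUDA's free-VRAM query.
    return psutil.virtual_memory().available / (1024 ** 3)
```

psutil.virtual_memory().available is the documented cross-platform estimate of memory obtainable without swapping, so it tracks current system load rather than a fixed fraction of total.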
Branch: fix/audit-markers-vram-cross-check
Notes
- psutil is already a project dependency (used by require_ram() in the same file)
- The CUDA path in _gpu_vram_gb() reports total device memory, not free — may also be worth revisiting for consistency, but is a separate concern
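If that separate concern is picked up later, a hypothetical free-memory variant of the CUDA path could look like this (function name is illustrative, not the existing helper):

```python
def gpu_vram_gb_free():
    # Hypothetical companion fix: report free CUDA device memory instead
    # of total. torch.cuda.mem_get_info() returns (free_bytes, total_bytes)
    # for the current device.
    try:
        import torch
    except ImportError:
        return None  # torch not installed in this environment
    if not torch.cuda.is_available():
        return None  # no CUDA device to query
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes / (1024 ** 3)
```

Returning None when CUDA is unavailable keeps the sketch runnable anywhere; the real predicate would presumably fall through to the Apple Silicon or CPU path instead.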