_apple_silicon_vram_gb() in test/predicates.py estimates usable GPU memory using a static formula:
min(total * 0.75, total - 16)
On a 32 GB machine this gives 16 GB, but psutil.virtual_memory().available at runtime shows ~18.5 GB actually free. More importantly, the formula is evaluated at pytest collection time with no knowledge of current system load, and
it doesn't align with how CUDA reports free VRAM (which the audit-markers skill's formula was written against).
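For concreteness, the static heuristic can be sketched as follows (function name and standalone form are illustrative; the real helper lives in test/predicates.py):

```python
def apple_silicon_vram_gb_static(total_gb):
    # Current static heuristic (sketch): assume 75% of unified memory is
    # usable, capped at total minus 16 GB. Evaluated once, ignoring load.
    return min(total_gb * 0.75, total_gb - 16)

print(apple_silicon_vram_gb_static(32))  # → 16
```

On a 32 GB machine the cap branch wins: min(24, 16) = 16 GB, no matter how much memory is actually free.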
Observed effect: test/stdlib/components/intrinsic/test_core.py has require_gpu(min_vram_gb=12), which the skill correctly computed using the adapter accumulation formula. However, on a 32 GB Apple Silicon machine the
predicate reports 16 GB available, so the gate passes. The model then runs under memory pressure and produces incoherent output (garbled JSON with raw token IDs), failing the test.
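A minimal sketch of why the gate passes (helper name hypothetical; the real gate is require_gpu in test/predicates.py):

```python
def gpu_gate_passes(min_vram_gb, reported_vram_gb):
    # Sketch of the collection-time gate: it trusts whatever the predicate
    # reports and has no view of memory actually free at run time.
    return reported_vram_gb >= min_vram_gb

# Static estimate on a 32 GB machine is 16 GB, so a 12 GB requirement passes.
print(gpu_gate_passes(12, 16))   # → True
# If only ~10 GB were actually free under load, the gate should have failed.
print(gpu_gate_passes(12, 10))   # → False
```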
Proposed fix
Replace the static heuristic in _apple_silicon_vram_gb() with psutil.virtual_memory().available, which reflects actual free memory at collection time and is consistent with how CUDA reports free VRAM. The audit-markers skill
formula would then work correctly on both platforms without changes.
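A sketch of the proposed replacement (assuming the existing helper returns gigabytes; the conversion factor is an assumption about the current code's units):

```python
import psutil

def apple_silicon_vram_gb():
    # Proposed behavior (sketch): report unified memory actually free at
    # collection time, in GB, mirroring CUDA's free-VRAM query.
    return psutil.virtual_memory().available / (1024 ** 3)
```

psutil.virtual_memory().available is the documented cross-platform estimate of memory obtainable without swapping, so it tracks current system load rather than a fixed fraction of total.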
Branch: fix/audit-markers-vram-cross-check
Notes
- psutil is already a project dependency (used by require_ram() in the same file)
- The CUDA path in _gpu_vram_gb() reports total device memory, not free — may also be worth revisiting for consistency, but is a separate concern
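If that separate concern is picked up later, a hypothetical free-memory variant of the CUDA path could look like this (function name is illustrative, not the existing helper):

```python
def gpu_vram_gb_free():
    # Hypothetical companion fix: report free CUDA device memory instead
    # of total. torch.cuda.mem_get_info() returns (free_bytes, total_bytes)
    # for the current device.
    try:
        import torch
    except ImportError:
        return None  # torch not installed in this environment
    if not torch.cuda.is_available():
        return None  # no CUDA device to query
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes / (1024 ** 3)
```

Returning None when CUDA is unavailable keeps the sketch runnable anywhere; the real predicate would presumably fall through to the Apple Silicon or CPU path instead.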