perf: O(n) spatial indexing, vectorized planners, sparse storage by OldCrow · Pull Request #3 · tjards/multi-agent_sim

OldCrow · 2026-04-14T03:00:56Z

Summary

Reduces per-timestep complexity from O(n²) to O(n log n) for neighbor discovery and O(n) for bounded-density interactions, enabling simulations at 500–1000+ agents where the original was limited to ~100.

All upstream functionality is preserved. Optimizations activate automatically; set "use_optimized": false in config/config.json to revert to the original code paths. No new external dependencies.
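As an illustration of the toggle described above, here is a minimal sketch of how a caller might read the flag. It assumes `use_optimized` is a top-level key in `config/config.json` and that a missing key means the optimized paths are on; the helper name `read_use_optimized` is hypothetical and not taken from the repository.

```python
import json
from pathlib import Path


def read_use_optimized(config_path="config/config.json"):
    """Return the use_optimized flag from a JSON config file.

    Assumption: the key lives at the top level of the JSON object and
    defaults to True when the file or the key is absent.
    """
    path = Path(config_path)
    if not path.exists():
        return True
    with path.open() as fh:
        config = json.load(fh)
    return bool(config.get("use_optimized", True))
```

Defaulting to `True` mirrors the statement that optimizations activate automatically; the actual loader in the repository may handle a missing key differently.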

Changes

Performance:

- Spatial indexing (utils/spatial.py): KD-tree neighbor discovery replaces O(n²) brute-force scans
- Sparse graph (utils/swarmgraph.py): scipy.sparse.csgraph for connected components; node-degree connectivity replaces exponential path enumeration; sparse adjacency with lazy dense cache
- Vectorized planners: all 8 strategies optimized. saber/reynolds/starling are fully vectorized via batched NumPy; encirclement/lemniscates use broadcasting; pinning/shepherding/malicious use pre-built neighbor lists (sequential learning logic preserved)
- Sparse data storage (data/data_manager.py): n×n arrays stored as scipy.sparse per timestep
- Vectorized order() metric and plotting computations
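The first two items can be sketched together: a KD-tree finds all agent pairs within an interaction radius, the pairs become a sparse adjacency matrix, and `scipy.sparse.csgraph` labels the connected components. This is an illustrative sketch, not the code in `utils/spatial.py` or `utils/swarmgraph.py`. The function names, the `(n, 3)` position layout, and the single fixed `radius` are assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def neighbor_graph(positions, radius):
    """Build a sparse symmetric adjacency matrix from agent positions.

    positions: (n, 3) array (assumed layout; transpose a (3, n) array first).
    radius: interaction range; pairs with distance <= radius are linked.
    """
    n = positions.shape[0]
    tree = cKDTree(positions)
    # Unique pairs (i, j) with i < j, returned as an (m, 2) integer array.
    pairs = tree.query_pairs(r=radius, output_type="ndarray")
    # Mirror each pair so the adjacency matrix is symmetric.
    rows = np.concatenate([pairs[:, 0], pairs[:, 1]])
    cols = np.concatenate([pairs[:, 1], pairs[:, 0]])
    data = np.ones(rows.shape[0], dtype=np.int8)
    return csr_matrix((data, (rows, cols)), shape=(n, n))


def count_components(adjacency):
    """Return (number of connected components, per-agent component label)."""
    return connected_components(adjacency, directed=False)


def neighbor_counts(adjacency):
    """Number of neighbors per agent, read from the CSR row pointers."""
    return np.diff(adjacency.indptr)
```

Only linked pairs are stored, so memory grows with the number of neighbor relations rather than with n², which is the property the sparse storage item relies on when agent density is bounded.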

Bug fixes:

- Heap fragmentation: np.zeros((3, n)) replaced with np.zeros(3), and np.identity(3) replaced with a pre-allocated _I3, in all per-agent methods. The original caused ~185 MB/100 steps RSS growth at n=200 due to CPython pymalloc arena fragmentation.
- Distance-from-target plot: position-only norm (0:3) corrects a dimensionally incorrect 6D norm that included velocity
- Escape sequence warnings eliminated across codebase
- Tracked __pycache__/*.pyc files removed
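The distance-from-target fix can be illustrated with a short sketch. It assumes a `(6, n)` state array whose rows 0:3 hold position and rows 3:6 hold velocity, which is inferred from the "0:3" slice mentioned above; the function name and the `targets` array are hypothetical.

```python
import numpy as np


def distance_from_target(state, targets):
    """Per-agent Euclidean distance to target, using position rows only.

    state, targets: (6, n) arrays; rows 0:3 are position and rows 3:6 are
    velocity (assumed layout). Taking the norm over all six rows would mix
    lengths with speeds, so only rows 0:3 are used.
    """
    return np.linalg.norm(state[0:3, :] - targets[0:3, :], axis=0)


# One agent at (3, 4, 0) moving at 10 units/s along x; target at the origin.
state = np.array([[3.0], [4.0], [0.0], [10.0], [0.0], [0.0]])
targets = np.zeros((6, 1))
print(distance_from_target(state, targets))   # position-only distance: [5.]
print(np.linalg.norm(state - targets, axis=0))  # 6D norm, inflated by velocity
```

In this example the position-only distance is 5, while the 6D norm is larger because the velocity component is added in quadrature.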

Documentation:

- OPTIMIZATION.md: problem statement, methodology, per-file changes, design decisions (why 3 planners aren't fully vectorized), heap fragmentation analysis, validated benchmarks, future work

Tests: 78 tests + 7 benchmark scripts

Validated Results

| Metric | Before | After |
| --- | --- | --- |
| Full sim n=100 (Tf=300s) | 12.6 min | 52.5 s (14.4× speedup) |
| Full sim n=200 (Tf=300s) | ~43.5 min | 1.4 min (32× speedup) |
| Graph ops n=500 | 474 ms | 1.96 ms (242× speedup) |
| Memory n=1000 (10s sim) | 15.0 GB | 76.6 MB (201× reduction) |
| Scaling n=50→1000 | O(n^1.79) | O(n^1.0–1.15) |
| RSS stability (n=200, 15K steps) | +925 MB (OOM) | +2.9 MB (flat) |

Full benchmark data in OPTIMIZATION.md and reproducible via tests/benchmark_*.py.
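For readers who want to check a scaling row like the one above against their own timings, one common approach is a least-squares fit of log(time) against log(n); the slope is the exponent k in time ≈ c·n^k. The sketch below assumes that method, which may differ from what tests/benchmark_*.py does. The timing values are synthetic placeholders, not measurements from this PR.

```python
import numpy as np


def scaling_exponent(agent_counts, step_times):
    """Estimate k in step_time ~ c * n**k by a straight-line fit in log-log space."""
    log_n = np.log(np.asarray(agent_counts, dtype=float))
    log_t = np.log(np.asarray(step_times, dtype=float))
    slope, _intercept = np.polyfit(log_n, log_t, 1)
    return slope


# Synthetic timings generated from an exact power law, for illustration only.
n_values = np.array([50, 100, 200, 500, 1000])
fake_times = 2e-4 * n_values ** 1.1
print(round(scaling_exponent(n_values, fake_times), 3))
```

With real measurements the fit will not be exact, so it helps to time several runs per agent count and fit the medians.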

Co-Authored-By: Oz <oz-agent@warp.dev>

Reduces per-timestep complexity from O(n^2) to O(n log n) for neighbor discovery and O(n) for bounded-density interactions, enabling simulations at 500-1000+ agents where the original was limited to ~100.

Performance changes:
- Spatial indexing (utils/spatial.py): KD-tree neighbor discovery
- Sparse graph (utils/swarmgraph.py): csr_matrix adjacency, scipy connected components, node-degree connectivity, lazy dense cache
- Vectorized planners: saber/reynolds/starling via batched NumPy; encirclement/lemniscates via broadcast; pinning/shepherding/malicious via pre-built neighbor lists (sequential logic preserved)
- Sparse History storage: scipy.sparse per timestep, COO HDF5 format
- Vectorized metrics (order) and plotting computations
- Orchestrator: vectorized dispatch with use_optimized config toggle

Bug fixes:
- Heap fragmentation: np.zeros(3) replaces np.zeros((3,n)) and pre-allocated _I3 replaces per-call np.identity(3) in all planners. Original caused ~185 MB/100 steps RSS growth at n=200.
- Distance-from-target plot: position-only norm (0:3) corrects dimensionally incorrect 6D norm including velocity
- Escape sequence warnings eliminated across codebase
- Tracked __pycache__/*.pyc files removed

Documentation:
- OPTIMIZATION.md: problem statement, methodology, per-file changes, design decisions, heap fragmentation analysis, benchmarks, future work
- README.md: tests/ in project structure, use_optimized docs

Tests and benchmarks:
- 78 tests: spatial index, graph equivalence, planner vectorized-vs-scalar, HDF5 round-trip, plotting computations
- 7 benchmark scripts: graph ops, e2e sim, memory, scaling, extended, diverse agents, full 300s simulation, memory diagnostics

Validated results:
- Full sim n=100: 52.5s (was 12.6 min, 14.4x speedup)
- Full sim n=200: 1.4 min (32x speedup)
- Scaling n=50-1000: O(n^1.0-1.15), memory stable
- Graph ops n=500: 242x speedup
- Memory n=1000: 201x reduction (76.6 MB vs 15 GB)

No new external dependencies (uses scipy already in requirements.txt).

Co-Authored-By: Oz <oz-agent@warp.dev>

OldCrow wants to merge 1 commit into tjards:master from OldCrow:master
