perf: O(n) spatial indexing, vectorized planners, sparse storage #3
Open
OldCrow wants to merge 1 commit into tjards:master from
Reduces per-timestep complexity from O(n^2) to O(n log n) for neighbor discovery and O(n) for bounded-density interactions, enabling simulations at 500-1000+ agents where the original was limited to ~100.

Performance changes:
- Spatial indexing (utils/spatial.py): KD-tree neighbor discovery
- Sparse graph (utils/swarmgraph.py): csr_matrix adjacency, scipy connected components, node-degree connectivity, lazy dense cache
- Vectorized planners: saber/reynolds/starling via batched NumPy; encirclement/lemniscates via broadcast; pinning/shepherding/malicious via pre-built neighbor lists (sequential logic preserved)
- Sparse History storage: scipy.sparse per timestep, COO HDF5 format
- Vectorized metrics (order) and plotting computations
- Orchestrator: vectorized dispatch with use_optimized config toggle

Bug fixes:
- Heap fragmentation: np.zeros(3) replaces np.zeros((3,n)) and pre-allocated _I3 replaces per-call np.identity(3) in all planners. Original caused ~185 MB/100 steps RSS growth at n=200.
- Distance-from-target plot: position-only norm (0:3) corrects dimensionally incorrect 6D norm including velocity
- Escape sequence warnings eliminated across codebase
- Tracked __pycache__/*.pyc files removed

Documentation:
- OPTIMIZATION.md: problem statement, methodology, per-file changes, design decisions, heap fragmentation analysis, benchmarks, future work
- README.md: tests/ in project structure, use_optimized docs

Tests and benchmarks:
- 78 tests: spatial index, graph equivalence, planner vectorized-vs-scalar, HDF5 round-trip, plotting computations
- 7 benchmark scripts: graph ops, e2e sim, memory, scaling, extended, diverse agents, full 300s simulation, memory diagnostics

Validated results:
- Full sim n=100: 52.5s (was 12.6 min, 14.4x speedup)
- Full sim n=200: 1.4 min (32x speedup)
- Scaling n=50-1000: O(n^1.0-1.15), memory stable
- Graph ops n=500: 242x speedup
- Memory n=1000: 201x reduction (76.6 MB vs 15 GB)

No new external dependencies (uses scipy already in requirements.txt).

Co-Authored-By: Oz <oz-agent@warp.dev>
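The distance-from-target fix listed under bug fixes can be illustrated with a small sketch. It assumes a state array of shape (6, n) with position in rows 0:3 and velocity in rows 3:6; that layout and the function name distance_from_target are assumptions made for illustration, not taken from the repository.

```python
import numpy as np

def distance_from_target(states, targets):
    """Per-agent Euclidean distance to target, using position rows only.

    Assumed layout (hypothetical): shape (6, n), rows 0:3 = position,
    rows 3:6 = velocity.
    """
    return np.linalg.norm(states[0:3, :] - targets[0:3, :], axis=0)

# One agent at position (3, 4, 0) moving with velocity (10, 0, 0).
states = np.array([[3.0], [4.0], [0.0], [10.0], [0.0], [0.0]])
targets = np.zeros((6, 1))

d_pos = distance_from_target(states, targets)    # position-only norm
d_6d = np.linalg.norm(states - targets, axis=0)  # mixes metres with metres/second
```

In this toy case the 6D norm is inflated by the velocity components, which is the kind of dimensional mixing the commit message describes.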
Summary
Reduces per-timestep complexity from O(n²) to O(n log n) for neighbor discovery and O(n) for bounded-density interactions, enabling simulations at 500–1000+ agents where the original was limited to ~100.
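As a rough illustration of the neighbor-discovery change, the sketch below compares a KD-tree radius query against a brute-force O(n²) scan. It uses scipy.spatial.cKDTree directly; the function names, the interaction radius, and the random positions are hypothetical and do not reflect the actual code in utils/spatial.py.

```python
import numpy as np
from scipy.spatial import cKDTree

def neighbor_pairs_kdtree(positions, radius):
    """Return the set of index pairs (i, j), i < j, within `radius` of each other."""
    tree = cKDTree(positions)          # tree construction is O(n log n)
    return tree.query_pairs(r=radius)  # avoids testing every pair explicitly

def neighbor_pairs_bruteforce(positions, radius):
    """Reference O(n^2) scan over all pairs, for comparison only."""
    n = len(positions)
    pairs = set()
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(positions[i] - positions[j]) <= radius:
                pairs.add((i, j))
    return pairs

# Hypothetical swarm: 200 agents scattered in a 50 x 50 x 50 volume.
rng = np.random.default_rng(0)
positions = rng.uniform(0.0, 50.0, size=(200, 3))
pairs = neighbor_pairs_kdtree(positions, radius=5.0)
```

The two functions should return the same pairs; the KD-tree version pays off when each agent has a bounded number of neighbors within the radius, which is the bounded-density case mentioned above.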
All upstream functionality is preserved. Optimizations activate automatically; set "use_optimized": false in config/config.json to revert to the original code paths. No new external dependencies.

Changes

Performance:
- Spatial indexing (utils/spatial.py): KD-tree neighbor discovery replaces O(n²) brute-force scans
- Sparse graph (utils/swarmgraph.py): scipy.sparse.csgraph for components; node-degree connectivity replaces exponential path enumeration; sparse adjacency with lazy dense cache
- Sparse History storage (data/data_manager.py): n×n arrays stored as scipy.sparse per timestep
- Vectorized order() metric and plotting computations

Bug fixes:
- Heap fragmentation: np.zeros((3, n)) replaced with np.zeros(3), and np.identity(3) replaced with pre-allocated _I3, in all per-agent methods. The original caused ~185 MB/100 steps RSS growth at n=200 due to CPython pymalloc arena fragmentation.
- Tracked __pycache__/*.pyc files removed

Documentation:
- OPTIMIZATION.md: problem statement, methodology, per-file changes, design decisions (why 3 planners aren't fully vectorized), heap fragmentation analysis, validated benchmarks, future work

Tests: 78 tests + 7 benchmark scripts
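The sparse-graph items above can be sketched with SciPy's own primitives: a csr_matrix adjacency, scipy.sparse.csgraph.connected_components for components, and row sums for node degree. The helper build_adjacency and the toy edge list are hypothetical; the real utils/swarmgraph.py may be organized differently.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def build_adjacency(n, edges):
    """Build a symmetric n x n sparse adjacency matrix from undirected edges."""
    rows = [i for i, j in edges] + [j for i, j in edges]
    cols = [j for i, j in edges] + [i for i, j in edges]
    data = np.ones(len(rows), dtype=np.int32)
    return csr_matrix((data, (rows, cols)), shape=(n, n))

# Toy graph: a chain 0-1-2, a pair 3-4, and an isolated node 5.
edges = [(0, 1), (1, 2), (3, 4)]
A = build_adjacency(6, edges)

# Component labelling runs on the sparse structure without densifying it.
n_components, labels = connected_components(A, directed=False)

# Node degree is the row sum of the adjacency matrix.
degrees = np.asarray(A.sum(axis=1)).ravel()
```

Storage grows with the number of edges rather than n², which is consistent with the memory reduction the PR reports for large n.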
Validated Results
Full benchmark data in OPTIMIZATION.md and reproducible via tests/benchmark_*.py.

Co-Authored-By: Oz <oz-agent@warp.dev>