Note: This documentation was auto-generated by Claude Haiku 4.5 (Agent) on 2026-03-14. Use with caution.
Project: Multi-Agent Coordination Simulator
Version: 2.0
Author: P.T. Jardine, PhD
Description: An open-architecture multi-agent simulator for academic researchers.
- Project Overview
- Core Architecture
- Configuration System
- Agents Module
- Planner Techniques
- Orchestrator
- Simulation Parameters
- Learning Modules
- Obstacles & Targets
- Visualization & Data
- Utilities & Graph Tools
- Running Simulations
This simulator implements decentralized, asynchronous multi-agent coordination strategies based on flocking, lattice formation, and other swarm behaviors. Key principles:
- Decentralized: No global controller; each agent decides independently
- Asynchronous: Agents update at different times based on local information
- Local Information Only: Each agent only knows about nearby neighbors within sensing range
Supported dynamics models:
- Double Integrator: Simplified model (position & velocity)
- Quadcopter: Full quadrotor aerodynamics with nested control loops (velocity, attitude, angular rate)
Features:
- 2D and 3D simulations
- Various agent shapes and visual representations
- Multiple swarming techniques (8 implemented)
- Heterogeneous lattice formation with learning
- Obstacle avoidance and wall collision detection
- Malicious agent detection and mitigation
- Reinforcement learning integration (CALA, Q-learning)
The simulator follows a layered architecture:
┌──────────────────────────────────────┐
│ Visualization & Plotting │ (visualization/)
├──────────────────────────────────────┤
│ Data Manager & Experiment │ (data/, experiments/)
│ Orchestration │
├──────────────────────────────────────┤
│ Learning Modules (CALA, Q-learning)│ (learner/)
├──────────────────────────────────────┤
│ Orchestrator (Master Controller) │ (orchestrator.py)
├──────────────────────────────────────┤
│ Planner Techniques │ (planner/techniques/)
├──────────────────────────────────────┤
│ Agents | Targets | Obstacles | Graph│ (agents/, targets/, obstacles/, utils/)
├──────────────────────────────────────┤
│ Configuration Layer │ (config/config.json)
└──────────────────────────────────────┘
- Initialization (main.py): Load configuration, build system components
- Simulation Loop: Run from Ti to Tf seconds with timestep Ts:
- Update agent positions and velocities
- Compute sensing/connection graphs
- Plan trajectories based on strategy
- Compute control commands
- Update learning modules (if enabled)
- Record data
- Finalization: Save data, generate visualizations
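The loop steps above can be sketched end to end with a toy PD pull toward a fixed target standing in for the real planner techniques (the function name `run_sim`, the gains, and the fixed target are illustrative assumptions, not the simulator's API):

```python
import numpy as np

def run_sim(Ti=0.0, Tf=1.0, Ts=0.02, n_agents=3):
    """Minimal simulation loop mirroring the steps above (illustrative:
    a PD pull toward a fixed target stands in for a planner)."""
    rng = np.random.default_rng(42)
    pos = rng.uniform(-5.0, 5.0, (n_agents, 2))   # initial positions
    vel = np.zeros((n_agents, 2))                 # initial velocities
    target = np.array([10.0, 0.0])
    history = []
    n_steps = int(round((Tf - Ti) / Ts))
    for _ in range(n_steps):
        cmd = 0.5 * (target - pos) - 0.8 * vel    # compute control commands
        vel += cmd * Ts                           # update velocities
        pos += vel * Ts                           # update positions
        history.append(pos.copy())                # record data
    return np.array(history)

hist = run_sim()
```

With the default Ts = 0.02 and Tf = 1.0 this records 50 timesteps; the real loop additionally recomputes graphs and learning updates each step.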
All simulation parameters are centralized in config/config.json. The configuration is organized into logical sections:
{
"project": {...}, // Metadata
"simulation": {...}, // Overall simulation parameters
"agents": {...}, // Agent properties
"targets": {...}, // Target definitions
"obstacles": {...}, // Obstacle configurations
"orchestrator": {...}, // Master controller settings
"planner": {...}, // Planner-specific parameters
"learner": {...}, // Learning module parameters
"data": {...} // Data recording options
}

Location: agents/
Agents are initialized from agents.py and can use one of two dynamics models.
class Agents:
- .state # [x0..xn, y0..yn, z0..zn, vx0..vxn, vy0..vyn, vz0..vzn]
- .nAgents # Number of agents
- .rAgents # Physical radius of each agent
- .centroid # Center of mass of the swarm
- .vmax / .vmin # Velocity constraints

Configuration Parameter: agents.dynamics = "double integrator"
Equations of Motion:
ẋ = v (velocity is state)
v̇ = u (acceleration is control input)
Parameters:
- nAgents: Number of agents in the swarm
- rAgents: Physical radius of agents (collision avoidance buffer)
- iSpread: Initial spread distance - agents randomly distributed within ±iSpread
- init_conditions: How to initialize agent positions:
  - "random": Uniformly random within iSpread
  - "mesh": Regular grid arrangement
  - "evenly_spaced": Manually specified positions
- vmax/vmin: Velocity magnitude constraints (±5 m/s typical)
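One double-integrator step under these constraints can be sketched with forward Euler and a simple norm-based velocity clamp (the simulator's actual integrator may differ):

```python
import numpy as np

def double_integrator_step(x, v, u, Ts=0.02, vmax=5.0):
    """One forward-Euler step of x_dot = v, v_dot = u, saturating |v| at vmax."""
    v_new = v + u * Ts
    speed = np.linalg.norm(v_new)
    if speed > vmax:                      # enforce the velocity constraint
        v_new = v_new * (vmax / speed)
    x_new = x + v_new * Ts
    return x_new, v_new

x, v = np.zeros(3), np.zeros(3)
x, v = double_integrator_step(x, v, u=np.array([0.0, 0.0, 100.0]))
```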
Configuration Parameter: agents.dynamics = "quadcopter"
Full 6-DOF nonlinear quadrotor model with:
- Thrust from 4 rotors
- 3 nested control loops:
- Velocity Control: Converts desired translational acceleration → attitude command
- Attitude Control: Converts attitude error → angular velocity command
- Angular Velocity Control: Converts angular velocity error → motor commands
Additional Quadcopter Config (agents/quadcopter_module/config.py):
- Mass, inertia matrix
- Motor parameters and limits
- Control gains for each loop
- Rotor speed bounds
Location: planner/techniques/
All planners inherit from BasePlanner and implement a common interface:
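As an illustration of that interface, a toy PD planner might look like this (the class name, gains, and the `states`/`targets` layout are assumptions for the sketch; real planners subclass BasePlanner in planner/techniques/):

```python
import numpy as np

class GoToTargetPlanner:
    """Toy planner exposing the common compute_cmd interface.

    Illustrative only: the per-agent (position, velocity) dict layout
    and the PD gains are assumptions, not the simulator's data model.
    """
    def __init__(self, kp=1.0, kd=2.0):
        self.kp, self.kd = kp, kd

    def compute_cmd(self, states, targets, index, **kwargs):
        # PD acceleration command toward the agent's target
        p, v = states[index]
        return self.kp * (targets[index] - p) - self.kd * v

planner = GoToTargetPlanner()
u = planner.compute_cmd({0: (np.zeros(2), np.zeros(2))},
                        {0: np.array([1.0, 0.0])}, index=0)
```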
def compute_cmd(self, states, targets, index, **kwargs):
    """Compute control command for agent `index`"""
    return u  # acceleration/command vector

Module: flocking_saber.py
Strategy Name: "flocking_saber"
Purpose: Distributed flocking with collision avoidance. Agents reach consensus on velocity while maintaining desired inter-agent distance.
Algorithm Components:
u = u_int + u_nav + u_obs
u_int = agent-agent interaction (repulsion + velocity alignment)
u_nav = navigation toward target
u_obs = obstacle/wall avoidance
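The `eps` parameter below regularizes the sigma-norm used throughout Olfati-Saber's formulation, a smooth stand-in for the Euclidean norm that is differentiable at the origin:

```python
import numpy as np

def sigma_norm(z, eps=0.1):
    """Olfati-Saber's sigma-norm: zero at the origin, differentiable
    everywhere, and growing like ||z||/sqrt(eps) for large ||z||."""
    return (np.sqrt(1.0 + eps * np.dot(z, z)) - 1.0) / eps

s0 = sigma_norm(np.zeros(2))            # smooth at z = 0
s1 = sigma_norm(np.array([100.0, 0.0]))
```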
Configuration Parameters (planner.techniques.flocking_saber):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| a, b | float | 5.0 | Uneven sigmoid parameters for smooth potential functions |
| eps | float | 0.1 | Regularization constant (prevents singularities in smooth norms) |
| h | float | 0.2 | Bump function parameter (smooth transition zone width) |
| c1_a, c2_a | float | 1.0, 2.0 | Agent-agent interaction gains: position and velocity coupling coefficients |
| c1_b, c2_b | float | 0.0 | Obstacle avoidance gains: typically zero if no obstacles |
| c1_g, c2_g | float | 2.0, 4.472 | Navigation (goal) gains: strength of target tracking |
| d | float | 10.0 | Desired inter-agent distance (formation spacing) |
| d_prime | float | 6.0 | Target-obstacle distance (safety margin from obstacles) |
| r | float | 13.0 | Sensing range (communication/detection radius for neighbors) |
| r_prime | float | 7.8 | Obstacle sensing range |
Use Cases: General consensus, formation flying, lattice assembly
Module: flocking_reynolds.py
Strategy Name: "flocking_reynolds"
Purpose: Classic Reynolds flocking with three rules: separation (avoid crowding), alignment (match velocity), cohesion (stay together). Simpler than Olfati-Saber.
Algorithm Components:
u = Σ [w_sep·u_sep + w_align·u_align + w_coh·u_coh] + u_target
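The three rules can be sketched directly from this sum (array layouts and the helper name are assumptions; the weights mirror cd_1-cd_3 below):

```python
import numpy as np

def reynolds_cmd(p, v, i, r=10.0, r_prime=5.0,
                 w_coh=0.3, w_align=0.4, w_sep=0.2):
    """Classic three-rule Reynolds command for agent i (illustrative)."""
    d = np.linalg.norm(p - p[i], axis=1)
    nbr = (d > 0) & (d < r)                    # neighbors in sensing range
    if not np.any(nbr):
        return np.zeros(p.shape[1])
    u_coh = p[nbr].mean(axis=0) - p[i]         # cohesion: toward neighbor centroid
    u_align = v[nbr].mean(axis=0) - v[i]       # alignment: match neighbor velocity
    close = nbr & (d < r_prime)
    u_sep = np.zeros(p.shape[1])
    if np.any(close):
        u_sep = (p[i] - p[close]).sum(axis=0)  # separation: push away from crowding
    return w_coh * u_coh + w_align * u_align + w_sep * u_sep

p = np.array([[0.0, 0.0], [2.0, 0.0], [20.0, 0.0]])
v = np.zeros((3, 2))
u = reynolds_cmd(p, v, i=0)
```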
Configuration Parameters (planner.techniques.flocking_reynolds):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| escort | {0,1} | 1 | Target tracking mode: 1 = track moving target, 0 = follow swarm centroid |
| cd_1 | float | 0.3 | Cohesion weight (move toward centroid of neighbors) |
| cd_2 | float | 0.4 | Alignment weight (match neighbor velocities) |
| cd_3 | float | 0.2 | Separation weight (repel from crowded neighbors) |
| cd_track | float | 0.2 | Target tracking weight (only used if escort = 1) |
| maxu | float | 10 | Max acceleration magnitude per rule |
| maxv | float | 100 | Max velocity magnitude |
| recovery | {0,1} | 0 | Auto-recovery: trigger if swarm disperses beyond far_away |
| far_away | float | 300 | Dispersal threshold (triggers recovery if exceeded) |
| mode_min_coh | {0,1} | 0 | Enforce minimum cohesion: require at least agents_min_coh neighbors in sight |
| agents_min_coh | int | 2 | Minimum cohesion group size |
| r | float | 10 | Neighbor sensing range |
| r_prime | float | 5 | Separation/collision avoidance range |
Use Cases: Schooling behavior, swarm aggregation, simple coordinated movement
Module: lemniscates.py
Strategy Name: "lemniscates"
Purpose: Agents follow lemniscate (figure-8) or Gerono curves around a target, creating dynamic, flowing patterns. Can be combined with reinforcement learning to optimize circular path radius.
Algorithm Components:
u = -c1_d·(p - p_desired) - c2_d·v
where p_desired follows lemniscate curve around target
Lemniscate Types:
- Type 0: Gerono (surveillance) - rotates smoothly around target
- Type 1: Gerono (rolling) - rolling motion pattern
- Type 2: Gerono (mobbing) - enclosing pattern with vertical offset
- Type 3-5: Explicit curves (Dumbbell, Bernoulli) for diversity
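For reference, the textbook Gerono parameterization of a desired point on the figure-8 (the simulator's own trajectory code may differ):

```python
import numpy as np

def gerono_point(theta, a=5.0, center=(0.0, 0.0)):
    """Point on a Gerono lemniscate (figure-8) of half-width a about center."""
    cx, cy = center
    x = cx + a * np.cos(theta)
    y = cy + a * np.sin(theta) * np.cos(theta)   # = (a/2) * sin(2*theta)
    return np.array([x, y])

# sample one full figure-8 around the origin
pts = np.array([gerono_point(t) for t in np.linspace(0, 2 * np.pi, 100)])
```

Tracking this curve with the PD gains c1_d, c2_d yields the flowing patterns described above.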
Configuration Parameters (planner.techniques.lemniscates):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| c1_d, c2_d | float | 1.0, 2.0 | Tracking gains: position and velocity coupling for trajectory tracking |
| lemni_type | {0-5} | 0 | Lemniscate shape: 0 = Gerono surveillance, 1 = rolling, 2 = mobbing, 3-5 = other curves |
| learning | str | null | Learning method: null (disabled) or "CALA" (Collaborative Adaptive Learning Algorithm) |
| learning_axes | str | "xz" | Learning dimensions: "x" (sagittal), "z" (vertical), or "xz" (coupled) |
| learning_coupling | bool | true | Coupled learning: if true, x and z parameters are linked (recommended for "xz") |
Learning Integration (CALA):
- Agents adaptively adjust the lemniscate size during simulation
- Reward signal: agreement with neighbors on path parameters
- Only works with lemni_type = 0 (Gerono surveillance)
Use Cases: Surveillance patterns, dynamic swarm choreography, continuous circular coverage
Module: encirclement.py
Strategy Name: "encirclement"
Purpose: Forms agents into a perfect circle around a target and rotates them together with controlled angular velocity. Maintains even spacing on the circle perimeter.
Algorithm Components:
u = -c1_d·(p - p_circle) - c2_d·(v - v_circle)
where p_circle is the agent's assigned position on the circle
v_circle is the required velocity for circular motion
Formation Maintenance:
- Agents assigned angular positions around target
- Leading/lagging pairs maintain relative spacing
- Angular velocity synchronized across swarm
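The reference terms p_circle and v_circle follow from basic kinematics of uniform circular motion (horizontal plane shown; parameter names mirror the config below):

```python
import numpy as np

def circle_ref(target, phi, r_desired=5.0, phi_dot_d=0.05):
    """Assigned position and velocity on a horizontal circle around target,
    where phi is the agent's assigned angle on the perimeter."""
    p_circle = target + r_desired * np.array([np.cos(phi), np.sin(phi)])
    # tangential velocity for steady rotation at phi_dot_d [rad/s]
    v_circle = r_desired * phi_dot_d * np.array([-np.sin(phi), np.cos(phi)])
    return p_circle, v_circle

p_ref, v_ref = circle_ref(np.zeros(2), phi=0.0)
```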
Configuration Parameters (planner.techniques.encirclement):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| c1_d, c2_d | float | 2.0, 2.8284 | Position/velocity tracking gains |
| r_max | float | 50 | Neighbor sensing range (for relative position feedback) |
| r_desired | float | 5 | Encirclement radius (distance from target to agents) |
| phi_dot_d | float | 0.05 | Desired angular velocity [rad/s] - how fast the circle rotates |
| ref_plane | str | "horizontal" | Reference plane: "horizontal" (x-y) or "vertical" (x-z) |
| quat_0_* | float | 0.0 | Orientation quaternion components for 3D rotation of the formation disc |
Use Cases: Circular surveillance, contained herding, orbital formation control
Module: pinning_lattice.py
Strategy Name: "pinning_lattice"
Purpose: Advanced flocking where agents form heterogeneous (variable-spacing) lattices that can be optimized via reinforcement learning. Supports multiple potential functions and topology-aware optimization.
Algorithm Components:
u = u_a(interaction) + u_b(obstacle) + u_g(navigation)
u_a = repulsion(method) + velocity_alignment
u_b = obstacle repulsion
u_g = target tracking
Potential Function Methods:
- default: Olfati-Saber sigmoid (smooth, tunable)
- lennard_jones: LJ = (1/r^12) - (1/r^6) (molecular dynamics inspired)
- morse: Exponential + Gaussian (biomimetic)
- gromacs_soft_core: Soft-core potential (integrable, smooth)
- mixed: Combination of above
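As an example of one option, here is a 12-6 Lennard-Jones potential with its minimum shifted to the desired spacing d (the exact scaling used by the module is an assumption of this sketch):

```python
import numpy as np

def lennard_jones(r, d=10.0, eps_lj=1.0):
    """12-6 Lennard-Jones potential whose minimum sits at the desired
    spacing d; eps_lj sets the well depth."""
    sigma = d / 2.0 ** (1.0 / 6.0)   # LJ minimum lies at 2^(1/6) * sigma
    sr6 = (sigma / r) ** 6
    return 4.0 * eps_lj * (sr6 ** 2 - sr6)

# the potential is lowest at the desired inter-agent distance
vals = [lennard_jones(r) for r in (8.0, 10.0, 12.0)]
```

Agents descending this potential are repelled when closer than d and attracted when farther, which is exactly the lattice-spacing behavior the method exploits.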
Configuration Parameters (planner.techniques.pinning_lattice):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| hetero_lattice | {0,1} | 1 | Heterogeneous lattice: allow variable spacing, negotiate via consensus |
| learning | {0,1} | 0 | Enable RL lattice optimization: agents learn optimal inter-agent distances |
| learning_grid_size | int | -1 | RL grid resolution (-1 = use 10×10 grid) |
| flocking_method | str | "default" | Potential function: "default" (Olfati-Saber), "lennard_jones", "morse", "gromacs_soft_core", "mixed" |
| r_max | float | 15 | Maximum sensing range |
| d_min | float | 5 | Minimum allowed spacing (collision avoidance lower bound) |
| d | float | 10 | Initial desired spacing (formation base unit) |
| d_prime_ratio | float | 0.6 | Obstacle distance ratio: d_prime = d_prime_ratio × d |
| r_prime_ratio | float | 1.3 | Obstacle detection ratio: r_prime = r_prime_ratio × d |
| c1_a, c2_a | float | 0.1, 0.2 | Interaction gains: position and velocity |
| c1_b, c2_b | float | 0, 0 | Obstacle avoidance gains |
| c1_g, c2_g | float | 0.2, 0.4472 | Navigation gains |
Consensus Mechanism (when hetero_lattice=1):
- Consensus module reaches agreement on inter-agent distances
- Each neighbor pair converges to a negotiated spacing
- Supports multi-agent consensus on lattice parameters
Learning (when learning=1):
- Q-learning agent optimizes lattice scale for each agent group
- Reward signal: maximize k-connectivity while minimizing energy
- Actions: increase/decrease preferred distance d
Use Cases: Self-assembling formations, optimized lattice assembly, topology-aware coordination
Module: shepherding.py
Strategy Name: "shepherding"
Purpose: Separates the swarm into shepherds (guides) and a herd (targets). Herd agents flock together while shepherds push them toward a goal using calculated positioning.
Agent Roles:
- Shepherds (n_shepherds agents): Control and guide the herd toward target
- Herd (remaining agents): Follow flocking rules and respond to shepherds
Herd Behavior:
u_herd = u_repulsion + u_orientation + u_attraction + u_shepherd_response
Shepherd Behavior:
u_shepherd = u_nav(target) + u_repulsion(herd) + u_repulsion(shepherds) + u_obstacle
Shepherds position themselves on opposite side of herd from target
(pushing from behind to drive the herd forward)
Configuration Parameters (planner.techniques.shepherding):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| nShepherds | int | 5 | Number of shepherd agents (rest are herd) |
| **Herd Parameters** | | | |
| r_R | float | 3 | Repulsion radius (separate from crowding herd members) |
| r_O | float | 5 | Orientation radius (align with nearby herd) |
| r_A | float | 7 | Attraction radius (cohere toward herd center) |
| r_I | float | 6.5 | Shepherd interaction radius (respond to nearby shepherds) |
| a_R, a_O, a_A, a_I | float | 2, 2, 2, 4 | Gains for repulsion, orientation, attraction, shepherd response |
| a_V | float | 2 | Laziness gain (desire to slow down/rest) |
| **Shepherd Parameters** | | | |
| r_S | float | 5.5 | Desired radius from herd centroid (positioning distance) |
| r_Oi | float | 3 | Obstacle viewing range (for other shepherds) |
| r_Od | float | 2 | Desired clearance from obstacles |
| r_Or | float | 1 | Shepherd physical radius |
| a_N | float | 5 | Navigation/pushing gain (strength of target tracking) |
| a_R_s, a_R_s_v | float | 1, 2 | Shepherd-shepherd repulsion: position and velocity gains |
| a_V_s | float | 1.0 | Shepherd laziness gain |
| type_shepherd | str | "haver" | Positioning method: "haver" (haversine) - traditional approach |
| type_avoid | str | "ref_point" | Collision avoidance method: "ref_point" (recommended) or "ref_shepherd" |
| cmd_adjust | float | 0.02 | Command adjustment factor (typically ~0.02-0.05) |
Positioning Strategy:
- Shepherds compute herd centroid
- Position themselves opposite the target (pushing from behind)
- Maintain formation while pushing herd forward
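That positioning strategy reduces to simple geometry, sketched below (the "haver" method in the module is more involved than this):

```python
import numpy as np

def shepherd_post(herd_pos, target, r_S=5.5):
    """Desired shepherd post: r_S behind the herd centroid, on the side
    opposite the target, so pushing drives the herd toward the goal."""
    centroid = herd_pos.mean(axis=0)
    away = centroid - target
    away = away / np.linalg.norm(away)    # unit vector from target to herd
    return centroid + r_S * away

herd = np.array([[9.0, 1.0], [11.0, -1.0]])   # centroid at (10, 0)
post = shepherd_post(herd, target=np.zeros(2))
```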
Use Cases: Livestock herding, crowd control, goal-directed swarm manipulation
Module: flocking_starling.py
Strategy Name: "flocking_starling"
Purpose: Bio-inspired flocking based on European starling murmuration behavior. Models a dynamic, topological interaction range and roosting behavior.
Key Feature - Topological Interaction:
- Each agent tries to maintain sight of exactly n_c nearest neighbors
- Interaction radius adapts dynamically if neighbors are too far/near
- Creates natural, organic-looking flocking patterns
Algorithm Components:
u = u_separation + u_cohesion + u_alignment + u_roost + u_random
Each component weighted by w_s, w_c, w_a, w_roost_h/v, w_rand
Configuration Parameters (planner.techniques.flocking_starling):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| **Speed & Dynamics** | | | |
| v_o | float | 10 | Cruise speed (desired flying speed) |
| m | float | 0.08 | Agent mass (affects inertia) |
| tau | float | 0.2 | Relaxation time (return to cruise speed timescale) |
| del_u | float | 0.1 | Reaction time (delay in detecting new neighbors) |
| s | float | 0.01 | Interpolation factor (adaptation speed for interaction radius) |
| **Sensing & Interaction** | | | |
| R_max | float | 100 | Maximum interaction radius (hard limit) |
| n_c | float | 6.5 | Topological interaction count (maintain sight of n_c nearest neighbors) |
| r_sep | float | 10 | Separation radius (repel from closer neighbors) |
| r_h | float | 0.2 | Hard sphere radius (ignore collisions below this) |
| **Roosting (Target Attraction)** | | | |
| r_roost | float | 50 | Roosting zone radius (attraction zone around target) |
| w_roost_h, w_roost_v | float | 0.2, 0.1 | Roosting weights: horizontal (x-y) and vertical (z) components |
| C_c | float | 0.35 | Centrality threshold (interior/exterior classification for roosting) |
| **Behavioral Weights** | | | |
| w_s, w_c, w_a | float | 1, 0.7, 0.2 | Weights for separation, cohesion, alignment |
| w_rand | float | 0.05 | Random disturbance weight (adds natural variability) |
| **Shape & Smoothness** | | | |
| alpha | float | 0.5 | Tightness parameter (0 = loose swarm, 1 = tight formation) |
| sigma_param | float | 4.60517 | Gaussian shape parameter (separation force smoothness) |
| eps | float | 1e-5 | Regularization constant (prevent divide-by-zero) |
Unique Features:
- Agents track interior/exterior position relative to swarm center
- Roosting behavior: agents attracted to target when in roost zone
- Natural swarm cohesion without explicit distance targets
- Emergent murmurating patterns
Use Cases: Bio-inspired swarms, organic flocking, dynamic coordination without fixed formations
Module: malicious_agent.py
Strategy Name: "malicious_agent"
Purpose: Flocking algorithm that detects and mitigates malicious agents attempting to disrupt swarm cohesion. Uses adaptive control and parameter estimation.
Algorithm Structure (3 layers):
Layer 1: Nominal flocking (consensus on velocity)
Layer 2: Cooperative/adversarial detection
Layer 3: Adaptive gain adjustment to maintain connectivity
Configuration Parameters (planner.techniques.malicious_agent):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| **Formation Parameters** | | | |
| d | float | 5 | Desired separation distance |
| r | float | 7.07 | Sensing range |
| gain_p, gain_v | float | 1, 0 | Navigation gains (typically low for this method) |
| **Layer 1: Flocking Gains** | | | |
| kv | float | 3 | Velocity (consensus) gain |
| ka | float | 1 | Alignment gain |
| kr | float | 2 | Repulsion gain |
| **Layer 2: Malice Counter-Control** | | | |
| kx | float | 2 | Layer 2 gain (for malice detection/countering) |
| d_bar | float | 3.536 | Malicious agent separation (typically d/√2) |
| i_cont | float | 0.2 | Integrating constant (potential function parameter) |
| **Layer 3: Adaptive Connectivity** | | | |
| gamma_kp | float | 2 | Gamma proportional constant (gain adaptation rate) |
| H_min | float | 100 | Minimum H threshold (swarm robustness lower bound) |
| **Malicious Agent Mode** | | | |
| mode_malicious | {0,1} | 1 | Enable malice detection (1 = yes, 0 = no) |
| mal_type | str | "collider" | Attack type: "runaway" (escape), "collider" (crash), "cooperative" (mimics normal) |
| filter_v_gain | float | 50 | State estimator filter gain (for parameter estimation) |
| cmd_min, cmd_max | float | -100, 100 | Command saturation limits |
Attack Types:
- Runaway: Malicious agent tries to escape (detected by unusual repulsion)
- Collider: Crashes into swarm (detected by collision attempts)
- Cooperative: Mimics normal behavior but part of coordinated attack
Detection Mechanism:
- Monitors agent velocities and relative positions
- Estimates malicious agent parameters adaptively
- Computes H-metric (connectivity measure)
- Adjusts gains when swarm connectivity threatened
Use Cases: Robust autonomous swarms, adversarial scenarios, resilience testing
Location: orchestrator.py
The orchestrator is the master controller that:
- Initializes all system components (agents, targets, obstacles, planner)
- Updates sensing/connection graphs based on agent positions
- Selects pinning agents (if applicable) for enhanced control
- Computes control commands for each agent with the selected planner
- Updates learning modules (if enabled)
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| pin_update_rate | int | 5 | Update frequency [timesteps] for re-selecting pinned agents |
| pin_selection_method | str | "degree" | Method to select pin agents: "degree" (highest degree centrality, most connected), "degree_leafs" (degree + include leaves, isolated nodes), "gramian" [future] (controllability gramian-based), "between" [future] (betweenness centrality), "nopins" (no pinning, fully decentralized), "allpins" (all agents are pinned) |
| criteria_table.radius | bool | true | Graph construction criterion: use Euclidean distance radius |
| criteria_table.aperature | bool | false | Graph construction criterion: use field-of-view aperture angle |
| sensor_aperature | float | 140 | Field-of-view angle [degrees] (if the aperture criterion is used) |
| learning_ctrl | str | null | Global learning controller: null (disabled) or "CALA" |
| connectivity_slack | float | 1 | Relaxation parameter for connectivity computations |
The orchestrator maintains four graph structures:
- sensor_range_matrix: W-adjacency based on sensing range (symmetric)
- interaction_graph: Who agents sense (for flocking)
- connection_range_matrix: Who agents are connected to (lattice)
- connection_graph: Topology for connectivity analysis
Some planners (especially pinning_lattice) use pinned agents - agents whose positions are fixed or strongly controlled to stabilize the entire formation. The orchestrator:
- Recomputes pinning assignments every pin_update_rate timesteps
- Selects the best candidates using pin_selection_method
- Passes pin assignments to the planner
Location: config/config.json → simulation section
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| Ti | float | 0 | Initial time [seconds] |
| Tf | float | 30 | Final time [seconds] |
| Ts | float | 0.02 | Timestep [seconds] (50 Hz simulation rate) |
| dimens | {2, 3} | 2 | Simulation dimensionality: 2D or 3D |
| verbose | {0, 1, 2} | 1 | Output verbosity: 0 (silent), 1 (normal), 2 (debug) |
| system | str | "swarm" | System type: currently only "swarm" supported |
| strategy | str | "shepherding" | Swarming strategy (planner technique to use) |
| random_seed | int | 42 | Random number seed (for reproducibility) |
| f | int | 0 | [Deprecated] Legacy parameter |
| experimental_save | bool | false | Save to experiments folder (consolidates data, plots, configs) |
| obstacle_avoidance_strategy | str | "flocking_saber" | Strategy for obstacle avoidance in navigation |
Set strategy to one of:
- "flocking_saber" - Olfati-Saber flocking with obstacles
- "flocking_reynolds" - Reynolds boids flocking
- "lemniscates" - Figure-8 trajectories
- "encirclement" - Circular formation rotation
- "pinning_lattice" - Heterogeneous lattice with learning
- "shepherding" - Shepherd-herd dynamics
- "flocking_starling" - Bio-inspired murmurations
- "malicious_agent" - Robust flocking with adversarial agents
Location: learner/
The simulator includes several adaptive learning strategies that allow agents to optimize their behavior during simulation.
learner/conductor.py initializes and coordinates learning modules:
def initialize(Agents, tactic_type, learning_ctrl, Ts, config):
    Learners = {}
    # Initialize appropriate learners based on strategy
    return Learners

Module: learner/CALA_control.py
Gradient-free, distributed learning algorithm for multi-agent parameter optimization.
Parameters (learner.CALA section):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| **Actions** | | | |
| actions_range | str | "angular" | Action space: "angular" (angles/directions) or "linear" |
| action_min, action_max | float | -0.785, 0.785 | Action bounds [radians] (±45°) |
| **Learning Dynamics** | | | |
| learning_rate | float | 0.5 | Parameter step size (how much to adjust per update) |
| variance_init | float | 0.4 | Initial exploration variance |
| variance_ratio | float | 0.5 | Variance decay ratio (annealing factor) |
| variance_min, variance_max | float | 0.0001, 10 | Variance bounds |
| epsilon | float | 1e-6 | Regularization constant |
| **Update Frequency** | | | |
| counter_max | int | 100 | Update threshold (compute gradient after 100 samples) |
| counter_synch | bool | true | Synchronize updates across agents |
| counter_delay | int | 500 | Delay before update [timesteps] (let system stabilize) |
| **Exploration** | | | |
| explore_dirs | bool | true | Explore random directions (drift + gradient) |
| explore_persistence | float | 0.7 | Exploration persistence (0 = random, 1 = follow direction) |
| **Multi-Agent Coordination** | | | |
| leader_follower | bool | true | Use leader-follower structure (coordinator + agents) |
| leader | int | 0 | Leader agent index |
| **Reward Design** | | | |
| reward_mode | str | "target" | Reward source: "target" (distance to goal) or "swarm" (cohesion) |
| reward_coupling | float | 2 | Reward coupling strength (how much agents influence each other) |
| reward_reference | str | "global" | Reference frame: "global" (fixed) or "relative" (moving with swarm) |
| reward_form | str | "sharp" | Reward function shape: "sharp" (discontinuous) or "smooth" (continuous) |
| reward_k_theta | float | 12.0 | Reward steepness parameter |
| **Advanced Options** | | | |
| momentum | bool | false | Enable momentum (accumulate direction) |
| momentum_beta | float | 0.8 | Momentum decay coefficient |
| annealing | bool | false | Variance annealing (gradual decrease) |
| annealing_rate | float | 0.99 | Annealing decay per cycle |
| kicking | bool | false | Kicking to escape local optima |
| kicking_factor | float | 1.3 | Kick magnitude factor |
| sigmoidize | bool | false | Apply sigmoid to actions (smooth clipping) |
Use Cases:
- Learning lemniscate trajectory shape (with the "lemniscates" strategy)
- Optimizing flocking parameters online
- Global learning controller (orchestrator.learning_ctrl = "CALA")
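The core CALA idea - sample an action from a Gaussian policy and pull the mean toward well-rewarded actions - can be sketched as follows (the update rule is simplified; the real module also adapts the exploration variance and couples rewards across agents):

```python
import numpy as np

def cala_update(mu, action, reward, learning_rate=0.5):
    """Simplified CALA-style mean update: move the policy mean toward a
    sampled action in proportion to the reward it earned."""
    return mu + learning_rate * reward * (action - mu)

# gradient-free optimization of a 1-D reward peaked at 0.5
rng = np.random.default_rng(0)
mu, sigma = 0.0, 0.4
for _ in range(500):
    a = rng.normal(mu, sigma)                       # explore around the mean
    mu = cala_update(mu, a, float(np.exp(-(a - 0.5) ** 2)))
```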
Module: learner/QL_learning_lattice.py
Reinforcement learning module that optimizes desired inter-agent distances for pinning lattice control.
How It Works:
- Discretizes distance space into grid (default 10×10)
- Each agent learns optimal distance for its local neighborhood
- Q-values trained from reward signal (connectivity vs. energy)
- Works in conjunction with consensus lattice mechanism
Integration:
- Enabled when: planner.techniques.pinning_lattice.learning = 1
- Requires: hetero_lattice = 1 (consensus mechanism)
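The underlying update is standard tabular Q-learning (the state/action discretization shown here is an assumption):

```python
import numpy as np

def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update: Q(s,a) moves toward
    reward + gamma * max_a' Q(s', a')."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
    return Q

Q = np.zeros((10, 2))    # 10 spacing bins x {decrease, increase} actions
Q = q_update(Q, s=4, a=1, reward=1.0, s_next=5)
```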
Module: learner/consensus_lattice.py
Cooperative agreement mechanism for heterogeneous lattice formation. Agents negotiate inter-agent distances via consensus algorithm.
Algorithm:
d_i(t+1) = d_i(t) + α Σ_j (d_j(t) - d_i(t))
^ local distance ^ agreement with neighbors
Integration:
- Automatically enabled when: pinning_lattice.hetero_lattice = 1
- Agents in the same neighborhood converge to a shared desired distance
- Enables topology-aware formation assembly
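One step of this consensus update in code (the adjacency matrix and step size alpha are illustrative):

```python
import numpy as np

def consensus_step(d, adj, alpha=0.1):
    """One step of d_i(t+1) = d_i(t) + alpha * sum_j (d_j - d_i)
    over neighbors j given by adjacency matrix adj."""
    return d + alpha * (adj @ d - adj.sum(axis=1) * d)

d = np.array([8.0, 10.0, 12.0])            # initial preferred distances
adj = np.ones((3, 3)) - np.eye(3)          # fully connected triad
for _ in range(200):
    d = consensus_step(d, adj)
```

On a connected graph this converges to the average of the initial distances, which is the negotiated spacing.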
Location: obstacles/obstacles.py
Obstacles are static or moving barriers that agents must avoid.
Configuration (obstacles section):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| nObs | int | 1 | Number of obstacles |
| vehObs | {0, 1} | 0 | Include agents as obstacles: 1 = agents repel each other in addition to targets |
| oSpread | float | 20 | Random spread radius [distance units] around target (if not manual) |
| manual | bool | true | Manual placement: true = use manual_positions, false = random |
| **Manual Positioning** | | | |
| manual_positions.x, .y, .z | float | -2.4, 1.2, 25 | Obstacle center coordinates |
| manual_positions.radius | float | 1 | Obstacle radius (sphere size) |
Obstacle Structure (internal):
obstacles = [x0..xn, # positions
y0..yn,
z0..zn,
r0..rn] # radii
Note: Reynolds flocking automatically treats the target as an obstacle (repulsive zone).
Location: targets/targets.py
Goals that agents navigate toward or swarm around.
Configuration (targets section):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| tSpeed | float | 0 | Target velocity [m/s] (moving target speed) |
| initial_position | [x, y, z] | [0, 0, 15] | Initial target location |
Example Target Trajectories (implemented in targets.py):
# Sinusoidal motion (current)
targets.x = 100·sin(tSpeed·t)
targets.y = 100·sin(tSpeed·t)·cos(tSpeed·t)
targets.z = 100·sin(tSpeed·t)·sin(tSpeed·t) + 15
# Can be modified for circular, spiral, or custom paths

Location: data/data_manager.py
Handles data recording and I/O.
Configuration (data section):
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| save_data | bool | true | Save simulation results |
| data_dir | str | "data/data/" | Output directory |
| data_file | str | "data.h5" | Output filename (HDF5 format) |
| record_interval | int | 1 | Record every N timesteps (1 = every timestep) |
Recorded Data (History object):
t_all # Time vector
states_all # Agent positions and velocities over time
cmds_all # Control commands issued
targets_all # Target positions
obstacles_all # Obstacle positions
centroid_all # Swarm center of mass
f_all # Fitness/reward values (if learning enabled)

Location: visualization/
Two main visualization tools:
- Animation (visualization/animation_sim.py)
  - Generates 2D/3D animations of swarm behavior
  - Shows agent positions, velocities, interactions
  - Supports various agent shapes and colors
- Plotting (visualization/plot_sim.py)
  - Generates publication-quality plots
  - Position trajectories, velocity profiles
  - Swarm metrics (dispersion, centroid, energy)

Output Folders:
- visualization/animations/ → animated GIFs
- visualization/plots/ → static plots (PNG, PDF)
- visualization/public/ → example outputs organized by technique
Location: utils/swarmgraph.py
Graph representation and analysis tools for swarm topology.
Functions:
- build_graph() - Create adjacency matrix from agent positions
- graph_metrics() - Compute connectivity, centrality measures
- is_connected() - Check if swarm is fully connected
- update_edges() - Dynamic graph updates as agents move
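A sketch of what build_graph() and is_connected() compute (an illustrative reimplementation, not the module's code):

```python
import numpy as np

def build_graph(positions, r=13.0):
    """Adjacency matrix of the radius-r sensing graph."""
    dist = np.linalg.norm(positions[:, None] - positions[None, :], axis=-1)
    return ((dist < r) & (dist > 0)).astype(int)

def is_connected(adj):
    """Graph reachability check from agent 0 (depth-first traversal)."""
    n = len(adj)
    seen, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in np.flatnonzero(adj[i]):
            if int(j) not in seen:
                seen.add(int(j))
                frontier.append(int(j))
    return len(seen) == n

pos = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
adj = build_graph(pos, r=6.0)   # chain: 0-1-2 connected, 0-2 out of range
```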
Location: utils/modeller.py
Trajectory and state estimation functions.
1. Configure your simulation in config/config.json:

   {
     "simulation": {
       "strategy": "shepherding",
       "Tf": 30,
       "dimens": 2
     },
     "agents": {
       "nAgents": 20,
       "dynamics": "double integrator"
     }
   }

2. Run the simulation:

   python main.py

3. Visualize results:

   # Animations and plots auto-generated if enabled
Example - Reynolds flocking (no pinning):

{
  "simulation": {"strategy": "flocking_reynolds", "Tf": 50},
  "agents": {"nAgents": 30, "rAgents": 0.3},
  "orchestrator": {"pin_selection_method": "nopins"}
}

Example - Lemniscates with CALA learning:

{
  "simulation": {"strategy": "lemniscates", "Tf": 60},
  "planner": {
    "techniques": {
      "lemniscates": {
        "lemni_type": 0,
        "learning": "CALA",
        "learning_axes": "xz"
      }
    }
  },
  "learner": {"CALA": {"learning_rate": 0.5}}
}

Example - Pinning lattice with learning:

{
  "simulation": {"strategy": "pinning_lattice", "Tf": 60},
  "planner": {
    "techniques": {
      "pinning_lattice": {
        "hetero_lattice": 1,
        "learning": 1,
        "flocking_method": "lennard_jones"
      }
    }
  }
}

Example - Malicious agent detection:

{
  "simulation": {"strategy": "malicious_agent"},
  "planner": {
    "techniques": {
      "malicious_agent": {
        "mode_malicious": 1,
        "mal_type": "collider"
      }
    }
  }
}

┌─────────────────────────────────────────────────────────┐
│ main.py │
│ 1. Load config/config.json │
│ 2. Initialize orchestrator.build_system() │
│ 3. Run main simulation loop │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ orchestrator.Controller │
│ • Maintains agent states & targets │
│ • Computes sensing/connection graphs │
│ • Selects pinned agents (if applicable) │
│ • Calls planner.compute_cmd() for each agent │
│ • Updates learning modules │
└─────────────────────────────────────────────────────────┘
↓
┌───────────────┼───────────────┐
↓ ↓ ↓
┌─────────┐ ┌──────────┐ ┌──────────────┐
│ Agents │ │ Planner │ │ Learner │
│ (state) │ │ (command)│ │ (optimize) │
└─────────┘ └──────────┘ └──────────────┘
↓ ↓ ↓
┌─────────────────────────────────────────────┐
│ data/data_manager.py │
│ Records states, commands, metrics │
└─────────────────────────────────────────────┘
↓
┌────────────────────────────────────────────┐
│ visualization/ │
│ Generates animations and plots │
└────────────────────────────────────────────┘
- Most parameters are tuned through empirical testing; start with defaults and adjust gains (c1, c2, etc.) to see effects
- Learning modules are in development; expect updates and refinements
- Quadcopter dynamics significantly increase computation; use double integrator for prototyping
- Graph construction criteria (radius vs. aperture) affect neighbor detection; radius is most common
- See docs/devnotes.md for recent changes and known issues
Last Updated: March 2026
Maintained By: Claude Haiku 4.5