Multi-Agent Swarm Simulator - Documentation

Note: This documentation was auto-generated by Claude Haiku 4.5 (Agent) on 2026-03-14. Use with caution.

Project: Multi-Agent Coordination Simulator
Version: 2.0
Author: P.T. Jardine, PhD
Description: An open architecture multi-agent simulator for use by academic researchers.


Table of Contents

  1. Project Overview
  2. Core Architecture
  3. Configuration System
  4. Agents Module
  5. Planner Techniques
  6. Orchestrator
  7. Simulation Parameters
  8. Learning Modules
  9. Obstacles & Targets
  10. Visualization & Data
  11. Utilities & Graph Tools
  12. Running Simulations

Project Overview

This simulator implements decentralized, asynchronous multi-agent coordination strategies based on flocking, lattice formation, and other swarm behaviors. Key principles:

  • Decentralized: No global controller; each agent decides independently
  • Asynchronous: Agents update at different times based on local information
  • Local Information Only: Each agent only knows about nearby neighbors within sensing range

Supported Agent Dynamics

  • Double Integrator: Simplified model (position & velocity)
  • Quadcopter: Full quadrotor aerodynamics with nested control loops (velocity, attitude, angular rate)

Key Features

  • 2D and 3D simulations
  • Various agent shapes and visual representations
  • Multiple swarming techniques (8 implemented)
  • Heterogeneous lattice formation with learning
  • Obstacle avoidance and wall collision detection
  • Malicious agent detection and mitigation
  • Reinforcement learning integration (CALA, Q-learning)

Core Architecture

The simulator follows a layered architecture:

┌──────────────────────────────────────┐
│       Visualization & Plotting       │  (visualization/)
├──────────────────────────────────────┤
│     Data Manager & Experiment        │  (data/, experiments/)
│         Orchestration                │
├──────────────────────────────────────┤
│   Learning Modules (CALA, Q-learning)│  (learner/)
├──────────────────────────────────────┤
│   Orchestrator (Master Controller)   │  (orchestrator.py)
├──────────────────────────────────────┤
│        Planner Techniques            │  (planner/techniques/)
├──────────────────────────────────────┤
│  Agents | Targets | Obstacles | Graph│  (agents/, targets/, obstacles/, utils/)
├──────────────────────────────────────┤
│         Configuration Layer          │  (config/config.json)
└──────────────────────────────────────┘

Execution Flow

  1. Initialization (main.py): Load configuration, build system components
  2. Simulation Loop: Run for Ti to Tf seconds with timestep Ts
    • Update agent positions and velocities
    • Compute sensing/connection graphs
    • Plan trajectories based on strategy
    • Compute control commands
    • Update learning modules (if enabled)
    • Record data
  3. Finalization: Save data, generate visualizations
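
The loop above can be sketched in a few lines of Python. This is an illustrative skeleton only (the function and variable names here are hypothetical, not the simulator's actual API), using a placeholder planner in place of the real technique:

```python
import numpy as np

# Hypothetical sketch of the Ti -> Tf simulation loop with timestep Ts.
# The real main.py delegates to the orchestrator; the control flow is the same idea.
def run_sim(Ti=0.0, Tf=1.0, Ts=0.02, n_agents=3):
    n_steps = int(round((Tf - Ti) / Ts))
    state = np.zeros((n_agents, 4))              # [x, y, vx, vy] per agent (simplified)
    history = []
    for k in range(n_steps):
        t = Ti + k * Ts
        cmds = -0.1 * state[:, 2:4]              # placeholder planner: damp velocity
        state[:, 0:2] += Ts * state[:, 2:4]      # integrate positions
        state[:, 2:4] += Ts * cmds               # integrate velocities
        history.append((t, state.copy()))        # record data each step
    return history

history = run_sim()
```

With the defaults shown (Tf=30, Ts=0.02) the loop would run 1500 steps at a 50 Hz rate.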

Configuration System

All simulation parameters are centralized in config/config.json. The configuration is organized into logical sections:

Structure

{
  "project": {...},           // Metadata
  "simulation": {...},        // Overall simulation parameters
  "agents": {...},            // Agent properties
  "targets": {...},           // Target definitions
  "obstacles": {...},         // Obstacle configurations
  "orchestrator": {...},      // Master controller settings
  "planner": {...},           // Planner-specific parameters
  "learner": {...},           // Learning module parameters
  "data": {...}               // Data recording options
}

Agents Module

Location: agents/

Agent Types

Agents are initialized from agents.py and can use one of two dynamics models.

Agent Class

class Agents:
    - .state          # [x0..xn, y0..yn, z0..zn, vx0..vxn, vy0..vyn, vz0..vzn]
    - .nAgents        # Number of agents
    - .rAgents        # Physical radius of each agent
    - .centroid       # Center of mass of the swarm
    - .vmax / .vmin   # Velocity constraints

Double Integrator Dynamics

Configuration Parameter: agents.dynamics = "double integrator"

Equations of Motion:

ẋ = v  (velocity is state)
v̇ = u  (acceleration is control input)

Parameters:

  • nAgents: Number of agents in the swarm
  • rAgents: Physical radius of agents (collision avoidance buffer)
  • iSpread: Initial spread distance - agents randomly distributed within ±iSpread
  • init_conditions: How to initialize agent positions
    • "random": Uniformly random within iSpread
    • "mesh": Regular grid arrangement
    • "evenly_spaced": Manually specified positions
  • vmax / vmin: Velocity magnitude constraints (±5 m/s typical)
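
One explicit Euler step of the double-integrator model above, with the velocity magnitude clamped to vmax, can be sketched as follows (the function name and 2-D state layout are illustrative; the simulator's own integrator may differ in detail):

```python
import numpy as np

# ẋ = v, v̇ = u, integrated with step Ts; |v| is saturated at vmax.
def di_step(p, v, u, Ts=0.02, vmax=5.0):
    v_new = v + Ts * u
    speed = np.linalg.norm(v_new)
    if speed > vmax:                      # enforce the velocity constraint
        v_new = v_new * (vmax / speed)
    p_new = p + Ts * v_new
    return p_new, v_new

p, v = di_step(np.zeros(2), np.zeros(2), np.array([1.0, 0.0]))
```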

Quadcopter Dynamics

Configuration Parameter: agents.dynamics = "quadcopter"

Full 6-DOF nonlinear quadrotor model with:

  • Thrust from 4 rotors
  • 3 nested control loops:
    1. Velocity Control: Converts desired translational acceleration → attitude command
    2. Attitude Control: Converts attitude error → angular velocity command
    3. Angular Velocity Control: Converts angular velocity error → motor commands

Additional Quadcopter Config (agents/quadcopter_module/config.py):

  • Mass, inertia matrix
  • Motor parameters and limits
  • Control gains for each loop
  • Rotor speed bounds

Planner Techniques

Location: planner/techniques/

All planners inherit from BasePlanner and implement a common interface:

def compute_cmd(self, states, targets, index, **kwargs):
    """Compute control command for agent `index`"""
    return u  # acceleration/command vector
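
A hypothetical minimal planner implementing this interface might look like the following: a simple PD pull toward the target. The class names follow the docs, but the gains, the per-row state layout, and the GoToTarget planner itself are illustrative, not part of the simulator:

```python
import numpy as np

class BasePlanner:
    def compute_cmd(self, states, targets, index, **kwargs):
        raise NotImplementedError

# Toy planner: PD acceleration toward the agent's target.
class GoToTarget(BasePlanner):
    def __init__(self, kp=1.0, kd=2.0):
        self.kp, self.kd = kp, kd

    def compute_cmd(self, states, targets, index, **kwargs):
        p = states[index, 0:2]     # assumed layout: one row [x, y, vx, vy] per agent
        v = states[index, 2:4]
        return self.kp * (targets[index] - p) - self.kd * v

states = np.array([[0.0, 0.0, 0.0, 0.0]])
targets = np.array([[1.0, 0.0]])
u = GoToTarget().compute_cmd(states, targets, 0)
```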

1. Flocking Saber (Olfati-Saber Algorithm)

Module: flocking_saber.py
Strategy Name: "flocking_saber"

Purpose: Distributed flocking with collision avoidance. Agents reach consensus on velocity while maintaining desired inter-agent distance.

Algorithm Components:

u = u_int + u_nav + u_obs
  u_int = agent-agent interaction (repulsion + velocity alignment)
  u_nav = navigation toward target
  u_obs = obstacle/wall avoidance
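
The smooth potentials in all three terms are built on Olfati-Saber's sigma-norm, which uses the eps parameter listed below and, unlike the Euclidean norm, is differentiable at zero. This is the standard form from the flocking literature, shown here as a sketch rather than the module's exact code:

```python
import numpy as np

# Sigma-norm: ||z||_sigma = (sqrt(1 + eps*||z||^2) - 1) / eps.
# eps > 0 regularizes the norm so gradients are well-defined everywhere.
def sigma_norm(z, eps=0.1):
    return (np.sqrt(1.0 + eps * np.dot(z, z)) - 1.0) / eps

origin_val = sigma_norm(np.zeros(2))   # exactly 0 at the origin
far_val = sigma_norm(np.array([3.0, 4.0]))
```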

Configuration Parameters (planner.techniques.flocking_saber):

Parameter Type Default Meaning
a, b float 5.0 Uneven sigmoid parameters for smooth potential functions
eps float 0.1 Regularization constant (prevents singularities in smooth norms)
h float 0.2 Bump function parameter (smooth transition zone width)
c1_a, c2_a float 1.0, 2.0 Agent-agent interaction gains: position and velocity coupling coefficients
c1_b, c2_b float 0.0 Obstacle avoidance gains: typically zero if no obstacles
c1_g, c2_g float 2.0, 4.472 Navigation (goal) gains: strength of target tracking
d float 10.0 Desired inter-agent distance (formation spacing)
d_prime float 6.0 Target-obstacle distance (safety margin from obstacles)
r float 13.0 Sensing range (communication/detection radius for neighbors)
r_prime float 7.8 Obstacle sensing range

Use Cases: General consensus, formation flying, lattice assembly


2. Flocking Reynolds (Boids)

Module: flocking_reynolds.py
Strategy Name: "flocking_reynolds"

Purpose: Classic Reynolds flocking with three rules: separation (avoid crowding), alignment (match velocity), cohesion (stay together). Simpler than Olfati-Saber.

Algorithm Components:

u = Σ [w_sep·u_sep + w_align·u_align + w_coh·u_coh] + u_target
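
The three rules in the sum above can be sketched for one agent given its neighbors' states. This is a simplified illustration (inverse-square separation is one common choice, not necessarily the module's), reusing the cd_1..cd_3 weights from the table below:

```python
import numpy as np

# Reynolds rules for one agent at position p, velocity v,
# given neighbor positions nbr_p and velocities nbr_v.
def reynolds_cmd(p, v, nbr_p, nbr_v, cd_1=0.3, cd_2=0.4, cd_3=0.2):
    u_coh = cd_1 * (nbr_p.mean(axis=0) - p)        # cohesion: toward neighbor centroid
    u_align = cd_2 * (nbr_v.mean(axis=0) - v)      # alignment: match mean velocity
    sep = p - nbr_p                                # separation: away from crowding,
    d2 = (sep ** 2).sum(axis=1, keepdims=True)     # weighted by inverse squared distance
    u_sep = cd_3 * (sep / (d2 + 1e-9)).sum(axis=0)
    return u_coh + u_align + u_sep

u = reynolds_cmd(np.zeros(2), np.zeros(2),
                 np.array([[1.0, 0.0], [3.0, 0.0]]),
                 np.array([[1.0, 0.0], [1.0, 0.0]]))
```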

Configuration Parameters (planner.techniques.flocking_reynolds):

Parameter Type Default Meaning
escort {0,1} 1 Target tracking mode: 1 = track moving target, 0 = follow swarm centroid
cd_1 float 0.3 Cohesion weight (move toward centroid of neighbors)
cd_2 float 0.4 Alignment weight (match neighbor velocities)
cd_3 float 0.2 Separation weight (repel from crowded neighbors)
cd_track float 0.2 Target tracking weight (only used if escort=1)
maxu float 10 Max acceleration magnitude per rule
maxv float 100 Max velocity magnitude
recovery {0,1} 0 Auto-recovery: trigger if swarm disperses beyond far_away
far_away float 300 Dispersal threshold (triggers recovery if exceeded)
mode_min_coh {0,1} 0 Enforce minimum cohesion: require at least agents_min_coh neighbors in sight
agents_min_coh int 2 Minimum cohesion group size
r float 10 Neighbor sensing range
r_prime float 5 Separation/collision avoidance range

Use Cases: Schooling behavior, swarm aggregation, simple coordinated movement


3. Lemniscates (Figure-8 Trajectories)

Module: lemniscates.py
Strategy Name: "lemniscates"

Purpose: Agents follow lemniscate (figure-8) or Gerono curves around a target, creating dynamic, flowing patterns. Can be combined with reinforcement learning to optimize circular path radius.

Algorithm Components:

u = -c1_d·(p - p_desired) - c2_d·v
    where p_desired follows lemniscate curve around target

Lemniscate Types:

  • Type 0: Gerono (surveillance) - rotates smoothly around target
  • Type 1: Gerono (rolling) - rolling motion pattern
  • Type 2: Gerono (mobbing) - enclosing pattern with vertical offset
  • Type 3-5: Explicit curves (Dumbbell, Bernoulli) for diversity
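
For reference, the lemniscate of Gerono underlying types 0-2 has a simple parametric form. The sketch below shows only the curve geometry around a target center; the curve size a is the quantity the CALA learner adapts when learning is enabled (function name and arguments are illustrative):

```python
import numpy as np

# Lemniscate of Gerono: x = a*cos(t), y = a*sin(t)*cos(t),
# a figure-8 whose crossing point sits at the center (the target).
def gerono(t, a=1.0, center=(0.0, 0.0)):
    x = center[0] + a * np.cos(t)
    y = center[1] + a * np.sin(t) * np.cos(t)
    return x, y

x0, y0 = gerono(0.0)            # rightmost point of the figure-8: (a, 0)
xc, yc = gerono(np.pi / 2)      # crossing point: the center itself
```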

Configuration Parameters (planner.techniques.lemniscates):

Parameter Type Default Meaning
c1_d, c2_d float 1.0, 2.0 Tracking gains: position and velocity coupling for trajectory tracking
lemni_type {0-5} 0 Lemniscate shape: 0=Gerono surveillance, 1=rolling, 2=mobbing, 3-5=other curves
learning str null Learning method: null (disabled) or "CALA" (Collaborative Adaptive Learning Algorithm)
learning_axes str "xz" Learning dimensions: "x" (sagittal), "z" (vertical), or "xz" (coupled)
learning_coupling bool true Coupled learning: if true, x and z parameters are linked (recommended for xz)

Learning Integration (CALA):

  • Agents adaptively adjust the lemniscate size during simulation
  • Reward signal: agreement with neighbors on path parameters
  • Only works with lemni_type=0 (Gerono surveillance)

Use Cases: Surveillance patterns, dynamic swarm choreography, continuous circular coverage


4. Encirclement

Module: encirclement.py
Strategy Name: "encirclement"

Purpose: Forms agents into a perfect circle around a target and rotates them together with controlled angular velocity. Maintains even spacing on the circle perimeter.

Algorithm Components:

u = -c1_d·(p - p_circle) - c2_d·(v - v_circle)
    where p_circle is the agent's assigned position on the circle
          v_circle is the required velocity for circular motion

Formation Maintenance:

  • Agents assigned angular positions around target
  • Leading/lagging pairs maintain relative spacing
  • Angular velocity synchronized across swarm
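
The assignment geometry above can be sketched directly: n agents evenly spaced on a circle of radius r_desired about the target, rotating at phi_dot_d. This is an illustrative 2-D helper (the simulator also handles 3D orientation via the quaternion parameters below):

```python
import numpy as np

# Evenly spaced angular slots rotating at phi_dot_d; the tangential
# velocity needed for circular motion has magnitude r_desired * phi_dot_d.
def circle_assignment(n, target, r_desired=5.0, phi_dot_d=0.05, t=0.0):
    phis = 2.0 * np.pi * np.arange(n) / n + phi_dot_d * t
    p = target + r_desired * np.stack([np.cos(phis), np.sin(phis)], axis=1)
    v = r_desired * phi_dot_d * np.stack([-np.sin(phis), np.cos(phis)], axis=1)
    return p, v

p, v = circle_assignment(4, np.zeros(2))
```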

Configuration Parameters (planner.techniques.encirclement):

Parameter Type Default Meaning
c1_d, c2_d float 2.0, 2.8284 Position/velocity tracking gains
r_max float 50 Neighbor sensing range (for relative position feedback)
r_desired float 5 Encirclement radius (distance from target to agents)
phi_dot_d float 0.05 Desired angular velocity [rad/s] - how fast the circle rotates
ref_plane str "horizontal" Reference plane: "horizontal" (x-y) or "vertical" (x-z)
quat_0_* float 0.0 Orientation quaternion components for 3D rotation of formation disc

Use Cases: Circular surveillance, contained herding, orbital formation control


5. Pinning Lattice (Optimized Formation Control)

Module: pinning_lattice.py
Strategy Name: "pinning_lattice"

Purpose: Advanced flocking where agents form heterogeneous (variable-spacing) lattices that can be optimized via reinforcement learning. Supports multiple potential functions and topology-aware optimization.

Algorithm Components:

u = u_a(interaction) + u_b(obstacle) + u_g(navigation)
  u_a = repulsion(method) + velocity_alignment
  u_b = obstacle repulsion
  u_g = target tracking

Potential Function Methods:

  • default: Olfati-Saber sigmoid (smooth, tunable)
  • lennard_jones: LJ = (1/r^12) - (1/r^6) (molecular dynamics inspired)
  • morse: Exponential + Gaussian (biomimetic)
  • gromacs_soft_core: Soft-core potential (integrable, smooth)
  • mixed: Combination of above

Configuration Parameters (planner.techniques.pinning_lattice):

Parameter Type Default Meaning
hetero_lattice {0,1} 1 Heterogeneous lattice: allow variable spacing, negotiate via consensus
learning {0,1} 0 Enable RL lattice optimization: agents learn optimal inter-agent distances
learning_grid_size int -1 RL grid resolution (-1 = use 10×10 grid)
flocking_method str "default" Potential function: "default" (Olfati-Saber), "lennard_jones", "morse", "gromacs_soft_core", "mixed"
r_max float 15 Maximum sensing range
d_min float 5 Minimum allowed spacing (collision avoidance lower bound)
d float 10 Initial desired spacing (formation base unit)
d_prime_ratio float 0.6 Obstacle distance ratio: d_prime = d_prime_ratio × d
r_prime_ratio float 1.3 Obstacle detection ratio: r_prime = r_prime_ratio × d
c1_a, c2_a float 0.1, 0.2 Interaction gains: position and velocity
c1_b, c2_b float 0, 0 Obstacle avoidance gains
c1_g, c2_g float 0.2, 0.4472 Navigation gains

Consensus Mechanism (when hetero_lattice=1):

  • Consensus module reaches agreement on inter-agent distances
  • Each neighbor pair converges to a negotiated spacing
  • Supports multi-agent consensus on lattice parameters

Learning (when learning=1):

  • Q-learning agent optimizes lattice scale for each agent group
  • Reward signal: maximize k-connectivity while minimizing energy
  • Actions: increase/decrease preferred distance d

Use Cases: Self-assembling formations, optimized lattice assembly, topology-aware coordination


6. Shepherding

Module: shepherding.py
Strategy Name: "shepherding"

Purpose: Separates the swarm into shepherds (guides) and a herd (driven agents). Herd agents flock together while shepherds push them toward a goal through strategic positioning.

Agent Roles:

  • Shepherds (n_shepherds agents): Control and guide the herd toward target
  • Herd (remaining agents): Follow flocking rules and respond to shepherds

Herd Behavior:

u_herd = u_repulsion + u_orientation + u_attraction + u_shepherd_response

Shepherd Behavior:

u_shepherd = u_nav(target) + u_repulsion(herd) + u_repulsion(shepherds) + u_obstacle
  Shepherds position themselves on the opposite side of the herd from the target
  (pushing from behind to drive the herd forward)

Configuration Parameters (planner.techniques.shepherding):

Parameter Type Default Meaning
nShepherds int 5 Number of shepherd agents (rest are herd)
Herd Parameters
r_R float 3 Repulsion radius (separate from crowding herd members)
r_O float 5 Orientation radius (align with nearby herd)
r_A float 7 Attraction radius (cohere toward herd center)
r_I float 6.5 Shepherd interaction radius (respond to nearby shepherds)
a_R, a_O, a_A, a_I float 2, 2, 2, 4 Gains for repulsion, orientation, attraction, shepherd response
a_V float 2 Laziness gain (desire to slow down/rest)
Shepherd Parameters
r_S float 5.5 Desired radius from herd centroid (positioning distance)
r_Oi float 3 Obstacle viewing range (for other shepherds)
r_Od float 2 Desired clearance from obstacles
r_Or float 1 Shepherd physical radius
a_N float 5 Navigation/pushing gain (strength of target tracking)
a_R_s, a_R_s_v float 1, 2 Shepherd-shepherd repulsion: position and velocity gains
a_V_s float 1.0 Shepherd laziness gain
type_shepherd str "haver" Positioning method: "haver" (haversine) - traditional approach
type_avoid str "ref_point" Collision avoidance method: "ref_point" (recommended) or "ref_shepherd"
cmd_adjust float 0.02 Command adjustment factor (typically ~0.02-0.05)

Positioning Strategy:

  • Shepherds compute herd centroid
  • Position themselves opposite the target (pushing from behind)
  • Maintain formation while pushing herd forward
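
A geometric sketch of the "push from behind" placement: the shepherd stands on the ray from the goal through the herd centroid, r_S beyond the centroid. This illustrates the strategy only; it is not the module's haversine-based positioning method:

```python
import numpy as np

# Place a shepherd behind the herd relative to the goal,
# at distance r_S from the herd centroid.
def shepherd_station(herd_positions, goal, r_S=5.5):
    centroid = herd_positions.mean(axis=0)
    away = centroid - goal                        # direction goal -> herd
    away = away / (np.linalg.norm(away) + 1e-9)   # unit vector (regularized)
    return centroid + r_S * away                  # "behind" the herd w.r.t. the goal

herd = np.array([[9.0, 0.0], [11.0, 0.0]])
station = shepherd_station(herd, goal=np.zeros(2))
```

Driving toward this station pushes the herd along the station-to-goal line.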

Use Cases: Livestock herding, crowd control, goal-directed swarm manipulation


7. Flocking Starling (Biologically-Inspired Murmurations)

Module: flocking_starling.py
Strategy Name: "flocking_starling"

Purpose: Bio-inspired flocking based on European starling murmuration behavior. Models a dynamic, topological interaction range and roosting behavior.

Key Feature - Topological Interaction:

  • Each agent tries to maintain sight of approximately n_c nearest neighbors
  • Interaction radius adapts dynamically if neighbors are too far/near
  • Creates natural, organic-looking flocking patterns

Algorithm Components:

u = u_separation + u_cohesion + u_alignment + u_roost + u_random
  Each component weighted by w_s, w_c, w_a, w_roost_h/v, w_rand

Configuration Parameters (planner.techniques.flocking_starling):

Parameter Type Default Meaning
Speed & Dynamics
v_o float 10 Cruise speed (desired flying speed)
m float 0.08 Agent mass (affects inertia)
tau float 0.2 Relaxation time (return to cruise speed timescale)
del_u float 0.1 Reaction time (delay in detecting new neighbors)
s float 0.01 Interpolation factor (adaptation speed for interaction radius)
Sensing & Interaction
R_max float 100 Maximum interaction radius (hard limit)
n_c float 6.5 Topological interaction count (maintain sight of ~n_c nearest neighbors)
r_sep float 10 Separation radius (repel from closer neighbors)
r_h float 0.2 Hard sphere radius (ignore collisions below this)
Roosting (Target Attraction)
r_roost float 50 Roosting zone radius (attraction zone around target)
w_roost_h, w_roost_v float 0.2, 0.1 Roosting weights: horizontal (x-y) and vertical (z) components
C_c float 0.35 Centrality threshold (interior/exterior classification for roosting)
Behavioral Weights
w_s, w_c, w_a float 1, 0.7, 0.2 Weights for separation, cohesion, alignment
w_rand float 0.05 Random disturbance weight (adds natural variability)
Shape & Smoothness
alpha float 0.5 Tightness parameter (0=loose swarm, 1=tight formation)
sigma_param float 4.60517 Gaussian shape parameter (separation force smoothness)
eps float 1e-5 Regularization constant (prevent divide-by-zero)

Unique Features:

  • Agents track interior/exterior position relative to swarm center
  • Roosting behavior: agents attracted to target when in roost zone
  • Natural swarm cohesion without explicit distance targets
  • Emergent murmurating patterns

Use Cases: Bio-inspired swarms, organic flocking, dynamic coordination without fixed formations


8. Malicious Agent (Robust Flocking with Adversarial Detection)

Module: malicious_agent.py
Strategy Name: "malicious_agent"

Purpose: Flocking algorithm that detects and mitigates malicious agents attempting to disrupt swarm cohesion. Uses adaptive control and parameter estimation.

Algorithm Structure (3 layers):

Layer 1: Nominal flocking (consensus on velocity)
Layer 2: Cooperative/adversarial detection
Layer 3: Adaptive gain adjustment to maintain connectivity

Configuration Parameters (planner.techniques.malicious_agent):

Parameter Type Default Meaning
Formation Parameters
d float 5 Desired separation distance
r float 7.07 Sensing range
gain_p, gain_v float 1, 0 Navigation gains (typically low for this method)
Layer 1: Flocking Gains
kv float 3 Velocity (consensus) gain
ka float 1 Alignment gain
kr float 2 Repulsion gain
Layer 2: Malice Counter-Control
kx float 2 Layer 2 gain (for malice detection/countering)
d_bar float 3.536 Malicious agent separation (default ≈ d/√2)
i_cont float 0.2 Integrating constant (potential function parameter)
Layer 3: Adaptive Connectivity
gamma_kp float 2 Gamma proportional constant (gain adaptation rate)
H_min float 100 Minimum H threshold (swarm robustness lower bound)
Malicious Agent Mode
mode_malicious {0,1} 1 Enable malice detection (1=yes, 0=no)
mal_type str "collider" Attack type: "runaway" (escape), "collider" (crash), "cooperative" (mimics normal)
filter_v_gain float 50 State estimator filter gain (for parameter estimation)
cmd_min, cmd_max float -100, 100 Command saturation limits

Attack Types:

  • Runaway: Malicious agent tries to escape (detected by unusual repulsion)
  • Collider: Crashes into swarm (detected by collision attempts)
  • Cooperative: Mimics normal behavior but part of coordinated attack

Detection Mechanism:

  • Monitors agent velocities and relative positions
  • Estimates malicious agent parameters adaptively
  • Computes H-metric (connectivity measure)
  • Adjusts gains when swarm connectivity threatened

Use Cases: Robust autonomous swarms, adversarial scenarios, resilience testing


Orchestrator

Location: orchestrator.py

The orchestrator is the master controller that:

  1. Initializes all system components (agents, targets, obstacles, planner)
  2. Updates sensing/connection graphs based on agent positions
  3. Selects pinning agents (if applicable) for enhanced control
  4. Computes control commands for each agent with the selected planner
  5. Updates learning modules (if enabled)

Key Parameters (orchestrator section in config.json)

Parameter Type Default Meaning
pin_update_rate int 5 Update frequency [timesteps] for re-selecting pinned agents
pin_selection_method str "degree" Method to select pin agents:
- "degree": highest degree centrality (most connected)
- "degree_leafs": degree + include leaves (isolated nodes)
- "gramian": [future] controllability gramian-based
- "between": [future] betweenness centrality
- "nopins": no pinning (fully decentralized)
- "allpins": all agents are pinned
criteria_table.radius bool true Graph construction criterion: use Euclidean distance radius
criteria_table.aperature bool false Graph construction criterion: use field-of-view aperture angle
sensor_aperature float 140 Field-of-view angle [degrees] (if the aperture criterion is used)
learning_ctrl str null Global learning controller: null (disabled) or "CALA"
connectivity_slack float 1 Relaxation parameter for connectivity computations

Graph Types Maintained

  • sensor_range_matrix: W-adjacency based on sensing range (symmetric)
  • interaction_graph: Who agents sense (for flocking)
  • connection_range_matrix: Who agents are connected to (lattice)
  • connection_graph: Topology for connectivity analysis

Pinning Mechanism

Some planners (especially pinning_lattice) use pinned agents - agents whose positions are fixed or strongly controlled to stabilize the entire formation. The orchestrator:

  1. Recomputes pinning assignments every pin_update_rate timesteps
  2. Selects best candidates using pin_selection_method
  3. Passes pin assignments to planner

Simulation Parameters

Location: config/config.json, simulation section

Parameter Type Default Meaning
Ti float 0 Initial time [seconds]
Tf float 30 Final time [seconds]
Ts float 0.02 Timestep [seconds] (50 Hz simulation rate)
dimens {2, 3} 2 Simulation dimensionality: 2D or 3D
verbose {0, 1, 2} 1 Output verbosity: 0 (silent), 1 (normal), 2 (debug)
system str "swarm" System type: currently only "swarm" supported
strategy str "shepherding" Swarming strategy (planner technique to use)
random_seed int 42 Random number seed (for reproducibility)
f int 0 [Deprecated] Legacy parameter
experimental_save bool false Save to experiments folder (consolidates data, plots, configs)
obstacle_avoidance_strategy str "flocking_saber" Strategy for obstacle avoidance in navigation

Running Simulation Scenarios

Set strategy to one of:

  • "flocking_saber" - Olfati-Saber flocking with obstacles
  • "flocking_reynolds" - Reynolds boids flocking
  • "lemniscates" - Figure-8 trajectories
  • "encirclement" - Circular formation rotation
  • "pinning_lattice" - Heterogeneous lattice with learning
  • "shepherding" - Shepherd-herd dynamics
  • "flocking_starling" - Bio-inspired murmurations
  • "malicious_agent" - Robust flocking with adversarial agents

Learning Modules

Location: learner/

The simulator includes several adaptive learning strategies that allow agents to optimize their behavior during simulation.

Architecture

learner/conductor.py initializes and coordinates learning modules:

def initialize(Agents, tactic_type, learning_ctrl, Ts, config):
    Learners = {}
    # Initialize appropriate learners based on strategy
    return Learners

1. CALA (Collaborative Adaptive Learning Algorithm)

Module: learner/CALA_control.py

Gradient-free, distributed learning algorithm for multi-agent parameter optimization.

Parameters (learner.CALA section):

Parameter Type Default Meaning
Actions
actions_range str "angular" Action space: "angular" (angles/directions) or "linear"
action_min, action_max float -0.785, 0.785 Action bounds [radians] (±45°)
Learning Dynamics
learning_rate float 0.5 Parameter step size (how much to adjust per update)
variance_init float 0.4 Initial exploration variance
variance_ratio float 0.5 Variance decay ratio (annealing factor)
variance_min, variance_max float 0.0001, 10 Variance bounds
epsilon float 1e-6 Regularization constant
Update Frequency
counter_max int 100 Update threshold (compute gradient after 100 samples)
counter_synch bool true Synchronize updates across agents
counter_delay int 500 Delay before update [timesteps] (let system stabilize)
Exploration
explore_dirs bool true Explore random directions (drift + gradient)
explore_persistence float 0.7 Exploration persistence (0=random, 1=follow direction)
Multi-Agent Coordination
leader_follower bool true Use leader-follower structure (coordinator + agents)
leader int 0 Leader agent index
Reward Design
reward_mode str "target" Reward source: "target" (distance to goal) or "swarm" (cohesion)
reward_coupling float 2 Reward coupling strength (how much agents influence each other)
reward_reference str "global" Reference frame: "global" (fixed) or "relative" (moving with swarm)
reward_form str "sharp" Reward function shape: "sharp" (discontinuous) or "smooth" (continuous)
reward_k_theta float 12.0 Reward steepness parameter
Advanced Options
momentum bool false Enable momentum (accumulate direction)
momentum_beta float 0.8 Momentum decay coefficient
annealing bool false Variance annealing (gradual decrease)
annealing_rate float 0.99 Annealing decay per cycle
kicking bool false Kicking to escape local optima
kicking_factor float 1.3 Kick magnitude factor
sigmoidize bool false Apply sigmoid to actions (smooth clipping)

Use Cases:

  • Learning lemniscate trajectory shape (with "lemniscates" strategy)
  • Optimizing flocking parameters online
  • Global learning controller (orchestrator.learning_ctrl = "CALA")

2. Q-Learning for Lattice (Pinning Lattice Optimization)

Module: learner/QL_learning_lattice.py

Reinforcement learning module that optimizes desired inter-agent distances for pinning lattice control.

How It Works:

  1. Discretizes distance space into grid (default 10×10)
  2. Each agent learns optimal distance for its local neighborhood
  3. Q-values trained from reward signal (connectivity vs. energy)
  4. Works in conjunction with consensus lattice mechanism
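
The update rule behind steps 1-3 is a generic tabular Q-learning step over the discretized distance grid. The sketch below is the textbook form (the module's actual state/action encoding and reward are not reproduced here):

```python
import numpy as np

# One tabular Q-learning update: Q(s,a) <- Q(s,a) + alpha*(r + gamma*max_a' Q(s',a') - Q(s,a)).
def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    td_target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((10, 10))   # 10x10 grid, matching the default learning_grid_size
Q = q_update(Q, s=0, a=3, reward=1.0, s_next=1)
```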

Integration:

  • Enabled when: planner.techniques.pinning_lattice.learning = 1
  • Requires: hetero_lattice = 1 (consensus mechanism)

3. Consensus Lattice

Module: learner/consensus_lattice.py

Cooperative agreement mechanism for heterogeneous lattice formation. Agents negotiate inter-agent distances via consensus algorithm.

Algorithm:

d_i(t+1) = d_i(t) + α Σ_j (d_j(t) - d_i(t))
           ^ local distance   ^ agreement with neighbors

Integration:

  • Automatically enabled when: pinning_lattice.hetero_lattice = 1
  • Agents in same neighborhood converge to shared desired distance
  • Enables topology-aware formation assembly
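
Iterating the consensus update above for a fully connected neighborhood drives all local distances to a common value (their mean, since the update is symmetric). A minimal sketch with an illustrative step size alpha:

```python
import numpy as np

# d_i <- d_i + alpha * sum_j (d_j - d_i); for all-to-all coupling this is
# d + alpha * (sum(d) - n*d), which preserves the mean and shrinks deviations.
def consensus_step(d, alpha=0.1):
    return d + alpha * (d.sum() - len(d) * d)

d = np.array([8.0, 10.0, 12.0])
for _ in range(100):
    d = consensus_step(d)
# d has converged to the shared value 10.0 for every agent
```

Deviations from the mean contract by a factor (1 - n·alpha) per step, so alpha must be small enough that n·alpha < 1.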

Obstacles & Targets

Obstacles

Location: obstacles/obstacles.py

Obstacles are static or moving barriers that agents must avoid.

Configuration (obstacles section):

Parameter Type Default Meaning
nObs int 1 Number of obstacles
vehObs {0, 1} 0 Include agents as obstacles: 1 = agents repel each other in addition to targets
oSpread float 20 Random spread radius [distance units] around target (if not manual)
manual bool true Manual placement: true = use manual_positions, false = random
Manual Positioning
manual_positions.x, .y, .z float -2.4, 1.2, 25 Obstacle center coordinates
manual_positions.radius float 1 Obstacle radius (sphere size)

Obstacle Structure (internal):

obstacles = [x0..xn,      # positions
             y0..yn,
             z0..zn,
             r0..rn]      # radii

Note: Reynolds flocking automatically treats the target as an obstacle (repulsive zone).

Targets

Location: targets/targets.py

Goals that agents navigate toward or swarm around.

Configuration (targets section):

Parameter Type Default Meaning
tSpeed float 0 Target velocity [m/s] (moving target speed)
initial_position [x, y, z] [0, 0, 15] Initial target location

Example Target Trajectories (implemented in targets.py):

# Sinusoidal motion (current)
targets.x = 100·sin(tSpeed·t)
targets.y = 100·sin(tSpeed·t)·cos(tSpeed·t)
targets.z = 100·sin(tSpeed·t)·sin(tSpeed·t) + 15

# Can be modified for circular, spiral, or custom paths

Visualization & Data

Data Manager

Location: data/data_manager.py

Handles data recording and I/O.

Configuration (data section):

Parameter Type Default Meaning
save_data bool true Save simulation results
data_dir str "data/data/" Output directory
data_file str "data.h5" Output filename (HDF5 format)
record_interval int 1 Record every N timesteps (1 = every timestep)

Recorded Data (History object):

t_all              # Time vector
states_all         # Agent positions and velocities over time
cmds_all           # Control commands issued
targets_all        # Target positions
obstacles_all      # Obstacle positions
centroid_all       # Swarm center of mass
f_all              # Fitness/reward values (if learning enabled)

Visualization

Location: visualization/

Two main visualization tools:

  1. Animation (visualization/animation_sim.py)

    • Generates 2D/3D animations of swarm behavior
    • Shows agent positions, velocities, interactions
    • Supports various agent shapes and colors
  2. Plotting (visualization/plot_sim.py)

    • Generates publication-quality plots
    • Position trajectories, velocity profiles
    • Swarm metrics (dispersion, centroid, energy)

Output Folders:

  • visualization/animations/ → animated GIFs
  • visualization/plots/ → static plots (PNG, PDF)
  • visualization/public/ → example outputs organized by technique

Utilities & Graph Tools

Swarm Graph Utilities

Location: utils/swarmgraph.py

Graph representation and analysis tools for swarm topology.

Functions:

  • build_graph() - Create adjacency matrix from agent positions
  • graph_metrics() - Compute connectivity, centrality measures
  • is_connected() - Check if swarm is fully connected
  • update_edges() - Dynamic graph updates as agents move
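
The two most-used operations can be sketched as follows. Function names match the list above, but the bodies are illustrative (radius-based adjacency plus a depth-first connectivity check), not the module's code:

```python
import numpy as np

# Adjacency matrix: agents i and j are neighbors if within sensing range r.
def build_graph(positions, r):
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    A = (dist <= r).astype(float)
    np.fill_diagonal(A, 0.0)              # no self-edges
    return A

# Connected iff a traversal from node 0 reaches every node.
def is_connected(A):
    n = len(A)
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in np.nonzero(A[i])[0]:
            if int(j) not in seen:
                seen.add(int(j))
                stack.append(int(j))
    return len(seen) == n

pos = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
A = build_graph(pos, r=1.5)   # third agent is out of range, graph is split
```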

Modeling Utilities

Location: utils/modeller.py

Trajectory and state estimation functions.


Running Simulations

Basic Usage

  1. Configure your simulation in config/config.json:

    {
      "simulation": {
        "strategy": "shepherding",
        "Tf": 30,
        "dimens": 2
      },
      "agents": {
        "nAgents": 20,
        "dynamics": "double integrator"
      }
    }
  2. Run the simulation:

    python main.py
  3. Visualize results:

    # Animations and plots auto-generated if enabled

Example Scenarios

1. Simple Reynolds Flocking

{
  "simulation": {"strategy": "flocking_reynolds", "Tf": 50},
  "agents": {"nAgents": 30, "rAgents": 0.3},
  "orchestrator": {"pin_selection_method": "nopins"}
}

2. Lemniscate Formation with Learning

{
  "simulation": {"strategy": "lemniscates", "Tf": 60},
  "planner": {
    "techniques": {
      "lemniscates": {
        "lemni_type": 0,
        "learning": "CALA",
        "learning_axes": "xz"
      }
    }
  },
  "learner": {"CALA": {"learning_rate": 0.5}}
}

3. Pinning Lattice with Optimization

{
  "simulation": {"strategy": "pinning_lattice", "Tf": 60},
  "planner": {
    "techniques": {
      "pinning_lattice": {
        "hetero_lattice": 1,
        "learning": 1,
        "flocking_method": "lennard_jones"
      }
    }
  }
}

4. Robust Flocking with Malicious Agents

{
  "simulation": {"strategy": "malicious_agent"},
  "planner": {
    "techniques": {
      "malicious_agent": {
        "mode_malicious": 1,
        "mal_type": "collider"
      }
    }
  }
}

Architecture Diagram

┌─────────────────────────────────────────────────────────┐
│                   main.py                               │
│  1. Load config/config.json                             │
│  2. Initialize orchestrator.build_system()              │
│  3. Run main simulation loop                            │
└─────────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────────┐
│            orchestrator.Controller                       │
│  • Maintains agent states & targets                      │
│  • Computes sensing/connection graphs                    │
│  • Selects pinned agents (if applicable)                 │
│  • Calls planner.compute_cmd() for each agent            │
│  • Updates learning modules                              │
└─────────────────────────────────────────────────────────┘
                        ↓
        ┌───────────────┼───────────────┐
        ↓               ↓               ↓
    ┌─────────┐  ┌──────────┐  ┌──────────────┐
    │ Agents  │  │ Planner  │  │ Learner      │
    │ (state) │  │ (command)│  │ (optimize)   │
    └─────────┘  └──────────┘  └──────────────┘
        ↓               ↓               ↓
    ┌─────────────────────────────────────────────┐
    │         data/data_manager.py                │
    │  Records states, commands, metrics          │
    └─────────────────────────────────────────────┘
        ↓
    ┌────────────────────────────────────────────┐
    │         visualization/                     │
    │  Generates animations and plots             │
    └────────────────────────────────────────────┘

Notes

  • Most parameters are tuned through empirical testing; start with defaults and adjust gains (c1, c2, etc.) to see effects
  • Learning modules are in development; expect updates and refinements
  • Quadcopter dynamics significantly increase computation; use double integrator for prototyping
  • Graph construction criteria (radius vs. aperture) affect neighbor detection; radius is most common
  • See docs/devnotes.md for recent changes and known issues

Last Updated: March 2026
Maintained By: Claude Haiku 4.5