Refactoring of observations function and improve viz #443

Open
vcharraut wants to merge 10 commits into emerge/temp_training from vcha/update-obs-viz

@vcharraut vcharraut commented May 25, 2026

Summary

This PR updates PufferDrive observation/config naming, replay visualization, and obs HTML generation.

Changes

  • Renamed env config keys for clarity:
    • init_steps -> init_step
    • reward_vehicle_collision -> reward_collision
    • reward_offroad_collision -> reward_offroad
    • max_lane_segment_observations -> obs_slots_lane
    • max_boundary_segment_observations -> obs_slots_boundary
    • max_partner_observations -> obs_slots_partners
    • max_traffic_control_observations -> obs_slots_traffic_controls
    • lane_segment_dropout -> obs_dropout_lane
    • boundary_segment_dropout -> obs_dropout_boundary
    • max_goal_position -> obs_norm_goal_offset_m
    • max_position -> obs_norm_xy_offset_m
    • max_veh_len -> obs_norm_veh_length_m
    • max_veh_width -> obs_norm_veh_width_m
    • max_road_segment_length -> obs_norm_road_seg_length_m
    • max_road_segment_width -> obs_norm_road_seg_width_m
    • max_traffic_control_distance -> obs_range_traffic_control_m
    • agent_obs_max_dist -> obs_range_partner_m
    • road_obs_front_dist -> obs_range_road_front_m
    • road_obs_behind_dist -> obs_range_road_behind_m
    • road_obs_side_dist -> obs_range_road_side_m
    • internal derived counts: obs_lane_segment_count -> obs_slots_lane_kept, obs_boundary_segment_count -> obs_slots_boundary_kept
  • Updated C/Python env bindings, configs, checkpoint configs, docs, scripts, and training architecture key checks to use the new names.
  • Standardized ego observation shape with a single EGO_FEATURES constant instead of classic/jerk-specific ego feature counts.
  • Refactored C observation writing into smaller pieces for ego, target, partner, road, and traffic control observations.
  • Added compact obs HTML frame export path via vec_get_obs_html_frame, reducing Python-side replay state extraction overhead.
  • Expanded interactive replay HTML:
    • policy outputs, values, entropy, action probabilities/densities
    • pool slot counts
    • steering, accel, jerk, lane id
    • Puffer score components
    • improved playback timing/speed controls
  • Added evaluator progress bars for triage/obs HTML generation.
  • Updated Drive backbone observation slicing and added pool_slot_counts() for visualization/debugging.
  • Added shared notebook utilities and refreshed notebooks.
  • Updated saved weight configs to the renamed env keys.
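
For configs written before this change, the rename list above can be applied mechanically. The following is a minimal sketch, not code from this PR: `RENAMED_KEYS` is transcribed from the list above, while `migrate_env_config` is a hypothetical helper name, and the sketch assumes the env config is a flat dict of key/value pairs. The internal derived counts (`obs_slots_lane_kept`, `obs_slots_boundary_kept`) are left out because they are not user-facing keys.

```python
# Old -> new env config key names, transcribed from the rename list in this PR.
RENAMED_KEYS = {
    "init_steps": "init_step",
    "reward_vehicle_collision": "reward_collision",
    "reward_offroad_collision": "reward_offroad",
    "max_lane_segment_observations": "obs_slots_lane",
    "max_boundary_segment_observations": "obs_slots_boundary",
    "max_partner_observations": "obs_slots_partners",
    "max_traffic_control_observations": "obs_slots_traffic_controls",
    "lane_segment_dropout": "obs_dropout_lane",
    "boundary_segment_dropout": "obs_dropout_boundary",
    "max_goal_position": "obs_norm_goal_offset_m",
    "max_position": "obs_norm_xy_offset_m",
    "max_veh_len": "obs_norm_veh_length_m",
    "max_veh_width": "obs_norm_veh_width_m",
    "max_road_segment_length": "obs_norm_road_seg_length_m",
    "max_road_segment_width": "obs_norm_road_seg_width_m",
    "max_traffic_control_distance": "obs_range_traffic_control_m",
    "agent_obs_max_dist": "obs_range_partner_m",
    "road_obs_front_dist": "obs_range_road_front_m",
    "road_obs_behind_dist": "obs_range_road_behind_m",
    "road_obs_side_dist": "obs_range_road_side_m",
}


def migrate_env_config(old_cfg):
    """Return a copy of old_cfg with legacy keys renamed (hypothetical helper).

    Keys that are not in RENAMED_KEYS pass through unchanged. A config that
    contains both a legacy key and its replacement is rejected, because it is
    ambiguous which value should win.
    """
    new_cfg = {}
    for key, value in old_cfg.items():
        new_key = RENAMED_KEYS.get(key, key)
        if new_key in new_cfg:
            raise ValueError(f"both legacy and new key present for {new_key!r}")
        new_cfg[new_key] = value
    return new_cfg


legacy = {"init_steps": 10, "max_partner_observations": 16, "reward_goal": 1.0}
print(migrate_env_config(legacy))
```

Rejecting mixed configs rather than silently preferring one key is a design choice of this sketch; a real migration might instead prefer the new key and log a warning.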

Copilot AI review requested due to automatic review settings May 25, 2026 19:41

Copilot AI left a comment

Pull request overview

This PR refactors PufferDrive’s observation-generation pipeline and associated configuration surface, while extending/improving the HTML/visualization tooling for debugging and triage.

Changes:

  • Renames/reshapes many env config keys (obs slot counts, dropouts, normalization scales, and range parameters; init_steps -> init_step; reward renames).
  • Refactors C-side observation construction into helper routines and updates Python/C bindings + Torch encoder to the new slot/dropout semantics.
  • Enhances obs-html rendering by collecting compact per-step arrays (agents/metrics/traffic/policy outputs) and adds a C binding (vec_get_obs_html_frame) to fill them efficiently.

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| weights/tomate/config.yaml | Updates env config keys to new obs_* naming scheme and reward/init renames. |
| weights/salade/config.yaml | Same config key migration as tomate. |
| weights/oignons/config.yaml | Adds new experiment config using new obs_* keys. |
| weights/oignons2/config.yaml | Adds new experiment config using new obs_* keys (left-map variant). |
| tests/test_eval_manager.py | Updates tests to use renamed dropout key. |
| tests/test_drive_config.py | Updates tests to use renamed reward keys. |
| scripts/render_scenario.py | Renames CLI flag and env override keys (init_step, obs_dropout_*). |
| README.md | Updates documentation to the renamed collision reward key. |
| pufferlib/utils.py | Updates render CLI invocation to --init-step. |
| pufferlib/pufferl.py | Updates env-yaml key allowlists to new observation key names. |
| pufferlib/ocean/torch.py | Updates policy encoder slicing/shape logic to new slot/dropout/kept counts. |
| pufferlib/ocean/env_config.h | Renames env config fields + INI handler keys for new obs/reward/init naming. |
| pufferlib/ocean/env_binding.h | Adds vec_get_obs_html_frame binding and consolidates ego feature constant export. |
| pufferlib/ocean/drive/visualize.c | Updates visualize CLI flag parsing and Drive struct field names for init/control counts and rewards. |
| pufferlib/ocean/drive/render.h | Updates observation visualization scaling and slot iteration to new obs normalization/range fields. |
| pufferlib/ocean/drive/README.md | Updates init-mode documentation to init_step. |
| pufferlib/ocean/drive/drivenet.h | Aligns ego feature sizing with unified EGO_FEATURES. |
| pufferlib/ocean/drive/drive.py | Updates Python Drive env to new config keys and adds get_obs_html_frame API. |
| pufferlib/ocean/drive/drive.h | Refactors observation writing into helper functions; renames many Drive fields; adds projection helpers. |
| pufferlib/ocean/drive/drive.c | Updates demo/perf setup to new fields and fixes forward() call type. |
| pufferlib/ocean/drive/datatypes.h | Adds explicit traffic-control scope constants. |
| pufferlib/ocean/drive/binding.c | Extends exported agent state/metrics for viz and updates init kwargs unpacking to new keys. |
| pufferlib/ocean/benchmark/visual_sanity_check.py | Updates WOSAC setup to use env.init_step. |
| pufferlib/ocean/benchmark/metrics_sanity_check.py | Updates WOSAC setup to use env.init_step. |
| pufferlib/ocean/benchmark/manager.py | Updates clean-macro overrides to renamed dropout keys. |
| pufferlib/ocean/benchmark/evaluators/base.py | Adds tqdm progress and rewrites obs-html capture to compact-array schema via new binding. |
| pufferlib/ocean/benchmark/evaluator.py | Switches WOSAC init-step sourcing to env.init_step. |
| pufferlib/config/ocean/drive.ini | Renames keys (init_step, reward_*, obs_*) and reorganizes observation-related settings. |
| notebooks/notebook_utils.py | Adds shared notebook helpers/constants for env/policy setup and dimension derivation. |
| notebooks/06_architecture.ipynb | Refactors notebook to use notebook_utils helpers and new obs-slot keys. |
| notebooks/05_inference.ipynb | Refactors notebook setup and updates dimensions/labels for unified ego features + new keys. |
| notebooks/04_training.ipynb | Refactors notebook to use notebook_utils helpers and new obs-slot keys. |
| notebooks/03_metrics.ipynb | Refactors notebook to use notebook_utils helpers and new obs-slot keys. |
| notebooks/02_rewards.ipynb | Refactors notebook to use notebook_utils helpers and fixes goal extraction indices under new layout. |
| notebooks/01_observations.ipynb | Refactors notebook to use notebook_utils helpers and new obs-slot keys. |
| notebooks/__init__.py | Marks notebooks as a package for importing notebook_utils. |
| docs/evaluation.md | Updates clean-macro docs to the renamed dropout keys. |

Comment on lines +848 to +858
if (!PyArray_Check(agent_f32_array) || !PyArray_Check(agent_i32_array) || !PyArray_Check(metrics_f32_array)
    || !PyArray_Check(puffer_f32_array) || !PyArray_Check(traffic_i16_array)) {
    PyErr_SetString(PyExc_TypeError, "All output arrays must be NumPy arrays");
    return NULL;
}

memset(PyArray_DATA(agent_f32_array), 0, PyArray_NBYTES(agent_f32_array));
memset(PyArray_DATA(agent_i32_array), 0, PyArray_NBYTES(agent_i32_array));
memset(PyArray_DATA(metrics_f32_array), 0, PyArray_NBYTES(metrics_f32_array));
memset(PyArray_DATA(puffer_f32_array), 0, PyArray_NBYTES(puffer_f32_array));
memset(PyArray_DATA(traffic_i16_array), 0, PyArray_NBYTES(traffic_i16_array));
Comment on lines +866 to +910
int env_cap = (int) PyArray_DIM(agent_f32_array, 0);
int env_count = vec->num_envs < env_cap ? vec->num_envs : env_cap;
int agent_cap = (int) PyArray_DIM(agent_f32_array, 1);
int agent_f32_fields = (int) PyArray_DIM(agent_f32_array, 2);
int agent_i32_fields = (int) PyArray_DIM(agent_i32_array, 2);
int metric_fields = (int) PyArray_DIM(metrics_f32_array, 2);
int puffer_fields = (int) PyArray_DIM(puffer_f32_array, 2);
int traffic_cap = (int) PyArray_DIM(traffic_i16_array, 1);
int traffic_fields = (int) PyArray_DIM(traffic_i16_array, 2);

for (int e = 0; e < env_count; e++) {
    Drive *drive = (Drive *) vec->envs[e];
    int agent_count = drive->num_total_agents < agent_cap ? drive->num_total_agents : agent_cap;
    int traffic_count = drive->num_traffic_elements < traffic_cap ? drive->num_traffic_elements : traffic_cap;

    for (int i = 0; i < agent_count; i++) {
        Agent *a = &drive->agents[i];
        int f32_base = (e * agent_cap + i) * agent_f32_fields;
        int i32_base = (e * agent_cap + i) * agent_i32_fields;
        int metrics_base = (e * agent_cap + i) * metric_fields;

        agent_f32[f32_base + 0] = a->sim_x;
        agent_f32[f32_base + 1] = a->sim_y;
        agent_f32[f32_base + 2] = a->sim_z;
        agent_f32[f32_base + 3] = a->sim_heading;
        agent_f32[f32_base + 4] = a->sim_length;
        agent_f32[f32_base + 5] = a->sim_width;
        agent_f32[f32_base + 6] = a->sim_speed;
        agent_f32[f32_base + 7] = a->steering_angle;
        agent_f32[f32_base + 8] = a->a_long;
        agent_f32[f32_base + 9] = a->a_lat;
        agent_f32[f32_base + 10] = a->jerk_long;
        agent_f32[f32_base + 11] = a->jerk_lat;

        agent_i32[i32_base + 0] = i;
        agent_i32[i32_base + 1] = a->type;
        agent_i32[i32_base + 2] = a->sim_valid;
        agent_i32[i32_base + 3] = a->active_agent;
        agent_i32[i32_base + 4] = a->stopped;
        agent_i32[i32_base + 5] = a->removed;
        agent_i32[i32_base + 6] = a->current_lane_idx;
        agent_i32[i32_base + 7] = -1;

        memcpy(&metrics_f32[metrics_base], a->metrics_array, sizeof(float) * NUM_METRICS);
    }
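
The excerpt above addresses each output array as a flat row-major buffer: the slot for agent `i` in env `e` starts at `(e * agent_cap + i) * fields`. It also copies `NUM_METRICS` floats per agent regardless of `metric_fields`; whether the caller always allocates at least `NUM_METRICS` fields is not visible in the excerpt. The sketch below illustrates the same indexing in plain Python and shows one possible guard, clamping the copy to the destination width. `flat_base` and `copy_metrics` are hypothetical names for illustration, not functions from this PR, and the clamp is a suggestion rather than the PR's behavior.

```python
def flat_base(env_idx, agent_idx, agent_cap, fields):
    """Offset of the first field for (env_idx, agent_idx) in a row-major
    [num_envs, agent_cap, fields] buffer stored as a flat sequence."""
    return (env_idx * agent_cap + agent_idx) * fields


def copy_metrics(dst, base, metrics, metric_fields):
    """Copy at most metric_fields values into dst starting at base.

    Clamping to metric_fields keeps the write inside this agent's slot even
    when the source holds more values than the destination's last dimension.
    Returns the number of values copied.
    """
    n = min(len(metrics), metric_fields)
    dst[base:base + n] = metrics[:n]
    return n


# Example shapes (arbitrary): 2 envs, 3 agent slots per env, 4 metric fields.
num_envs, agent_cap, metric_fields = 2, 3, 4
buffer = [0.0] * (num_envs * agent_cap * metric_fields)

# Write the last agent of the last env; its source has 6 metrics, slot has 4.
base = flat_base(1, 2, agent_cap, metric_fields)
copied = copy_metrics(buffer, base, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], metric_fields)
print(base, copied, buffer[base:])
```

This indexing is only valid if the arrays are C-contiguous and have the expected dtype and rank; `PyArray_Check` alone confirms only that the objects are NumPy arrays.
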
Comment on lines +3503 to +3506
return EGO_FEATURES + PARTNER_FEATURES * env->obs_slots_partners
       + ROAD_FEATURES * (env->obs_slots_lane_kept + env->obs_slots_boundary_kept)
       + TRAFFIC_CONTROL_FEATURES * env->obs_slots_traffic_controls + env->reward_conditioning * NUM_REWARD_COEFS
       + env->num_target_waypoints * target_features;
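
To make the size formula above easier to check by hand, here is a minimal Python sketch of the same sum. The feature-width constants are placeholder values chosen for the example, not the values defined in the C headers, and `ObsLayout` and `obs_size` are hypothetical names; only the structure of the formula is taken from the excerpt.

```python
from dataclasses import dataclass

# Placeholder widths (assumptions for this example only; the real values
# live in the C headers and are not shown in this PR excerpt).
EGO_FEATURES = 10
PARTNER_FEATURES = 8
ROAD_FEATURES = 7
TRAFFIC_CONTROL_FEATURES = 6
NUM_REWARD_COEFS = 5


@dataclass
class ObsLayout:
    obs_slots_partners: int
    obs_slots_lane_kept: int       # lane slots remaining after dropout
    obs_slots_boundary_kept: int   # boundary slots remaining after dropout
    obs_slots_traffic_controls: int
    reward_conditioning: bool
    num_target_waypoints: int
    target_features: int           # a local variable in the C excerpt


def obs_size(cfg: ObsLayout) -> int:
    """Mirror the term-by-term sum from the C return statement."""
    return (
        EGO_FEATURES
        + PARTNER_FEATURES * cfg.obs_slots_partners
        + ROAD_FEATURES * (cfg.obs_slots_lane_kept + cfg.obs_slots_boundary_kept)
        + TRAFFIC_CONTROL_FEATURES * cfg.obs_slots_traffic_controls
        + int(cfg.reward_conditioning) * NUM_REWARD_COEFS
        + cfg.num_target_waypoints * cfg.target_features
    )


layout = ObsLayout(
    obs_slots_partners=4,
    obs_slots_lane_kept=3,
    obs_slots_boundary_kept=2,
    obs_slots_traffic_controls=1,
    reward_conditioning=True,
    num_target_waypoints=2,
    target_features=3,
)
# 10 + 8*4 + 7*(3+2) + 6*1 + 5 + 2*3 = 94 with the placeholder widths above.
print(obs_size(layout))
```

Note that the road term uses the kept counts, so changing a dropout setting changes the observation size and therefore the encoder input width.
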
Comment on lines 88 to +92
  reward_conditioning = False
  reward_randomization = False
  reward_goal = 1.0
- reward_vehicle_collision = 1.0
- reward_offroad_collision = 1.0
+ reward_collision = 1.5
+ reward_offroad = 1.5

  }

- int ego_dim = (env->dynamics_model == JERK) ? EGO_FEATURES_JERK : EGO_FEATURES_CLASSIC;
+ int ego_dim = EGO_FEATURES;

hm, why?

Comment on lines +729 to +732
None,
None,
None,
filename=str(path),

feels like this function signature should change?

#define TRAFFIC_CONTROL_SCOPE_TRAFFIC_LIGHTS 0
#define TRAFFIC_CONTROL_SCOPE_TRAFFIC_LIGHTS_STOP_SIGN 1
#define TRAFFIC_CONTROL_SCOPE_ALL 2

This doesn't seem used in this PR?
