[Newton] Migrate more envs and mdps to warp#4690
kellyguo11 merged 5 commits into isaac-sim:dev/newton
Not sure if there's a way to show only the file changes that are not in PR #4480. Probably best to merge the dependency first.
Migrated envs show similar training results. Convergence speed is not a meaningful comparison here because it is not consistent between runs due to the added noise.

[Figures: Final training stats, Warp-only vs baseline · Final training stats, Warp-capture vs baseline · Convergence speed, Warp-only vs baseline · Convergence speed, Warp-capture vs baseline]
Time performance gain: Warp-capture vs baseline (repeat=5 average, timer-only)

| Task | Base env_step (us) | Capture env_step (avg us) | % change |
|---|---|---|---|
| Isaac-Ant-Warp-v0 (0) | 12450.25 | 5384.53 | -56.8% |
| Isaac-Cartpole-Warp-v0 (1) | 9038.00 | 1357.52 | -85.0% |
| Isaac-Humanoid-Warp-v0 (2) | 20600.74 | 13653.34 | -33.7% |
| Isaac-Reach-Franka-Warp-v0 (3) | 12202.51 | 5863.21 | -52.0% |
| Isaac-Reach-UR10-Warp-v0 (4) | - | - | - |
| Isaac-Velocity-Flat-Anymal-B-Warp-v0 (5) | 38029.59 | 27247.39 | -28.4% |
| Isaac-Velocity-Flat-Anymal-C-Warp-v0 (6) | 37881.29 | 27281.55 | -28.0% |
| Isaac-Velocity-Flat-Anymal-D-Warp-v0 (7) | 39227.52 | 27860.87 | -29.0% |
| Isaac-Velocity-Flat-Cassie-Warp-v0 (8) | 22765.51 | 11213.25 | -50.7% |
| Isaac-Velocity-Flat-G1-Warp-v0 (9) | 39951.73 | 27201.89 | -31.9% |
| Isaac-Velocity-Flat-G1-Warp-v1 (10) | 55177.31 | 42330.15 | -23.3% |
| Isaac-Velocity-Flat-H1-Warp-v0 (11) | 28866.51 | 16818.43 | -41.7% |
| Isaac-Velocity-Flat-Unitree-A1-Warp-v0 (12) | 20112.75 | 10467.07 | -48.0% |
| Isaac-Velocity-Flat-Unitree-Go1-Warp-v0 (13) | 20738.22 | 11996.80 | -42.2% |
| Isaac-Velocity-Flat-Unitree-Go2-Warp-v0 (14) | 18656.75 | 9831.00 | -47.3% |
| Isaac-Velocity-Rough-Anymal-D-Warp-v0 (15) | - | - | - |
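
The "% change" column appears to be the relative change of the captured step time against the baseline, i.e. `(capture - base) / base`. The sketch below recomputes it for two rows copied from the table; the helper name `percent_change` and the derived speedup factor are illustrative additions, not part of the PR.

```python
def percent_change(base_us: float, capture_us: float) -> float:
    """Relative change of the captured env_step time versus the baseline, in percent."""
    return (capture_us - base_us) / base_us * 100.0


# (base env_step, capture env_step) in microseconds, copied from the table.
rows = {
    "Isaac-Cartpole-Warp-v0": (9038.00, 1357.52),
    "Isaac-Velocity-Flat-G1-Warp-v1": (55177.31, 42330.15),
}

for task, (base_us, capture_us) in rows.items():
    change = percent_change(base_us, capture_us)
    speedup = base_us / capture_us  # same data expressed as a speedup factor
    print(f"{task}: {change:+.1f}% ({speedup:.2f}x)")
```

Negative values mean the captured path spends less time per `env_step` than the baseline.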
@hujc7 is this one also ready for merge?

Let me rebase and go through it one more time. Should be ready.
… updates

- Add manager_call_max_mode field for per-env capture ceiling (min(mode, cap))
- Support dict input for manager_call_config (in addition to JSON string)
- Add "Scene" to MANAGER_NAMES for configurable Scene_write_data_to_sim mode
- Remove hardcoded WARP_NOT_CAPTURED override from Scene_write_data_to_sim
- Add warp_capturable decorator and is_warp_capturable check for mode=2 fallback
- Update managers: action, observation, event with warp-first improvements
- Update scene_entity_cfg with body_ids_wp resolution
- Update train.py CLI arg handling
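
A minimal sketch of how the `min(mode, cap)` ceiling and the dict-or-JSON input could fit together. Only the names `manager_call_config`, `manager_call_max_mode`, `MANAGER_NAMES`, and the `"Scene"` entry come from the commit message; the function `resolve_modes`, the other manager names, the default mode, and the config layout (manager name mapped to an integer mode) are assumptions made for illustration.

```python
import json

# Mode numbering as used in this PR: 0 = stable, 1 = warp (uncaptured), 2 = warp + CUDA graph capture.
MODE_STABLE, MODE_WARP, MODE_WARP_CAPTURED = 0, 1, 2

# Hypothetical subset of names; the PR only states that "Scene" was added.
MANAGER_NAMES = ("Scene", "ActionManager", "ObservationManager")


def resolve_modes(manager_call_config, manager_call_max_mode=MODE_WARP_CAPTURED,
                  default_mode=MODE_WARP_CAPTURED):
    """Return the effective mode per manager as min(configured_mode, cap)."""
    # Accept either a dict or a JSON string describing {manager_name: mode}.
    if isinstance(manager_call_config, str):
        manager_call_config = json.loads(manager_call_config)
    modes = {}
    for name in MANAGER_NAMES:
        configured = int(manager_call_config.get(name, default_mode))
        modes[name] = min(configured, manager_call_max_mode)
    return modes


# JSON string input, with the env capping everything at uncaptured warp.
print(resolve_modes('{"Scene": 2, "ActionManager": 0}', manager_call_max_mode=MODE_WARP))
# Dict input, no cap below the maximum.
print(resolve_modes({"Scene": 1}))
```

A per-env ceiling of this kind lets a single CLI setting request capture everywhere while individual envs that cannot be captured yet stay at a lower mode.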
Warp-first observation, reward, termination, event, and action terms referenced by the 14 verified training-parity envs.

- Observations: base_pos_z, base_lin_vel, base_ang_vel, projected_gravity, joint_pos, joint_pos_rel, joint_pos_limit_normalized, joint_vel, joint_vel_rel, last_action, generated_commands
- Rewards: is_alive, is_terminated, lin_vel_z_l2, ang_vel_xy_l2, flat_orientation_l2, joint_torques_l2, joint_vel_l1, joint_vel_l2, joint_acc_l2, joint_deviation_l1, joint_pos_limits, action_rate_l2, action_l2, undesired_contacts, track_lin_vel_xy_exp, track_ang_vel_z_exp
- Terminations: time_out, root_height_below_minimum, joint_pos_out_of_manual_limit, illegal_contact
- Events: randomize_rigid_body_com, apply_external_force_torque, reset_root_state_uniform, reset_joints_by_scale, reset_joints_by_offset, push_by_setting_velocity
- Actions: JointPositionAction, JointEffortAction

Terms accessing lazy TimestampedWarpBuffer properties (Tier 2) are marked @warp_capturable(False) to prevent stale data under CUDA graph capture.
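
The marking and fallback described above could look roughly like the following. This is a hedged sketch, not the code from this PR: the attribute name `_warp_capturable`, the choice to treat unmarked terms as capturable, the helper `effective_mode`, and the two example term functions are all assumptions.

```python
def warp_capturable(flag: bool = True):
    """Decorator factory that tags an MDP term as safe (or unsafe) for CUDA graph capture."""
    def decorator(func):
        func._warp_capturable = flag  # hypothetical attribute name
        return func
    return decorator


def is_warp_capturable(func) -> bool:
    # Assumption: terms without an explicit mark are treated as capturable.
    return getattr(func, "_warp_capturable", True)


@warp_capturable(False)
def term_reading_lazy_buffer(env):
    """Stand-in for a term that reads a lazy Tier 2 property."""


def term_reading_tier1(env):
    """Stand-in for a term that only reads Tier 1 data."""


def effective_mode(requested_mode: int, terms) -> int:
    """Downgrade mode 2 (captured) to mode 1 (uncaptured) if any term is not capturable."""
    if requested_mode == 2 and not all(is_warp_capturable(t) for t in terms):
        return 1
    return requested_mode


print(effective_mode(2, [term_reading_tier1]))
print(effective_mode(2, [term_reading_tier1, term_reading_lazy_buffer]))
```

In this sketch one non-capturable term is enough to downgrade the whole call, which matches the "mode=2 fallback" wording but may be coarser or finer than what the managers actually do.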
Env configs and task-local MDP terms for 14 training-parity verified envs:

- Classic: Cartpole, Humanoid, Ant
- Locomotion velocity (flat): Anymal-B/C/D, G1-v0/v1, H1, Cassie, Unitree A1/Go1/Go2
- Manipulation: Reach-Franka

Per-robot config registrations (gym IDs) and flat env cfgs for all tested locomotion and reach variants.

Task-specific MDP terms:

- Humanoid: base_yaw_roll, base_up_proj, base_heading_proj, base_angle_to_target, progress_reward, upright_posture_bonus, move_to_target_bonus, power_consumption, joint_pos_limits_penalty_ratio
- Velocity: feet_air_time, feet_air_time_positive_biped, feet_slide, track_lin_vel_xy_yaw_frame_exp, track_ang_vel_z_world_exp, stand_still_joint_deviation_l1, terrain_out_of_bounds, terrain_levels_vel
- Reach: position_command_error, position_command_error_tanh, orientation_command_error

Also includes:

- Warp parity tests (3 test files)
- WARP_MIGRATION_GAP_ANALYSIS.md (MDP term catalog and per-task usage)
- MANAGER_TEST_COVERAGE.md (capturability analysis)
- GRAPH_CAPTURE_MIGRATION.md (ArticulationData Tier 1/2/3 property analysis)
- Rewrite obs/reward kernels to consume Tier 1 compound types directly, bypassing lazy Tier 2 properties that break CUDA graph capture
- Update GRAPH_CAPTURE_MIGRATION.md and WARP_MIGRATION_GAP_ANALYSIS.md
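
One way to picture why a lazy property can go stale under graph capture is the toy model below. It uses plain Python only, with no Warp or CUDA calls: a "graph" is just a list of recorded callables, and the host-side staleness check runs at record time but not at replay. The classes `Sim` and `LazyDerived` and both term functions are hypothetical; the actual failure mechanism is the one documented in GRAPH_CAPTURE_MIGRATION.md and may differ in detail.

```python
class Sim:
    """Toy simulation state; joint_pos plays the role of Tier 1 data."""
    def __init__(self):
        self.time = 0
        self.joint_pos = 1.0

    def step(self):
        self.time += 1
        self.joint_pos += 1.0


class LazyDerived:
    """Toy lazy buffer (Tier 2 stand-in): value = 2 * joint_pos, refreshed only when out of date."""
    def __init__(self, sim):
        self.sim = sim
        self.timestamp = -1
        self.value = 0.0

    def _refresh_kernel(self):
        self.value = 2.0 * self.sim.joint_pos

    def get(self, launch):
        # Host-side branch: evaluated when Python runs, never when a graph replays.
        if self.timestamp < self.sim.time:
            launch(self._refresh_kernel)
            self.timestamp = self.sim.time
        return self


def run_now(kernel):
    kernel()


sim = Sim()
derived = LazyDerived(sim)
out = {}


def tier2_term(launch):
    buf = derived.get(launch)
    launch(lambda: out.__setitem__("tier2", buf.value))


def tier1_term(launch):
    # The derived math lives inside the recorded kernel and reads primary state directly.
    launch(lambda: out.__setitem__("tier1", 2.0 * sim.joint_pos))


tier2_term(run_now)        # eager warm-up leaves the lazy buffer fresh for this step

graph = []                 # "capture": record launches instead of running them
tier2_term(graph.append)   # buffer already fresh, so no refresh kernel is recorded
tier1_term(graph.append)

sim.step()
for kernel in graph:       # "replay": only the recorded kernels run
    kernel()

print(out)                 # tier2 still reflects the pre-step state, tier1 does not
```

After the replay, the Tier 2 style term reports the value from before `sim.step()` while the Tier 1 style term reflects the new state, which is the motivation for consuming Tier 1 data inside the kernels.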
b87037c to b17d5c2
Greptile Summary

This PR migrates manager-based RL environments and MDP terms to a warp-first implementation with CUDA graph capture support for the Newton physics backend.

Infrastructure Changes

MDP Term Migration

Migrated observations, rewards, terminations, events, and actions to warp-first implementations.

Tested Environments

14 environments with training parity verified (warp-only and warp-capture vs torch baseline within ±5%).
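
A ±5% parity criterion can be expressed as a relative-tolerance check. The sketch below is an assumption about how such a check might be written; the summary does not say which metric was compared or how the tolerance was applied, and the function name `within_parity` and the sample numbers are made up for illustration.

```python
def within_parity(candidate: float, baseline: float, rel_tol: float = 0.05) -> bool:
    """True if candidate differs from baseline by at most rel_tol, relative to the baseline."""
    if baseline == 0.0:
        return candidate == 0.0
    return abs(candidate - baseline) <= rel_tol * abs(baseline)


# Hypothetical final mean rewards: torch baseline vs a warp-capture run.
print(within_parity(candidate=104.0, baseline=100.0))
print(within_parity(candidate=94.0, baseline=100.0))
```

Using the absolute value of the baseline keeps the check meaningful for metrics that can be negative, such as penalty-dominated rewards.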
Test Coverage
Known Technical Debt
Confidence Score: 4/5
Important Files Changed
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[ManagerBasedRLEnvWarp] --> B[ManagerCallSwitch]
B --> C{Execution Mode}
C -->|Mode 0| D[Stable Managers<br/>isaaclab.managers]
C -->|Mode 1| E[Warp Managers<br/>Uncaptured]
C -->|Mode 2| F[Warp Managers<br/>CUDA Graph Captured]
E --> G[Action Manager]
E --> H[Observation Manager]
E --> I[Event Manager]
E --> J[Reward Manager]
E --> K[Termination Manager]
F --> G
F --> H
F --> I
F --> J
F --> K
G --> L[MDP Terms<br/>Observations/Rewards/Events]
H --> L
I --> L
J --> L
K --> L
L --> M{Term Capturable?}
M -->|Yes| N[Inline Tier 1 Access<br/>Direct kernel dispatch]
M -->|No @warp_capturable| O[Mode Downgrade<br/>2→1]
N --> P[Scene Entity Cfg<br/>joint_mask/body_ids_wp]
O --> Q[Lazy Buffer Access<br/>External Force API]
```
Last reviewed commit: b17d5c2
can you run |
c908c66 to ce4a770
Done. Not sure why there are still files left, since this should be part of the commit hook?
Summary
Warp-first manager-based RL environment infrastructure and MDP term migration for Newton.
Infrastructure (commits 1-5, from dependency branch)
- `ManagerBasedRLEnvWarp` with `ManagerCallSwitch` for per-manager execution mode control (stable / warp / warp-captured)
- `SceneEntityCfg` with `body_ids_wp`, `joint_ids_wp`, `joint_mask` for warp kernel dispatch
- `warp_capturable` decorator and `is_warp_capturable` check for automatic CUDA graph capture fallback
- `manager_call_max_mode` per-env capture ceiling (`min(configured_mode, cap)`)
- `Scene_write_data_to_sim` capture mode (was hardcoded non-captured)

MDP terms (commit 6)
Warp-first observation, reward, termination, event, and action terms verified against torch baselines:
Terms accessing `ArticulationData` lazy `TimestampedWarpBuffer` properties (Tier 2) are marked `@warp_capturable(False)` to prevent stale data under CUDA graph capture.

Tested env configs (commit 7)
14 envs with training parity verified (warp-only and warp-capture vs torch baseline):
Per-robot gym registrations, flat env cfgs, and task-specific MDP terms (humanoid observations/rewards, velocity rewards/terminations/curriculums, reach rewards).
Disabled envs (included but registration commented out)
- `Isaac-Velocity-Rough-Anymal-D-Warp-v0`: requires `isaaclab_physx` (not yet on `dev/newton`)
- `Isaac-Reach-UR10-Warp-v0`: USD asset composition error (broken asset)

Documentation
- `WARP_MIGRATION_GAP_ANALYSIS.md`: Full MDP term catalog, per-task usage matrix, migration patterns
- `GRAPH_CAPTURE_MIGRATION.md`: ArticulationData Tier 1/2/3 property analysis, capture failure mechanism, proposed `materialize_derived()` fix
- `MANAGER_TEST_COVERAGE.md`: Capturability analysis

Test plan
- … `@warp_capturable(False)` fix
- … `isaaclab_physx` dependency)

Dependencies
- `dev-newton-warp-mig-manager-based` (pending merge into `dev/newton`)