[Test] Add tests and benchmarks for collector throughput optimizations #3567
Closed
vmoens wants to merge 2 commits into gh/vmoens/248/base from
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3567
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 4 New Failures, 1 Cancelled Job, 7 Pending, 2 Unrelated Failures
As of commit a0057af with merge base a4301ee:
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following jobs failed but were present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This was referenced Mar 24, 2026
This was referenced Mar 23, 2026
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 86.6211μs | 85.8682μs | 11.6458 KOps/s | 12.3487 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1500ms | 0.1488ms | 6.7190 KOps/s | 7.0529 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1059s | 0.1054s | 9.4863 Ops/s | 9.2432 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5983μs | 2.5906μs | 386.0180 KOps/s | 381.8893 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 39.5416μs | 39.3272μs | 25.4277 KOps/s | 27.1560 KOps/s | |
| test_simple | 0.6831s | 0.5788s | 1.7277 Ops/s | 1.7472 Ops/s | |
| test_transformed | 1.1387s | 1.1107s | 0.9003 Ops/s | 0.8936 Ops/s | |
| test_serial | 1.7775s | 1.7451s | 0.5730 Ops/s | 0.5827 Ops/s | |
| test_parallel | 1.0259s | 1.0242s | 0.9763 Ops/s | 0.9531 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1792ms | 42.2874μs | 23.6477 KOps/s | 22.2946 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 0.4704ms | 23.5894μs | 42.3919 KOps/s | 42.5008 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 58.3810μs | 23.9804μs | 41.7007 KOps/s | 38.5091 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 38.1110μs | 13.0316μs | 76.7367 KOps/s | 77.0612 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.4837ms | 46.1219μs | 21.6817 KOps/s | 21.5106 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 0.4563ms | 26.0529μs | 38.3835 KOps/s | 37.2270 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 0.4598ms | 27.2867μs | 36.6478 KOps/s | 34.6826 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 40.9210μs | 17.4017μs | 57.4658 KOps/s | 64.1690 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.4942ms | 53.8039μs | 18.5860 KOps/s | 19.6844 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 0.4569ms | 28.8030μs | 34.7186 KOps/s | 34.7619 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 0.4548ms | 27.1313μs | 36.8578 KOps/s | 34.9365 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 38.8900μs | 15.7352μs | 63.5517 KOps/s | 64.5837 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.5008ms | 53.0556μs | 18.8482 KOps/s | 18.7668 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 0.4709ms | 31.2202μs | 32.0305 KOps/s | 32.0299 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 0.4624ms | 29.7458μs | 33.6182 KOps/s | 32.1390 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 64.7810μs | 19.5456μs | 51.1624 KOps/s | 54.9365 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 0.4892ms | 52.1460μs | 19.1769 KOps/s | 19.9260 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 0.4584ms | 28.8672μs | 34.6414 KOps/s | 34.5123 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4873ms | 34.4387μs | 29.0371 KOps/s | 31.3198 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 0.4650ms | 19.0521μs | 52.4875 KOps/s | 56.6964 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 0.4960ms | 56.4040μs | 17.7292 KOps/s | 19.1059 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.4657ms | 34.0087μs | 29.4042 KOps/s | 32.1995 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 62.0810μs | 33.1202μs | 30.1931 KOps/s | 28.9274 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 0.4411ms | 20.5144μs | 48.7463 KOps/s | 49.7545 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 0.4830ms | 52.9284μs | 18.8934 KOps/s | 18.9699 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.4575ms | 34.1469μs | 29.2852 KOps/s | 29.3540 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 65.9500μs | 35.2289μs | 28.3858 KOps/s | 28.5764 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 37.8010μs | 20.0692μs | 49.8275 KOps/s | 51.0772 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.4805ms | 55.0603μs | 18.1619 KOps/s | 17.7774 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 0.4591ms | 36.8609μs | 27.1290 KOps/s | 27.1220 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.4587ms | 35.5362μs | 28.1403 KOps/s | 27.5939 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 61.6510μs | 22.3868μs | 44.6693 KOps/s | 44.3833 KOps/s | |
| test_step_and_maybe_reset_fast_path | 87.2053ms | 85.5260ms | 11.6924 Ops/s | 11.1366 Ops/s | |
| test_step_and_maybe_reset_normal | 0.1055s | 0.1040s | 9.6171 Ops/s | 9.2418 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8852s | 0.7564s | 1.3220 Ops/s | 1.2677 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7156s | 0.6157s | 1.6243 Ops/s | 1.5627 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7460s | 1.6576s | 0.6033 Ops/s | 0.5878 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5209s | 1.4403s | 0.6943 Ops/s | 0.6854 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9921s | 1.9222s | 0.5202 Ops/s | 0.5088 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7632s | 1.6978s | 0.5890 Ops/s | 0.5819 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7691s | 4.6268s | 0.2161 Ops/s | 0.2142 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5681s | 4.4202s | 0.2262 Ops/s | 0.2260 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0086s | 1.8879s | 0.5297 Ops/s | 0.5264 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6949s | 1.5888s | 0.6294 Ops/s | 0.6116 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 10.1863ms | 9.9884ms | 100.1166 Ops/s | 100.5964 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 17.3515ms | 11.5535ms | 86.5538 Ops/s | 56.6109 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2182ms | 0.1212ms | 8.2508 KOps/s | 7.6031 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.6347ms | 27.3484ms | 36.5653 Ops/s | 36.4154 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 17.6942ms | 11.5123ms | 86.8635 Ops/s | 55.9695 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 42.5242ms | 41.0299ms | 24.3725 Ops/s | 24.5521 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 12.1213ms | 11.3408ms | 88.1768 Ops/s | 56.9782 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.9484ms | 8.8605ms | 112.8602 Ops/s | 113.5722 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7175ms | 1.5266ms | 655.0491 Ops/s | 636.1047 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6644ms | 0.4311ms | 2.3199 KOps/s | 2.3791 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 30.6135ms | 30.1927ms | 33.1206 Ops/s | 33.5434 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.0519ms | 1.7682ms | 565.5609 Ops/s | 565.7122 Ops/s | |
| test_dqn_speed[False-None] | 1.8849ms | 1.4354ms | 696.6464 Ops/s | 706.7841 Ops/s | |
| test_dqn_speed[False-backward] | 2.0481ms | 1.9631ms | 509.4055 Ops/s | 513.1541 Ops/s | |
| test_dqn_speed[True-None] | 1.0366ms | 0.6114ms | 1.6356 KOps/s | 1.6841 KOps/s | |
| test_dqn_speed[True-backward] | 1.1332ms | 1.0945ms | 913.6411 Ops/s | 807.5767 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6858ms | 0.5672ms | 1.7631 KOps/s | 1.7103 KOps/s | |
| test_ddpg_speed[False-None] | 3.4359ms | 2.9756ms | 336.0687 Ops/s | 349.0732 Ops/s | |
| test_ddpg_speed[False-backward] | 4.4814ms | 4.2178ms | 237.0925 Ops/s | 242.4038 Ops/s | |
| test_ddpg_speed[True-None] | 1.9035ms | 1.5192ms | 658.2353 Ops/s | 665.3558 Ops/s | |
| test_ddpg_speed[True-backward] | 2.7025ms | 2.6248ms | 380.9760 Ops/s | 389.5716 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.8884ms | 1.4776ms | 676.7546 Ops/s | 671.3256 Ops/s | |
| test_sac_speed[False-None] | 9.0785ms | 8.4401ms | 118.4815 Ops/s | 121.6751 Ops/s | |
| test_sac_speed[False-backward] | 12.0358ms | 11.6000ms | 86.2073 Ops/s | 86.6254 Ops/s | |
| test_sac_speed[True-None] | 2.5015ms | 2.3199ms | 431.0490 Ops/s | 427.6579 Ops/s | |
| test_sac_speed[True-backward] | 4.5901ms | 4.3314ms | 230.8722 Ops/s | 224.3369 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.6798ms | 2.2920ms | 436.3029 Ops/s | 415.3176 Ops/s | |
| test_redq_speed[False-None] | 11.6382ms | 11.0698ms | 90.3357 Ops/s | 88.0689 Ops/s | |
| test_redq_speed[False-backward] | 24.4354ms | 19.2989ms | 51.8165 Ops/s | 51.6256 Ops/s | |
| test_redq_speed[True-None] | 5.1094ms | 4.8116ms | 207.8312 Ops/s | 202.3591 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 5.1579ms | 4.7992ms | 208.3681 Ops/s | 200.2564 Ops/s | |
| test_redq_deprec_speed[False-None] | 12.4341ms | 11.8440ms | 84.4308 Ops/s | 83.8649 Ops/s | |
| test_redq_deprec_speed[False-backward] | 17.4868ms | 17.0314ms | 58.7149 Ops/s | 58.0616 Ops/s | |
| test_redq_deprec_speed[True-None] | 5.8529ms | 3.9397ms | 253.8276 Ops/s | 256.9865 Ops/s | |
| test_redq_deprec_speed[True-backward] | 8.0857ms | 7.8711ms | 127.0475 Ops/s | 122.1201 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.5094ms | 3.8035ms | 262.9173 Ops/s | 260.4179 Ops/s | |
| test_td3_speed[False-None] | 8.6848ms | 8.5158ms | 117.4282 Ops/s | 120.3483 Ops/s | |
| test_td3_speed[False-backward] | 12.1670ms | 11.6529ms | 85.8157 Ops/s | 88.9187 Ops/s | |
| test_td3_speed[True-None] | 1.9859ms | 1.9347ms | 516.8712 Ops/s | 509.6013 Ops/s | |
| test_td3_speed[True-backward] | 3.9024ms | 3.7923ms | 263.6947 Ops/s | 262.2540 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.9790ms | 1.9244ms | 519.6316 Ops/s | 514.9424 Ops/s | |
| test_cql_speed[False-None] | 30.6804ms | 27.7116ms | 36.0860 Ops/s | 36.2968 Ops/s | |
| test_cql_speed[False-backward] | 41.5533ms | 37.3846ms | 26.7490 Ops/s | 26.4464 Ops/s | |
| test_cql_speed[True-None] | 13.5411ms | 13.2166ms | 75.6623 Ops/s | 74.4393 Ops/s | |
| test_cql_speed[True-backward] | 19.5632ms | 19.1666ms | 52.1742 Ops/s | 53.0975 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 13.6588ms | 13.2505ms | 75.4688 Ops/s | 75.9628 Ops/s | |
| test_a2c_speed[False-None] | 6.0828ms | 5.6367ms | 177.4083 Ops/s | 178.9845 Ops/s | |
| test_a2c_speed[False-backward] | 12.6311ms | 12.2792ms | 81.4387 Ops/s | 81.8763 Ops/s | |
| test_a2c_speed[True-None] | 4.1672ms | 3.9812ms | 251.1798 Ops/s | 248.2146 Ops/s | |
| test_a2c_speed[True-backward] | 9.9660ms | 9.1811ms | 108.9190 Ops/s | 97.3199 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.4114ms | 3.9730ms | 251.6992 Ops/s | 240.1809 Ops/s | |
| test_ppo_speed[False-None] | 6.2476ms | 6.0256ms | 165.9587 Ops/s | 158.5933 Ops/s | |
| test_ppo_speed[False-backward] | 13.3869ms | 12.9722ms | 77.0880 Ops/s | 73.5798 Ops/s | |
| test_ppo_speed[True-None] | 4.4249ms | 3.9997ms | 250.0211 Ops/s | 243.2014 Ops/s | |
| test_ppo_speed[True-backward] | 9.4834ms | 9.1034ms | 109.8494 Ops/s | 106.2088 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.3745ms | 3.9644ms | 252.2436 Ops/s | 245.3303 Ops/s | |
| test_reinforce_speed[False-None] | 5.1657ms | 4.7772ms | 209.3295 Ops/s | 203.7487 Ops/s | |
| test_reinforce_speed[False-backward] | 8.1751ms | 7.7967ms | 128.2591 Ops/s | 123.8025 Ops/s | |
| test_reinforce_speed[True-None] | 4.2903ms | 3.2173ms | 310.8227 Ops/s | 307.6721 Ops/s | |
| test_reinforce_speed[True-backward] | 8.9160ms | 8.3154ms | 120.2581 Ops/s | 116.7623 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.5401ms | 3.1345ms | 319.0334 Ops/s | 310.4296 Ops/s | |
| test_iql_speed[False-None] | 21.7200ms | 21.0026ms | 47.6130 Ops/s | 46.2171 Ops/s | |
| test_iql_speed[False-backward] | 32.4082ms | 31.6462ms | 31.5994 Ops/s | 30.4714 Ops/s | |
| test_iql_speed[True-None] | 9.4829ms | 8.9217ms | 112.0869 Ops/s | 107.1568 Ops/s | |
| test_iql_speed[True-backward] | 17.6834ms | 17.4167ms | 57.4160 Ops/s | 55.6643 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.1151ms | 8.9332ms | 111.9426 Ops/s | 108.9115 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.2525ms | 6.0741ms | 164.6346 Ops/s | 163.4351 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.1232ms | 0.3823ms | 2.6157 KOps/s | 2.5430 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7440ms | 0.3739ms | 2.6745 KOps/s | 2.8467 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1165ms | 5.8354ms | 171.3675 Ops/s | 169.0924 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.4514ms | 0.3783ms | 2.6434 KOps/s | 3.2637 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6235ms | 0.3655ms | 2.7363 KOps/s | 3.4727 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.9455ms | 1.4628ms | 683.6374 Ops/s | 723.6081 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6266ms | 1.3893ms | 719.7705 Ops/s | 773.6588 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 10.2110ms | 6.1115ms | 163.6264 Ops/s | 164.4860 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.0537ms | 0.5351ms | 1.8687 KOps/s | 2.0956 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7287ms | 0.5169ms | 1.9346 KOps/s | 2.1943 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0105ms | 5.8569ms | 170.7399 Ops/s | 169.4254 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8035ms | 0.3899ms | 2.5650 KOps/s | 2.8449 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5735ms | 0.3608ms | 2.7719 KOps/s | 2.5767 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1111ms | 5.7643ms | 173.4804 Ops/s | 170.3112 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.2335ms | 0.3425ms | 2.9196 KOps/s | 2.7959 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5447ms | 0.3038ms | 3.2912 KOps/s | 3.0294 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1402ms | 5.9560ms | 167.8976 Ops/s | 166.4414 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3244ms | 0.4880ms | 2.0492 KOps/s | 1.7467 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7095ms | 0.4888ms | 2.0457 KOps/s | 1.8374 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4164ms | 5.0628ms | 197.5182 Ops/s | 50.1798 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.1552ms | 2.0414ms | 489.8642 Ops/s | 542.7833 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.1145ms | 0.9713ms | 1.0295 KOps/s | 1.0111 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6564s | 18.1894ms | 54.9770 Ops/s | 194.5181 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 11.7122ms | 2.1125ms | 473.3724 Ops/s | 551.9697 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.2433ms | 1.2793ms | 781.6475 Ops/s | 1.0381 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.8187ms | 5.2796ms | 189.4097 Ops/s | 186.6272 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 13.9771ms | 2.1561ms | 463.8028 Ops/s | 498.8118 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.4906ms | 1.1272ms | 887.1611 Ops/s | 858.5860 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 44.1692ms | 39.9477ms | 25.0328 Ops/s | 25.0094 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.7880ms | 19.1797ms | 52.1385 Ops/s | 53.0712 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 45.5397ms | 41.5970ms | 24.0402 Ops/s | 24.1533 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 21.4341ms | 19.5720ms | 51.0934 Ops/s | 53.5111 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.9037ms | 42.9147ms | 23.3020 Ops/s | 22.7193 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 22.8237ms | 20.8926ms | 47.8639 Ops/s | 49.0439 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.9728ms | 0.2365ms | 4.2279 KOps/s | 4.3367 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.6322ms | 1.4521ms | 688.6729 Ops/s | 721.6422 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.5377ms | 2.3263ms | 429.8641 Ops/s | 415.9828 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1109ms | 2.9398ms | 340.1595 Ops/s | 338.0660 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2448ms | 0.1384ms | 7.2229 KOps/s | 7.3192 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3427ms | 0.1885ms | 5.3048 KOps/s | 4.9226 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.8835ms | 1.7721ms | 564.2887 Ops/s | 571.6501 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5047ms | 1.3109ms | 762.8141 Ops/s | 775.7378 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3332ms | 1.1352ms | 880.8721 Ops/s | 876.7077 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 7.5751ms | 3.6708ms | 272.4222 Ops/s | 279.8430 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.2430ms | 5.6947ms | 175.6014 Ops/s | 176.5123 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 12.2660ms | 7.4050ms | 135.0435 Ops/s | 141.9846 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4739ms | 0.2923ms | 3.4214 KOps/s | 3.6067 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7368ms | 1.5517ms | 644.4394 Ops/s | 668.1758 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.6125ms | 2.4695ms | 404.9404 Ops/s | 397.4790 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4299ms | 3.1648ms | 315.9774 Ops/s | 317.0195 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 33.1833ms | 32.6696ms | 30.6095 Ops/s | 30.7480 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 67.1202ms | 65.4248ms | 15.2847 Ops/s | 15.4015 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 39.7728ms | 37.9393ms | 26.3579 Ops/s | 26.8934 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 88.2179ms | 75.1779ms | 13.3018 Ops/s | 13.7069 Ops/s |
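The Change column is empty in this capture of the table. As a rough aid for reading it, the sketch below computes the relative difference between the Ops and Ops on Repo HEAD columns for three rows copied from the table. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling; the only assumption about the data is that throughput cells use the `Ops/s` and `KOps/s` suffixes seen above.

```python
# Hypothetical helpers for reading the benchmark table; not part of TorchRL
# or of the CI tooling that produced the table.

# Multipliers for the throughput suffixes that appear in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '11.6458 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Fractional change of the PR throughput relative to repo HEAD."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head


# (Ops, Ops on Repo HEAD) pairs copied verbatim from the table above.
rows = {
    "test_values[vec_generalized_advantage_estimate-True-True]": (
        "86.5538 Ops/s",
        "56.6109 Ops/s",
    ),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (
        "54.9770 Ops/s",
        "194.5181 Ops/s",
    ),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]": (
        "781.6475 Ops/s",
        "1.0381 KOps/s",
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1%}")
```

Positive values mean the PR run was faster than repo HEAD. These are single CI measurements, so a large percentage on one row is a prompt to rerun that benchmark, not a conclusion on its own.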
Stack from ghstack (oldest at bottom):
Cover all 7 performance features: `_skip_maybe_reset`, `_StepMDP` `out=` reuse,
`_trust_step_output`, `update_traj_ids`, combined optimization flags,
`torch.compile` fullgraph, and fast-path benchmarks.
Made-with: Cursor