kurtamohler
Contributor

@kurtamohler kurtamohler commented Feb 4, 2025

[ghstack-poisoned]
kurtamohler added a commit that referenced this pull request Feb 4, 2025
ghstack-source-id: 9b5d08d
Pull Request resolved: #2757
@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 4, 2025
@kurtamohler kurtamohler requested a review from vmoens February 4, 2025 22:36
@pytorch-bot

pytorch-bot bot commented Feb 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2757

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (4 Unrelated Failures)

As of commit 631d908 with merge base 4c06ce2:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

github-actions bot commented Feb 4, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.5437s 0.4579s 2.1837 Ops/s 2.2023 Ops/s $\color{#d91a1a}-0.84\%$
test_transformed 1.0383s 0.9540s 1.0482 Ops/s 1.1056 Ops/s $\textbf{\color{#d91a1a}-5.20\%}$
test_serial 1.3876s 1.3839s 0.7226 Ops/s 0.7285 Ops/s $\color{#d91a1a}-0.81\%$
test_parallel 1.3183s 1.2263s 0.8155 Ops/s 0.8182 Ops/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-True-True-True] 0.1986ms 30.4747μs 32.8141 KOps/s 33.0770 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[True-True-True-True-False] 57.1670μs 18.1950μs 54.9602 KOps/s 56.1746 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[True-True-True-False-True] 58.4690μs 17.2891μs 57.8398 KOps/s 58.8119 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[True-True-True-False-False] 33.9230μs 10.2248μs 97.8019 KOps/s 99.0904 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-True-False-True-True] 87.5510μs 32.0896μs 31.1628 KOps/s 31.1915 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-True-False-True-False] 44.5630μs 19.9296μs 50.1766 KOps/s 50.2143 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-True-False-False-True] 0.6574ms 19.0366μs 52.5304 KOps/s 52.4668 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-False-False-False] 35.8370μs 12.0851μs 82.7462 KOps/s 83.2444 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[True-False-True-True-True] 84.8390μs 34.2148μs 29.2271 KOps/s 29.2654 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-False-True-True-False] 70.2010μs 21.8473μs 45.7723 KOps/s 46.0739 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[True-False-True-False-True] 65.2220μs 19.0590μs 52.4686 KOps/s 52.7153 KOps/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[True-False-True-False-False] 64.6510μs 12.0735μs 82.8257 KOps/s 83.3573 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-False-True-True] 84.2980μs 36.4454μs 27.4383 KOps/s 27.8534 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-False-False-True-False] 70.1510μs 23.5755μs 42.4170 KOps/s 42.8311 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-False-False-False-True] 73.1670μs 20.9025μs 47.8411 KOps/s 48.6721 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[True-False-False-False-False] 37.7300μs 13.8301μs 72.3059 KOps/s 72.8816 KOps/s $\color{#d91a1a}-0.79\%$
test_step_mdp_speed[False-True-True-True-True] 91.0390μs 34.3957μs 29.0734 KOps/s 29.2493 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-True-True-True-False] 74.9410μs 21.7601μs 45.9557 KOps/s 46.3300 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-True-True-False-True] 65.3000μs 21.8267μs 45.8154 KOps/s 46.2429 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[False-True-True-False-False] 68.2870μs 13.4977μs 74.0864 KOps/s 74.5941 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[False-True-False-True-True] 93.1250μs 35.7055μs 28.0069 KOps/s 28.2857 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[False-True-False-True-False] 73.1950μs 23.5121μs 42.5313 KOps/s 42.6816 KOps/s $\color{#d91a1a}-0.35\%$
test_step_mdp_speed[False-True-False-False-True] 2.4478ms 23.6516μs 42.2804 KOps/s 42.5540 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-True-False-False-False] 56.5350μs 15.2972μs 65.3714 KOps/s 65.7175 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[False-False-True-True-True] 0.6648ms 38.0039μs 26.3131 KOps/s 26.8502 KOps/s $\color{#d91a1a}-2.00\%$
test_step_mdp_speed[False-False-True-True-False] 69.7890μs 25.4166μs 39.3443 KOps/s 39.8896 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[False-False-True-False-True] 70.3410μs 23.6011μs 42.3709 KOps/s 43.2475 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-False-True-False-False] 59.0930μs 15.2231μs 65.6895 KOps/s 65.7181 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-False-False-True-True] 94.0560μs 39.0112μs 25.6336 KOps/s 25.7511 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-False-False-True-False] 0.1023ms 27.3594μs 36.5505 KOps/s 37.2792 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-False-False-False-True] 75.8120μs 25.0669μs 39.8932 KOps/s 40.1798 KOps/s $\color{#d91a1a}-0.71\%$
test_step_mdp_speed[False-False-False-False-False] 60.1930μs 16.8145μs 59.4725 KOps/s 59.9268 KOps/s $\color{#d91a1a}-0.76\%$
test_values[generalized_advantage_estimate-True-True] 10.0852ms 9.8422ms 101.6037 Ops/s 101.6382 Ops/s $\color{#d91a1a}-0.03\%$
test_values[vec_generalized_advantage_estimate-True-True] 27.4781ms 25.9920ms 38.4734 Ops/s 41.3267 Ops/s $\textbf{\color{#d91a1a}-6.90\%}$
test_values[td0_return_estimate-False-False] 0.2264ms 0.2012ms 4.9704 KOps/s 4.7115 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_values[td1_return_estimate-False-False] 27.3120ms 24.1461ms 41.4146 Ops/s 41.6439 Ops/s $\color{#d91a1a}-0.55\%$
test_values[vec_td1_return_estimate-False-False] 28.4309ms 26.6459ms 37.5293 Ops/s 41.2855 Ops/s $\textbf{\color{#d91a1a}-9.10\%}$
test_values[td_lambda_return_estimate-True-False] 37.9282ms 34.6338ms 28.8735 Ops/s 28.8471 Ops/s $\color{#35bf28}+0.09\%$
test_values[vec_td_lambda_return_estimate-True-False] 28.4801ms 26.5710ms 37.6351 Ops/s 41.2179 Ops/s $\textbf{\color{#d91a1a}-8.69\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 11.8102ms 8.4956ms 117.7074 Ops/s 118.3471 Ops/s $\color{#d91a1a}-0.54\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.5546ms 1.9880ms 503.0194 Ops/s 494.7346 Ops/s $\color{#35bf28}+1.67\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6922ms 0.3625ms 2.7588 KOps/s 2.7492 KOps/s $\color{#35bf28}+0.35\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 54.2643ms 42.4165ms 23.5757 Ops/s 25.9104 Ops/s $\textbf{\color{#d91a1a}-9.01\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.4508ms 3.5319ms 283.1375 Ops/s 288.7736 Ops/s $\color{#d91a1a}-1.95\%$
test_dqn_speed[False-None] 6.0190ms 1.3958ms 716.4387 Ops/s 719.4396 Ops/s $\color{#d91a1a}-0.42\%$
test_dqn_speed[False-backward] 1.9724ms 1.8857ms 530.3197 Ops/s 533.0298 Ops/s $\color{#d91a1a}-0.51\%$
test_dqn_speed[True-None] 0.8111ms 0.4798ms 2.0843 KOps/s 2.0591 KOps/s $\color{#35bf28}+1.23\%$
test_dqn_speed[True-backward] 0.9416ms 0.9048ms 1.1052 KOps/s 752.1154 Ops/s $\textbf{\color{#35bf28}+46.95\%}$
test_dqn_speed[reduce-overhead-None] 0.6927ms 0.4880ms 2.0492 KOps/s 2.0177 KOps/s $\color{#35bf28}+1.56\%$
test_dqn_speed[reduce-overhead-backward] 0.9940ms 0.9220ms 1.0846 KOps/s 1.0746 KOps/s $\color{#35bf28}+0.93\%$
test_ddpg_speed[False-None] 0.1673s 3.3728ms 296.4917 Ops/s 348.3203 Ops/s $\textbf{\color{#d91a1a}-14.88\%}$
test_ddpg_speed[False-backward] 4.3409ms 4.0450ms 247.2183 Ops/s 249.2191 Ops/s $\color{#d91a1a}-0.80\%$
test_ddpg_speed[True-None] 1.6308ms 1.2393ms 806.9115 Ops/s 800.6540 Ops/s $\color{#35bf28}+0.78\%$
test_ddpg_speed[True-backward] 2.2355ms 2.1560ms 463.8135 Ops/s 461.3061 Ops/s $\color{#35bf28}+0.54\%$
test_ddpg_speed[reduce-overhead-None] 1.7530ms 1.2342ms 810.2244 Ops/s 796.4045 Ops/s $\color{#35bf28}+1.74\%$
test_ddpg_speed[reduce-overhead-backward] 2.2217ms 2.1378ms 467.7804 Ops/s 462.8028 Ops/s $\color{#35bf28}+1.08\%$
test_sac_speed[False-None] 8.5489ms 8.0197ms 124.6934 Ops/s 126.0428 Ops/s $\color{#d91a1a}-1.07\%$
test_sac_speed[False-backward] 12.6801ms 10.7556ms 92.9744 Ops/s 93.7810 Ops/s $\color{#d91a1a}-0.86\%$
test_sac_speed[True-None] 2.5344ms 2.0900ms 478.4703 Ops/s 468.7309 Ops/s $\color{#35bf28}+2.08\%$
test_sac_speed[True-backward] 3.8958ms 3.7827ms 264.3583 Ops/s 261.8114 Ops/s $\color{#35bf28}+0.97\%$
test_sac_speed[reduce-overhead-None] 2.6277ms 2.0922ms 477.9728 Ops/s 471.4294 Ops/s $\color{#35bf28}+1.39\%$
test_sac_speed[reduce-overhead-backward] 3.8689ms 3.7618ms 265.8290 Ops/s 261.7248 Ops/s $\color{#35bf28}+1.57\%$
test_redq_speed[False-None] 14.4964ms 12.7283ms 78.5652 Ops/s 77.9159 Ops/s $\color{#35bf28}+0.83\%$
test_redq_speed[False-backward] 23.3685ms 22.0642ms 45.3223 Ops/s 45.2095 Ops/s $\color{#35bf28}+0.25\%$
test_redq_speed[True-None] 5.4491ms 4.8369ms 206.7451 Ops/s 199.3167 Ops/s $\color{#35bf28}+3.73\%$
test_redq_speed[True-backward] 13.8237ms 12.3252ms 81.1345 Ops/s 79.5214 Ops/s $\color{#35bf28}+2.03\%$
test_redq_speed[reduce-overhead-None] 5.4912ms 4.8827ms 204.8045 Ops/s 198.2501 Ops/s $\color{#35bf28}+3.31\%$
test_redq_speed[reduce-overhead-backward] 13.3245ms 12.2801ms 81.4323 Ops/s 79.8346 Ops/s $\color{#35bf28}+2.00\%$
test_redq_deprec_speed[False-None] 13.9650ms 12.8968ms 77.5389 Ops/s 73.5624 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_redq_deprec_speed[False-backward] 20.3675ms 18.5532ms 53.8989 Ops/s 53.9516 Ops/s $\color{#d91a1a}-0.10\%$
test_redq_deprec_speed[True-None] 4.4036ms 3.7931ms 263.6391 Ops/s 259.6311 Ops/s $\color{#35bf28}+1.54\%$
test_redq_deprec_speed[True-backward] 9.1053ms 8.2079ms 121.8339 Ops/s 121.6028 Ops/s $\color{#35bf28}+0.19\%$
test_redq_deprec_speed[reduce-overhead-None] 4.5886ms 3.8830ms 257.5301 Ops/s 259.1995 Ops/s $\color{#d91a1a}-0.64\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.7854ms 8.2442ms 121.2971 Ops/s 120.3352 Ops/s $\color{#35bf28}+0.80\%$
test_td3_speed[False-None] 8.2128ms 7.9753ms 125.3876 Ops/s 124.8717 Ops/s $\color{#35bf28}+0.41\%$
test_td3_speed[False-backward] 12.2184ms 10.4256ms 95.9173 Ops/s 95.9700 Ops/s $\color{#d91a1a}-0.05\%$
test_td3_speed[True-None] 1.9455ms 1.7815ms 561.3332 Ops/s 551.2006 Ops/s $\color{#35bf28}+1.84\%$
test_td3_speed[True-backward] 3.4745ms 3.3747ms 296.3200 Ops/s 291.2053 Ops/s $\color{#35bf28}+1.76\%$
test_td3_speed[reduce-overhead-None] 1.9580ms 1.7703ms 564.8743 Ops/s 546.8918 Ops/s $\color{#35bf28}+3.29\%$
test_td3_speed[reduce-overhead-backward] 3.5073ms 3.3858ms 295.3498 Ops/s 287.5843 Ops/s $\color{#35bf28}+2.70\%$
test_cql_speed[False-None] 39.0189ms 36.3239ms 27.5301 Ops/s 27.2850 Ops/s $\color{#35bf28}+0.90\%$
test_cql_speed[False-backward] 50.6893ms 46.8699ms 21.3357 Ops/s 21.3788 Ops/s $\color{#d91a1a}-0.20\%$
test_cql_speed[True-None] 17.0015ms 15.9207ms 62.8111 Ops/s 62.0138 Ops/s $\color{#35bf28}+1.29\%$
test_cql_speed[True-backward] 24.1789ms 22.6268ms 44.1954 Ops/s 43.4253 Ops/s $\color{#35bf28}+1.77\%$
test_cql_speed[reduce-overhead-None] 17.1639ms 16.0979ms 62.1198 Ops/s 62.2971 Ops/s $\color{#d91a1a}-0.28\%$
test_cql_speed[reduce-overhead-backward] 23.8126ms 22.7955ms 43.8683 Ops/s 43.2695 Ops/s $\color{#35bf28}+1.38\%$
test_a2c_speed[False-None] 7.8213ms 7.1727ms 139.4179 Ops/s 139.7766 Ops/s $\color{#d91a1a}-0.26\%$
test_a2c_speed[False-backward] 16.0462ms 14.3339ms 69.7646 Ops/s 70.6696 Ops/s $\color{#d91a1a}-1.28\%$
test_a2c_speed[True-None] 3.9573ms 3.6952ms 270.6186 Ops/s 266.3287 Ops/s $\color{#35bf28}+1.61\%$
test_a2c_speed[True-backward] 10.6871ms 10.0208ms 99.7925 Ops/s 99.0219 Ops/s $\color{#35bf28}+0.78\%$
test_a2c_speed[reduce-overhead-None] 4.5191ms 3.6916ms 270.8884 Ops/s 268.1113 Ops/s $\color{#35bf28}+1.04\%$
test_a2c_speed[reduce-overhead-backward] 11.3011ms 10.0272ms 99.7283 Ops/s 98.4535 Ops/s $\color{#35bf28}+1.29\%$
test_ppo_speed[False-None] 8.4530ms 7.4880ms 133.5477 Ops/s 132.2198 Ops/s $\color{#35bf28}+1.00\%$
test_ppo_speed[False-backward] 15.1663ms 14.7604ms 67.7490 Ops/s 68.7220 Ops/s $\color{#d91a1a}-1.42\%$
test_ppo_speed[True-None] 5.0331ms 4.0781ms 245.2112 Ops/s 244.1458 Ops/s $\color{#35bf28}+0.44\%$
test_ppo_speed[True-backward] 10.4064ms 9.8781ms 101.2338 Ops/s 100.1121 Ops/s $\color{#35bf28}+1.12\%$
test_ppo_speed[reduce-overhead-None] 5.0805ms 4.0737ms 245.4764 Ops/s 243.9399 Ops/s $\color{#35bf28}+0.63\%$
test_ppo_speed[reduce-overhead-backward] 10.4674ms 9.8612ms 101.4071 Ops/s 100.6624 Ops/s $\color{#35bf28}+0.74\%$
test_reinforce_speed[False-None] 7.6504ms 6.5241ms 153.2780 Ops/s 153.0114 Ops/s $\color{#35bf28}+0.17\%$
test_reinforce_speed[False-backward] 10.4025ms 9.7804ms 102.2456 Ops/s 101.9486 Ops/s $\color{#35bf28}+0.29\%$
test_reinforce_speed[True-None] 3.3544ms 3.0195ms 331.1853 Ops/s 330.0497 Ops/s $\color{#35bf28}+0.34\%$
test_reinforce_speed[True-backward] 9.5786ms 8.8957ms 112.4145 Ops/s 112.2574 Ops/s $\color{#35bf28}+0.14\%$
test_reinforce_speed[reduce-overhead-None] 3.6798ms 3.0223ms 330.8778 Ops/s 327.5466 Ops/s $\color{#35bf28}+1.02\%$
test_reinforce_speed[reduce-overhead-backward] 9.9105ms 8.9264ms 112.0277 Ops/s 111.0754 Ops/s $\color{#35bf28}+0.86\%$
test_iql_speed[False-None] 33.0365ms 31.8278ms 31.4190 Ops/s 30.4832 Ops/s $\color{#35bf28}+3.07\%$
test_iql_speed[False-backward] 47.0337ms 44.6353ms 22.4038 Ops/s 21.9564 Ops/s $\color{#35bf28}+2.04\%$
test_iql_speed[True-None] 12.2519ms 11.0285ms 90.6745 Ops/s 88.2493 Ops/s $\color{#35bf28}+2.75\%$
test_iql_speed[True-backward] 23.8710ms 22.0807ms 45.2884 Ops/s 44.6611 Ops/s $\color{#35bf28}+1.40\%$
test_iql_speed[reduce-overhead-None] 12.0556ms 11.1314ms 89.8362 Ops/s 88.4970 Ops/s $\color{#35bf28}+1.51\%$
test_iql_speed[reduce-overhead-backward] 22.7721ms 21.9369ms 45.5853 Ops/s 44.5910 Ops/s $\color{#35bf28}+2.23\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4385ms 4.8276ms 207.1413 Ops/s 203.7983 Ops/s $\color{#35bf28}+1.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9009ms 0.5372ms 1.8616 KOps/s 1.8732 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7501ms 0.5119ms 1.9535 KOps/s 1.9499 KOps/s $\color{#35bf28}+0.18\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.7873ms 4.5370ms 220.4115 Ops/s 213.4718 Ops/s $\color{#35bf28}+3.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.6048ms 0.5240ms 1.9083 KOps/s 1.9073 KOps/s $\color{#35bf28}+0.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8558ms 0.5041ms 1.9839 KOps/s 1.9963 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2126ms 1.6988ms 588.6649 Ops/s 589.3115 Ops/s $\color{#d91a1a}-0.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2476ms 1.6180ms 618.0401 Ops/s 620.3871 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.5174ms 4.7723ms 209.5433 Ops/s 205.8424 Ops/s $\color{#35bf28}+1.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.7217ms 0.6740ms 1.4836 KOps/s 1.4938 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9401ms 0.6494ms 1.5398 KOps/s 1.5472 KOps/s $\color{#d91a1a}-0.48\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.2691ms 4.6443ms 215.3174 Ops/s 210.5336 Ops/s $\color{#35bf28}+2.27\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.6766ms 0.5398ms 1.8527 KOps/s 1.8758 KOps/s $\color{#d91a1a}-1.23\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8105ms 0.5147ms 1.9429 KOps/s 1.9332 KOps/s $\color{#35bf28}+0.50\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2994ms 4.6021ms 217.2911 Ops/s 216.3074 Ops/s $\color{#35bf28}+0.45\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8687ms 0.5282ms 1.8932 KOps/s 1.9050 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8215ms 0.5106ms 1.9586 KOps/s 1.9718 KOps/s $\color{#d91a1a}-0.67\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8918ms 4.7457ms 210.7161 Ops/s 208.2217 Ops/s $\color{#35bf28}+1.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.7797ms 0.6752ms 1.4811 KOps/s 1.4806 KOps/s $\color{#35bf28}+0.03\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9161ms 0.6514ms 1.5352 KOps/s 1.5562 KOps/s $\color{#d91a1a}-1.35\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4432ms 4.1950ms 238.3799 Ops/s 247.9693 Ops/s $\color{#d91a1a}-3.87\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.4591ms 2.2323ms 447.9667 Ops/s 413.0767 Ops/s $\textbf{\color{#35bf28}+8.45\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.2330ms 1.4370ms 695.9153 Ops/s 722.7077 Ops/s $\color{#d91a1a}-3.71\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3935s 12.1252ms 82.4729 Ops/s 36.6469 Ops/s $\textbf{\color{#35bf28}+125.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.4635ms 2.3161ms 431.7634 Ops/s 427.2284 Ops/s $\color{#35bf28}+1.06\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.8293ms 1.3563ms 737.2771 Ops/s 744.2426 Ops/s $\color{#d91a1a}-0.94\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7077ms 4.3764ms 228.5004 Ops/s 224.9647 Ops/s $\color{#35bf28}+1.57\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.7163ms 2.5328ms 394.8126 Ops/s 390.5278 Ops/s $\color{#35bf28}+1.10\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 4.2813ms 1.4649ms 682.6408 Ops/s 641.5236 Ops/s $\textbf{\color{#35bf28}+6.41\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.0030ms 11.5703ms 86.4281 Ops/s 82.8552 Ops/s $\color{#35bf28}+4.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 14.7272ms 13.9749ms 71.5571 Ops/s 68.4913 Ops/s $\color{#35bf28}+4.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.3830ms 20.4145ms 48.9849 Ops/s 47.6739 Ops/s $\color{#35bf28}+2.75\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 23.5044ms 14.6825ms 68.1081 Ops/s 68.4622 Ops/s $\color{#d91a1a}-0.52\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.6316ms 20.2941ms 49.2755 Ops/s 47.8322 Ops/s $\color{#35bf28}+3.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.6560ms 15.5329ms 64.3794 Ops/s 63.0389 Ops/s $\color{#35bf28}+2.13\%$
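
For readers checking the table by hand, here is a minimal Python sketch of how the Change column and the Improved/Worsened counts can be reproduced from the two Ops columns. It is not taken from the benchmark workflow's source. The formula (relative difference against repo HEAD) is inferred from the rows above, and the 5% cutoff is an assumption that happens to match the bolded rows and the 6/6 counts in the CPU summary. The helper names `parse_ops`, `percent_change`, and `tally` are hypothetical.

```python
# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text):
    """Convert a string such as '1.1052 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr, ops_head):
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def tally(changes, threshold=5.0):
    """Count rows beyond +/- threshold percent (assumed cutoff, see lead-in)."""
    improved = sum(1 for c in changes if c > threshold)
    worsened = sum(1 for c in changes if c < -threshold)
    return improved, worsened


# Three rows copied from the CPU table: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_simple", "2.1837 Ops/s", "2.2023 Ops/s"),
    ("test_transformed", "1.0482 Ops/s", "1.1056 Ops/s"),
    ("test_dqn_speed[True-backward]", "1.1052 KOps/s", "752.1154 Ops/s"),
]

changes = [percent_change(parse_ops(pr), parse_ops(head)) for _, pr, head in rows]
for (name, _, _), change in zip(rows, changes):
    print(f"{name}: {change:+.2f}%")
print("improved, worsened:", tally(changes))
```

Because the table shows rounded Ops values, a recomputed percentage can differ from the reported one in the last digit. Note that mixed units (KOps/s against Ops/s) must be normalized before comparing, as in the `test_dqn_speed[True-backward]` row.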

@github-actions

github-actions bot commented Feb 4, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}28$. Worsened: $\large\color{#d91a1a}4$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.8172s 0.7336s 1.3631 Ops/s 1.3583 Ops/s $\color{#35bf28}+0.35\%$
test_transformed 1.2670s 1.2659s 0.7900 Ops/s 0.7612 Ops/s $\color{#35bf28}+3.79\%$
test_serial 2.1034s 2.1011s 0.4759 Ops/s 0.4679 Ops/s $\color{#35bf28}+1.72\%$
test_parallel 1.8471s 1.8218s 0.5489 Ops/s 0.5414 Ops/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-True-True-True-True] 0.1881ms 38.4382μs 26.0158 KOps/s 26.5442 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[True-True-True-True-False] 53.4210μs 22.3255μs 44.7919 KOps/s 44.5984 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-True-True-False-True] 51.4410μs 21.0276μs 47.5566 KOps/s 47.2296 KOps/s $\color{#35bf28}+0.69\%$
test_step_mdp_speed[True-True-True-False-False] 68.9410μs 12.0404μs 83.0534 KOps/s 79.3643 KOps/s $\color{#35bf28}+4.65\%$
test_step_mdp_speed[True-True-False-True-True] 85.8120μs 40.1383μs 24.9139 KOps/s 24.1922 KOps/s $\color{#35bf28}+2.98\%$
test_step_mdp_speed[True-True-False-True-False] 76.0110μs 24.3633μs 41.0453 KOps/s 40.3080 KOps/s $\color{#35bf28}+1.83\%$
test_step_mdp_speed[True-True-False-False-True] 57.4010μs 23.1625μs 43.1733 KOps/s 41.4035 KOps/s $\color{#35bf28}+4.27\%$
test_step_mdp_speed[True-True-False-False-False] 42.6800μs 14.5484μs 68.7363 KOps/s 66.2981 KOps/s $\color{#35bf28}+3.68\%$
test_step_mdp_speed[True-False-True-True-True] 0.7236ms 42.5144μs 23.5214 KOps/s 22.9346 KOps/s $\color{#35bf28}+2.56\%$
test_step_mdp_speed[True-False-True-True-False] 52.9010μs 26.2736μs 38.0610 KOps/s 37.0758 KOps/s $\color{#35bf28}+2.66\%$
test_step_mdp_speed[True-False-True-False-True] 55.2210μs 23.4613μs 42.6233 KOps/s 41.8931 KOps/s $\color{#35bf28}+1.74\%$
test_step_mdp_speed[True-False-True-False-False] 42.7310μs 14.6721μs 68.1567 KOps/s 67.9551 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[True-False-False-True-True] 77.2220μs 44.7816μs 22.3306 KOps/s 22.6197 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[True-False-False-True-False] 55.1800μs 28.2277μs 35.4262 KOps/s 34.3073 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[True-False-False-False-True] 0.2134ms 24.8474μs 40.2456 KOps/s 39.1524 KOps/s $\color{#35bf28}+2.79\%$
test_step_mdp_speed[True-False-False-False-False] 48.3810μs 16.6194μs 60.1705 KOps/s 59.5446 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-True-True-True] 72.6010μs 42.6570μs 23.4428 KOps/s 23.4458 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-True-True-True-False] 0.1077ms 26.3496μs 37.9512 KOps/s 38.1988 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-True-True-False-True] 2.6853ms 27.0325μs 36.9925 KOps/s 37.2384 KOps/s $\color{#d91a1a}-0.66\%$
test_step_mdp_speed[False-True-True-False-False] 47.5210μs 16.0162μs 62.4369 KOps/s 63.4053 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-True-False-True-True] 86.8210μs 44.5789μs 22.4321 KOps/s 22.3959 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[False-True-False-True-False] 59.6610μs 28.5658μs 35.0069 KOps/s 34.5277 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-False-False-True] 65.2710μs 29.3242μs 34.1015 KOps/s 33.1056 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[False-True-False-False-False] 45.5000μs 18.1913μs 54.9714 KOps/s 54.0285 KOps/s $\color{#35bf28}+1.75\%$
test_step_mdp_speed[False-False-True-True-True] 81.5020μs 46.7760μs 21.3785 KOps/s 21.6983 KOps/s $\color{#d91a1a}-1.47\%$
test_step_mdp_speed[False-False-True-True-False] 64.8110μs 31.3350μs 31.9132 KOps/s 31.4722 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-False-True-False-True] 62.9010μs 29.3840μs 34.0321 KOps/s 33.7533 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-False-True-False-False] 43.2910μs 18.3914μs 54.3731 KOps/s 54.0085 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-False-False-True-True] 82.6210μs 48.7745μs 20.5025 KOps/s 20.4818 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-False-False-True-False] 61.7210μs 32.9963μs 30.3064 KOps/s 29.8156 KOps/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[False-False-False-False-True] 72.9710μs 30.7594μs 32.5104 KOps/s 31.7397 KOps/s $\color{#35bf28}+2.43\%$
test_step_mdp_speed[False-False-False-False-False] 56.9310μs 20.5904μs 48.5664 KOps/s 48.1192 KOps/s $\color{#35bf28}+0.93\%$
test_values[generalized_advantage_estimate-True-True] 26.8908ms 24.9457ms 40.0871 Ops/s 40.3804 Ops/s $\color{#d91a1a}-0.73\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1222s 3.3439ms 299.0542 Ops/s 305.3724 Ops/s $\color{#d91a1a}-2.07\%$
test_values[td0_return_estimate-False-False] 0.1064ms 79.7556μs 12.5383 KOps/s 12.5084 KOps/s $\color{#35bf28}+0.24\%$
test_values[td1_return_estimate-False-False] 56.2864ms 55.3517ms 18.0663 Ops/s 18.0633 Ops/s $\color{#35bf28}+0.02\%$
test_values[vec_td1_return_estimate-False-False] 1.2927ms 1.0864ms 920.4589 Ops/s 918.8694 Ops/s $\color{#35bf28}+0.17\%$
test_values[td_lambda_return_estimate-True-False] 88.5104ms 87.5362ms 11.4238 Ops/s 10.8961 Ops/s $\color{#35bf28}+4.84\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3934ms 1.0922ms 915.5485 Ops/s 890.3609 Ops/s $\color{#35bf28}+2.83\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.2030ms 24.8199ms 40.2903 Ops/s 38.1464 Ops/s $\textbf{\color{#35bf28}+5.62\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0471ms 0.7544ms 1.3256 KOps/s 1.2530 KOps/s $\textbf{\color{#35bf28}+5.80\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8081ms 0.6735ms 1.4848 KOps/s 1.4092 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6407ms 1.4852ms 673.3214 Ops/s 658.9621 Ops/s $\color{#35bf28}+2.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8338ms 0.6905ms 1.4482 KOps/s 1.3931 KOps/s $\color{#35bf28}+3.95\%$
test_dqn_speed[False-None] 6.9309ms 1.5131ms 660.8884 Ops/s 665.7307 Ops/s $\color{#d91a1a}-0.73\%$
test_dqn_speed[False-backward] 2.1598ms 2.1099ms 473.9629 Ops/s 469.0834 Ops/s $\color{#35bf28}+1.04\%$
test_dqn_speed[True-None] 0.6835ms 0.5365ms 1.8639 KOps/s 1.8286 KOps/s $\color{#35bf28}+1.93\%$
test_dqn_speed[True-backward] 1.1741ms 1.0949ms 913.3043 Ops/s 826.3046 Ops/s $\textbf{\color{#35bf28}+10.53\%}$
test_dqn_speed[reduce-overhead-None] 0.7100ms 0.5576ms 1.7933 KOps/s 1.7553 KOps/s $\color{#35bf28}+2.17\%$
test_dqn_speed[reduce-overhead-backward] 0.9974ms 0.9374ms 1.0668 KOps/s 1.0491 KOps/s $\color{#35bf28}+1.68\%$
test_ddpg_speed[False-None] 3.1340ms 2.8311ms 353.2161 Ops/s 348.3508 Ops/s $\color{#35bf28}+1.40\%$
test_ddpg_speed[False-backward] 4.4468ms 4.1021ms 243.7756 Ops/s 241.0595 Ops/s $\color{#35bf28}+1.13\%$
test_ddpg_speed[True-None] 1.4885ms 1.3129ms 761.6912 Ops/s 751.6750 Ops/s $\color{#35bf28}+1.33\%$
test_ddpg_speed[True-backward] 2.4279ms 2.3666ms 422.5444 Ops/s 391.3956 Ops/s $\textbf{\color{#35bf28}+7.96\%}$
test_ddpg_speed[reduce-overhead-None] 1.4630ms 1.3183ms 758.5537 Ops/s 749.0104 Ops/s $\color{#35bf28}+1.27\%$
test_ddpg_speed[reduce-overhead-backward] 1.9819ms 1.8496ms 540.6440 Ops/s 495.9168 Ops/s $\textbf{\color{#35bf28}+9.02\%}$
test_sac_speed[False-None] 8.3235ms 7.9076ms 126.4614 Ops/s 124.3796 Ops/s $\color{#35bf28}+1.67\%$
test_sac_speed[False-backward] 11.3989ms 10.8572ms 92.1052 Ops/s 88.9807 Ops/s $\color{#35bf28}+3.51\%$
test_sac_speed[True-None] 1.9548ms 1.7926ms 557.8536 Ops/s 551.8313 Ops/s $\color{#35bf28}+1.09\%$
test_sac_speed[True-backward] 3.6391ms 3.4696ms 288.2177 Ops/s 268.0415 Ops/s $\textbf{\color{#35bf28}+7.53\%}$
test_sac_speed[reduce-overhead-None] 21.0149ms 11.9491ms 83.6884 Ops/s 83.6245 Ops/s $\color{#35bf28}+0.08\%$
test_sac_speed[reduce-overhead-backward] 1.6449ms 1.5862ms 630.4379 Ops/s 553.7473 Ops/s $\textbf{\color{#35bf28}+13.85\%}$
test_redq_speed[False-None] 7.7830ms 7.3596ms 135.8762 Ops/s 130.8106 Ops/s $\color{#35bf28}+3.87\%$
test_redq_speed[False-backward] 11.6631ms 11.2209ms 89.1198 Ops/s 84.6074 Ops/s $\textbf{\color{#35bf28}+5.33\%}$
test_redq_speed[True-None] 2.4342ms 2.2437ms 445.6988 Ops/s 417.6697 Ops/s $\textbf{\color{#35bf28}+6.71\%}$
test_redq_speed[True-backward] 4.1307ms 3.9452ms 253.4738 Ops/s 250.6367 Ops/s $\color{#35bf28}+1.13\%$
test_redq_speed[reduce-overhead-None] 2.4166ms 2.2619ms 442.0993 Ops/s 420.8108 Ops/s $\textbf{\color{#35bf28}+5.06\%}$
test_redq_speed[reduce-overhead-backward] 4.1706ms 3.9223ms 254.9531 Ops/s 238.6853 Ops/s $\textbf{\color{#35bf28}+6.82\%}$
test_redq_deprec_speed[False-None] 9.4490ms 8.9235ms 112.0631 Ops/s 108.3846 Ops/s $\color{#35bf28}+3.39\%$
test_redq_deprec_speed[False-backward] 12.3726ms 11.9080ms 83.9775 Ops/s 78.0462 Ops/s $\textbf{\color{#35bf28}+7.60\%}$
test_redq_deprec_speed[True-None] 2.7483ms 2.5949ms 385.3709 Ops/s 383.7859 Ops/s $\color{#35bf28}+0.41\%$
test_redq_deprec_speed[True-backward] 4.6839ms 4.2244ms 236.7193 Ops/s 233.6382 Ops/s $\color{#35bf28}+1.32\%$
test_redq_deprec_speed[reduce-overhead-None] 3.6445ms 2.6022ms 384.2927 Ops/s 379.1015 Ops/s $\color{#35bf28}+1.37\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.3459ms 4.2258ms 236.6389 Ops/s 228.9641 Ops/s $\color{#35bf28}+3.35\%$
test_td3_speed[False-None] 8.0192ms 7.8505ms 127.3804 Ops/s 124.2217 Ops/s $\color{#35bf28}+2.54\%$
test_td3_speed[False-backward] 10.7567ms 10.1955ms 98.0825 Ops/s 96.4538 Ops/s $\color{#35bf28}+1.69\%$
test_td3_speed[True-None] 1.6216ms 1.5945ms 627.1428 Ops/s 607.6726 Ops/s $\color{#35bf28}+3.20\%$
test_td3_speed[True-backward] 3.2025ms 3.1066ms 321.8948 Ops/s 310.2099 Ops/s $\color{#35bf28}+3.77\%$
test_td3_speed[reduce-overhead-None] 48.4074ms 25.0449ms 39.9282 Ops/s 38.1651 Ops/s $\color{#35bf28}+4.62\%$
test_td3_speed[reduce-overhead-backward] 1.3881ms 1.3224ms 756.1971 Ops/s 721.9167 Ops/s $\color{#35bf28}+4.75\%$
test_cql_speed[False-None] 16.8819ms 16.4918ms 60.6362 Ops/s 59.4144 Ops/s $\color{#35bf28}+2.06\%$
test_cql_speed[False-backward] 22.2317ms 21.6774ms 46.1309 Ops/s 45.5356 Ops/s $\color{#35bf28}+1.31\%$
test_cql_speed[True-None] 3.5112ms 3.1982ms 312.6755 Ops/s 299.9572 Ops/s $\color{#35bf28}+4.24\%$
test_cql_speed[True-backward] 6.0237ms 5.3899ms 185.5320 Ops/s 178.3279 Ops/s $\color{#35bf28}+4.04\%$
test_cql_speed[reduce-overhead-None] 21.1368ms 13.0239ms 76.7818 Ops/s 79.4311 Ops/s $\color{#d91a1a}-3.34\%$
test_cql_speed[reduce-overhead-backward] 1.9085ms 1.8024ms 554.8071 Ops/s 537.8812 Ops/s $\color{#35bf28}+3.15\%$
test_a2c_speed[False-None] 3.3529ms 3.1591ms 316.5452 Ops/s 307.7458 Ops/s $\color{#35bf28}+2.86\%$
test_a2c_speed[False-backward] 6.7126ms 6.0538ms 165.1868 Ops/s 157.5194 Ops/s $\color{#35bf28}+4.87\%$
test_a2c_speed[True-None] 1.4685ms 1.3095ms 763.6225 Ops/s 755.4277 Ops/s $\color{#35bf28}+1.08\%$
test_a2c_speed[True-backward] 2.9996ms 2.8353ms 352.6986 Ops/s 323.5464 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_a2c_speed[reduce-overhead-None] 15.7883ms 8.9072ms 112.2693 Ops/s 113.5493 Ops/s $\color{#d91a1a}-1.13\%$
test_a2c_speed[reduce-overhead-backward] 1.5593ms 1.4313ms 698.6736 Ops/s 621.4511 Ops/s $\textbf{\color{#35bf28}+12.43\%}$
test_ppo_speed[False-None] 3.7626ms 3.6567ms 273.4729 Ops/s 268.0679 Ops/s $\color{#35bf28}+2.02\%$
test_ppo_speed[False-backward] 7.2487ms 6.8279ms 146.4587 Ops/s 140.1764 Ops/s $\color{#35bf28}+4.48\%$
test_ppo_speed[True-None] 1.4218ms 1.3601ms 735.2358 Ops/s 716.5480 Ops/s $\color{#35bf28}+2.61\%$
test_ppo_speed[True-backward] 3.1414ms 2.9904ms 334.3980 Ops/s 311.6674 Ops/s $\textbf{\color{#35bf28}+7.29\%}$
test_ppo_speed[reduce-overhead-None] 1.0605ms 0.9218ms 1.0848 KOps/s 1.0671 KOps/s $\color{#35bf28}+1.66\%$
test_ppo_speed[reduce-overhead-backward] 1.4631ms 1.3647ms 732.7628 Ops/s 632.4409 Ops/s $\textbf{\color{#35bf28}+15.86\%}$
test_reinforce_speed[False-None] 2.4060ms 2.2440ms 445.6359 Ops/s 437.2970 Ops/s $\color{#35bf28}+1.91\%$
test_reinforce_speed[False-backward] 3.5013ms 3.2750ms 305.3411 Ops/s 293.7212 Ops/s $\color{#35bf28}+3.96\%$
test_reinforce_speed[True-None] 1.3384ms 1.2450ms 803.2102 Ops/s 768.5798 Ops/s $\color{#35bf28}+4.51\%$
test_reinforce_speed[True-backward] 3.0802ms 2.8704ms 348.3856 Ops/s 320.7107 Ops/s $\textbf{\color{#35bf28}+8.63\%}$
test_reinforce_speed[reduce-overhead-None] 18.3631ms 9.8260ms 101.7710 Ops/s 102.5958 Ops/s $\color{#d91a1a}-0.80\%$
test_reinforce_speed[reduce-overhead-backward] 1.4931ms 1.4349ms 696.9340 Ops/s 611.1398 Ops/s $\textbf{\color{#35bf28}+14.04\%}$
test_iql_speed[False-None] 9.4887ms 9.0575ms 110.4062 Ops/s 107.0844 Ops/s $\color{#35bf28}+3.10\%$
test_iql_speed[False-backward] 13.1255ms 12.6684ms 78.9367 Ops/s 74.5929 Ops/s $\textbf{\color{#35bf28}+5.82\%}$
test_iql_speed[True-None] 2.3098ms 2.1650ms 461.8986 Ops/s 445.0208 Ops/s $\color{#35bf28}+3.79\%$
test_iql_speed[True-backward] 4.6897ms 4.6422ms 215.4130 Ops/s 205.3374 Ops/s $\color{#35bf28}+4.91\%$
test_iql_speed[reduce-overhead-None] 18.9426ms 10.9921ms 90.9746 Ops/s 92.0203 Ops/s $\color{#d91a1a}-1.14\%$
test_iql_speed[reduce-overhead-backward] 1.9864ms 1.8551ms 539.0523 Ops/s 471.2263 Ops/s $\textbf{\color{#35bf28}+14.39\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.7939ms 6.1479ms 162.6578 Ops/s 161.7971 Ops/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4874ms 0.2877ms 3.4759 KOps/s 3.7467 KOps/s $\textbf{\color{#d91a1a}-7.23\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6563ms 0.2726ms 3.6683 KOps/s 3.4715 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1278ms 5.8426ms 171.1562 Ops/s 170.9057 Ops/s $\color{#35bf28}+0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7263ms 0.2557ms 3.9112 KOps/s 3.5886 KOps/s $\textbf{\color{#35bf28}+8.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5455ms 0.2640ms 3.7880 KOps/s 3.6448 KOps/s $\color{#35bf28}+3.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5948ms 1.2459ms 802.6225 Ops/s 783.4387 Ops/s $\color{#35bf28}+2.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7109ms 1.2582ms 794.8049 Ops/s 851.7470 Ops/s $\textbf{\color{#d91a1a}-6.69\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2866ms 6.0301ms 165.8336 Ops/s 164.8275 Ops/s $\color{#35bf28}+0.61\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0670ms 0.4700ms 2.1279 KOps/s 2.4761 KOps/s $\textbf{\color{#d91a1a}-14.06\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6064ms 0.3835ms 2.6076 KOps/s 2.5008 KOps/s $\color{#35bf28}+4.27\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0835ms 5.9080ms 169.2625 Ops/s 168.8265 Ops/s $\color{#35bf28}+0.26\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8320ms 0.2731ms 3.6621 KOps/s 3.4761 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5705ms 0.2391ms 4.1817 KOps/s 3.8683 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1601ms 5.8434ms 171.1346 Ops/s 170.8228 Ops/s $\color{#35bf28}+0.18\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.4166ms 0.2945ms 3.3952 KOps/s 3.7375 KOps/s $\textbf{\color{#d91a1a}-9.16\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4511ms 0.2374ms 4.2122 KOps/s 4.0845 KOps/s $\color{#35bf28}+3.13\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1822ms 5.9769ms 167.3107 Ops/s 163.9650 Ops/s $\color{#35bf28}+2.04\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1134ms 0.4002ms 2.4985 KOps/s 2.4179 KOps/s $\color{#35bf28}+3.33\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.5852ms 0.3799ms 2.6320 KOps/s 2.5792 KOps/s $\color{#35bf28}+2.05\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9357ms 5.3553ms 186.7312 Ops/s 180.3137 Ops/s $\color{#35bf28}+3.56\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.0148ms 1.9235ms 519.8859 Ops/s 439.2069 Ops/s $\textbf{\color{#35bf28}+18.37\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.0482ms 1.2534ms 797.8475 Ops/s 805.4720 Ops/s $\color{#d91a1a}-0.95\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.6257ms 5.4846ms 182.3276 Ops/s 181.7918 Ops/s $\color{#35bf28}+0.29\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.0516ms 1.9981ms 500.4660 Ops/s 443.1964 Ops/s $\textbf{\color{#35bf28}+12.92\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.5934ms 1.2672ms 789.1330 Ops/s 798.6153 Ops/s $\color{#d91a1a}-1.19\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4918s 15.3562ms 65.1202 Ops/s 31.8953 Ops/s $\textbf{\color{#35bf28}+104.17\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.9154ms 2.2161ms 451.2331 Ops/s 450.1347 Ops/s $\color{#35bf28}+0.24\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2319ms 1.3618ms 734.3269 Ops/s 721.8427 Ops/s $\color{#35bf28}+1.73\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4327ms 12.9799ms 77.0423 Ops/s 75.0682 Ops/s $\color{#35bf28}+2.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.2270ms 16.4833ms 60.6674 Ops/s 59.8183 Ops/s $\color{#35bf28}+1.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.1421ms 17.5773ms 56.8916 Ops/s 55.7404 Ops/s $\color{#35bf28}+2.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.5678ms 16.8330ms 59.4072 Ops/s 59.9706 Ops/s $\color{#d91a1a}-0.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7103ms 17.3851ms 57.5205 Ops/s 56.3673 Ops/s $\color{#35bf28}+2.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.3188ms 18.1316ms 55.1525 Ops/s 55.4509 Ops/s $\color{#d91a1a}-0.54\%$

@vmoens vmoens left a comment

LGTM thanks a mil!

kurtamohler added a commit to kurtamohler/torchrl that referenced this pull request Feb 6, 2025
ghstack-source-id: 9b5d08d
Pull Request resolved: pytorch#2757
[ghstack-poisoned]

@vmoens vmoens left a comment

LGTM thanks

@kurtamohler kurtamohler merged commit 631d908 into gh/kurtamohler/3/base Feb 6, 2025
71 of 75 checks passed
kurtamohler added a commit that referenced this pull request Feb 6, 2025
ghstack-source-id: 9567081
Pull Request resolved: #2757
@kurtamohler kurtamohler deleted the gh/kurtamohler/3/head branch February 6, 2025 17:37