@kurtamohler (Contributor) commented Feb 13, 2025

[ghstack-poisoned]
@pytorch-bot (bot) commented Feb 13, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2787

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit bd597d0 with merge base f1c42e0:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

kurtamohler added a commit that referenced this pull request Feb 13, 2025
facebook-github-bot added the `CLA Signed` label Feb 13, 2025
kurtamohler requested a review from vmoens February 13, 2025 02:28
@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}7$.

Expand to view detailed results

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.5830s | 0.4933s | 2.0271 Ops/s | 1.9677 Ops/s | $\color{#35bf28}+3.02\%$ |
| test_transformed | 1.0421s | 0.9502s | 1.0524 Ops/s | 1.0129 Ops/s | $\color{#35bf28}+3.90\%$ |
| test_serial | 1.5943s | 1.4964s | 0.6683 Ops/s | 0.6549 Ops/s | $\color{#35bf28}+2.04\%$ |
| test_parallel | 1.3493s | 1.2874s | 0.7767 Ops/s | 0.7711 Ops/s | $\color{#35bf28}+0.73\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1865ms | 30.5039μs | 32.7827 KOps/s | 33.1699 KOps/s | $\color{#d91a1a}-1.17\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 47.2280μs | 18.1280μs | 55.1633 KOps/s | 54.9358 KOps/s | $\color{#35bf28}+0.41\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 62.8880μs | 17.3439μs | 57.6573 KOps/s | 57.3854 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 39.5140μs | 10.0761μs | 99.2446 KOps/s | 98.5768 KOps/s | $\color{#35bf28}+0.68\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 78.2660μs | 33.0178μs | 30.2867 KOps/s | 30.4452 KOps/s | $\color{#d91a1a}-0.52\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 73.5290μs | 20.1170μs | 49.7092 KOps/s | 50.1132 KOps/s | $\color{#d91a1a}-0.81\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 48.4210μs | 19.3618μs | 51.6480 KOps/s | 52.5084 KOps/s | $\color{#d91a1a}-1.64\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 59.0000μs | 12.1050μs | 82.6103 KOps/s | 82.7259 KOps/s | $\color{#d91a1a}-0.14\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 77.3660μs | 34.6866μs | 28.8296 KOps/s | 29.4618 KOps/s | $\color{#d91a1a}-2.15\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 76.2630μs | 21.8638μs | 45.7377 KOps/s | 45.4935 KOps/s | $\color{#35bf28}+0.54\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 87.8950μs | 19.3506μs | 51.6780 KOps/s | 51.5250 KOps/s | $\color{#35bf28}+0.30\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 33.9840μs | 12.0272μs | 83.1446 KOps/s | 83.3799 KOps/s | $\color{#d91a1a}-0.28\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 79.2190μs | 36.1722μs | 27.6455 KOps/s | 27.7140 KOps/s | $\color{#d91a1a}-0.25\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 64.6240μs | 23.7047μs | 42.1857 KOps/s | 42.1650 KOps/s | $\color{#35bf28}+0.05\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 68.7160μs | 20.9210μs | 47.7989 KOps/s | 47.5330 KOps/s | $\color{#35bf28}+0.56\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 66.2740μs | 13.8346μs | 72.2823 KOps/s | 72.4841 KOps/s | $\color{#d91a1a}-0.28\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 74.6100μs | 34.7262μs | 28.7967 KOps/s | 28.9833 KOps/s | $\color{#d91a1a}-0.64\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 73.9880μs | 21.8032μs | 45.8649 KOps/s | 45.6901 KOps/s | $\color{#35bf28}+0.38\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 46.0160μs | 21.9866μs | 45.4823 KOps/s | 45.1227 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 67.6770μs | 13.3374μs | 74.9773 KOps/s | 74.7284 KOps/s | $\color{#35bf28}+0.33\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 0.1126ms | 36.6290μs | 27.3008 KOps/s | 27.6648 KOps/s | $\color{#d91a1a}-1.32\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 72.7270μs | 23.8092μs | 42.0006 KOps/s | 42.4820 KOps/s | $\color{#d91a1a}-1.13\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 2.5816ms | 23.9063μs | 41.8300 KOps/s | 41.6370 KOps/s | $\color{#35bf28}+0.46\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 88.7030μs | 15.1655μs | 65.9390 KOps/s | 65.2537 KOps/s | $\color{#35bf28}+1.05\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 96.8710μs | 37.9491μs | 26.3511 KOps/s | 26.3435 KOps/s | $\color{#35bf28}+0.03\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 72.8770μs | 25.6041μs | 39.0562 KOps/s | 39.1205 KOps/s | $\color{#d91a1a}-0.16\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 61.3950μs | 23.6859μs | 42.2193 KOps/s | 41.3174 KOps/s | $\color{#35bf28}+2.18\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 59.8330μs | 15.3147μs | 65.2969 KOps/s | 64.1079 KOps/s | $\color{#35bf28}+1.85\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 91.0200μs | 39.4477μs | 25.3500 KOps/s | 25.4782 KOps/s | $\color{#d91a1a}-0.50\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 66.6050μs | 27.2641μs | 36.6783 KOps/s | 36.6280 KOps/s | $\color{#35bf28}+0.14\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 78.8170μs | 25.8232μs | 38.7248 KOps/s | 39.6351 KOps/s | $\color{#d91a1a}-2.30\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 52.3780μs | 17.1939μs | 58.1601 KOps/s | 58.7455 KOps/s | $\color{#d91a1a}-1.00\%$ |
| test_values[generalized_advantage_estimate-True-True] | 9.9741ms | 9.7234ms | 102.8448 Ops/s | 104.9289 Ops/s | $\color{#d91a1a}-1.99\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 28.8837ms | 26.2231ms | 38.1343 Ops/s | 40.8656 Ops/s | $\textbf{\color{#d91a1a}-6.68\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2343ms | 0.1763ms | 5.6726 KOps/s | 5.6068 KOps/s | $\color{#35bf28}+1.17\%$ |
| test_values[td1_return_estimate-False-False] | 24.3977ms | 24.0033ms | 41.6609 Ops/s | 42.0702 Ops/s | $\color{#d91a1a}-0.97\%$ |
| test_values[vec_td1_return_estimate-False-False] | 28.6959ms | 26.3733ms | 37.9171 Ops/s | 40.6988 Ops/s | $\textbf{\color{#d91a1a}-6.83\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 36.2957ms | 34.7133ms | 28.8074 Ops/s | 29.2809 Ops/s | $\color{#d91a1a}-1.62\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 29.0364ms | 26.1654ms | 38.2184 Ops/s | 40.8474 Ops/s | $\textbf{\color{#d91a1a}-6.44\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 12.5913ms | 8.4730ms | 118.0225 Ops/s | 116.6574 Ops/s | $\color{#35bf28}+1.17\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3303ms | 1.9545ms | 511.6414 Ops/s | 548.1438 Ops/s | $\textbf{\color{#d91a1a}-6.66\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4499ms | 0.3723ms | 2.6860 KOps/s | 2.6632 KOps/s | $\color{#35bf28}+0.86\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 49.5158ms | 46.5824ms | 21.4673 Ops/s | 22.1790 Ops/s | $\color{#d91a1a}-3.21\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.6662ms | 3.4262ms | 291.8675 Ops/s | 290.9526 Ops/s | $\color{#35bf28}+0.31\%$ |
| test_dqn_speed[False-None] | 6.3265ms | 1.4233ms | 702.5906 Ops/s | 692.7237 Ops/s | $\color{#35bf28}+1.42\%$ |
| test_dqn_speed[False-backward] | 2.0303ms | 1.9055ms | 524.7917 Ops/s | 515.6789 Ops/s | $\color{#35bf28}+1.77\%$ |
| test_dqn_speed[True-None] | 0.7154ms | 0.4870ms | 2.0532 KOps/s | 2.0164 KOps/s | $\color{#35bf28}+1.82\%$ |
| test_dqn_speed[True-backward] | 0.9808ms | 0.9105ms | 1.0984 KOps/s | 1.0886 KOps/s | $\color{#35bf28}+0.89\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.6957ms | 0.4885ms | 2.0469 KOps/s | 2.0375 KOps/s | $\color{#35bf28}+0.46\%$ |
| test_dqn_speed[reduce-overhead-backward] | 0.9758ms | 0.9112ms | 1.0974 KOps/s | 1.0769 KOps/s | $\color{#35bf28}+1.91\%$ |
| test_ddpg_speed[False-None] | 4.0770ms | 2.9015ms | 344.6479 Ops/s | 335.1005 Ops/s | $\color{#35bf28}+2.85\%$ |
| test_ddpg_speed[False-backward] | 4.2912ms | 4.0727ms | 245.5378 Ops/s | 243.2720 Ops/s | $\color{#35bf28}+0.93\%$ |
| test_ddpg_speed[True-None] | 1.3353ms | 1.2285ms | 814.0000 Ops/s | 807.3020 Ops/s | $\color{#35bf28}+0.83\%$ |
| test_ddpg_speed[True-backward] | 3.2088ms | 2.2009ms | 454.3495 Ops/s | 468.2265 Ops/s | $\color{#d91a1a}-2.96\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.8979ms | 1.2408ms | 805.9259 Ops/s | 802.2679 Ops/s | $\color{#35bf28}+0.46\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.2250ms | 2.1489ms | 465.3568 Ops/s | 469.5343 Ops/s | $\color{#d91a1a}-0.89\%$ |
| test_sac_speed[False-None] | 8.6041ms | 8.1186ms | 123.1744 Ops/s | 119.7769 Ops/s | $\color{#35bf28}+2.84\%$ |
| test_sac_speed[False-backward] | 11.6512ms | 11.0076ms | 90.8461 Ops/s | 89.7883 Ops/s | $\color{#35bf28}+1.18\%$ |
| test_sac_speed[True-None] | 2.3559ms | 2.1027ms | 475.5706 Ops/s | 472.6554 Ops/s | $\color{#35bf28}+0.62\%$ |
| test_sac_speed[True-backward] | 3.8814ms | 3.7937ms | 263.5923 Ops/s | 261.6225 Ops/s | $\color{#35bf28}+0.75\%$ |
| test_sac_speed[reduce-overhead-None] | 2.9969ms | 2.1011ms | 475.9413 Ops/s | 468.7190 Ops/s | $\color{#35bf28}+1.54\%$ |
| test_sac_speed[reduce-overhead-backward] | 4.0301ms | 3.8102ms | 262.4531 Ops/s | 260.2801 Ops/s | $\color{#35bf28}+0.83\%$ |
| test_redq_speed[False-None] | 19.6075ms | 13.8073ms | 72.4253 Ops/s | 74.6633 Ops/s | $\color{#d91a1a}-3.00\%$ |
| test_redq_speed[False-backward] | 32.3607ms | 23.2257ms | 43.0557 Ops/s | 44.1080 Ops/s | $\color{#d91a1a}-2.39\%$ |
| test_redq_speed[True-None] | 5.7123ms | 5.0327ms | 198.7025 Ops/s | 200.6347 Ops/s | $\color{#d91a1a}-0.96\%$ |
| test_redq_speed[True-backward] | 13.3326ms | 12.8689ms | 77.7069 Ops/s | 78.9524 Ops/s | $\color{#d91a1a}-1.58\%$ |
| test_redq_speed[reduce-overhead-None] | 6.1006ms | 5.1047ms | 195.8979 Ops/s | 199.2845 Ops/s | $\color{#d91a1a}-1.70\%$ |
| test_redq_speed[reduce-overhead-backward] | 14.7151ms | 13.0780ms | 76.4640 Ops/s | 77.0851 Ops/s | $\color{#d91a1a}-0.81\%$ |
| test_redq_deprec_speed[False-None] | 14.3858ms | 13.2477ms | 75.4849 Ops/s | 75.8682 Ops/s | $\color{#d91a1a}-0.51\%$ |
| test_redq_deprec_speed[False-backward] | 20.7333ms | 19.2731ms | 51.8857 Ops/s | 50.9165 Ops/s | $\color{#35bf28}+1.90\%$ |
| test_redq_deprec_speed[True-None] | 4.4325ms | 3.9216ms | 255.0009 Ops/s | 253.0756 Ops/s | $\color{#35bf28}+0.76\%$ |
| test_redq_deprec_speed[True-backward] | 9.0387ms | 8.5151ms | 117.4390 Ops/s | 118.7915 Ops/s | $\color{#d91a1a}-1.14\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 4.4568ms | 3.9690ms | 251.9497 Ops/s | 257.2719 Ops/s | $\color{#d91a1a}-2.07\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.7216ms | 9.1755ms | 108.9864 Ops/s | 116.6819 Ops/s | $\textbf{\color{#d91a1a}-6.60\%}$ |
| test_td3_speed[False-None] | 8.8722ms | 8.2203ms | 121.6503 Ops/s | 121.0619 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_td3_speed[False-backward] | 12.5773ms | 10.9647ms | 91.2014 Ops/s | 93.3920 Ops/s | $\color{#d91a1a}-2.35\%$ |
| test_td3_speed[True-None] | 2.0397ms | 1.8404ms | 543.3698 Ops/s | 544.0160 Ops/s | $\color{#d91a1a}-0.12\%$ |
| test_td3_speed[True-backward] | 4.3853ms | 3.4898ms | 286.5514 Ops/s | 283.1948 Ops/s | $\color{#35bf28}+1.19\%$ |
| test_td3_speed[reduce-overhead-None] | 1.9210ms | 1.8356ms | 544.7926 Ops/s | 534.9060 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.4692ms | 3.4305ms | 291.5032 Ops/s | 280.8748 Ops/s | $\color{#35bf28}+3.78\%$ |
| test_cql_speed[False-None] | 39.3663ms | 36.8108ms | 27.1660 Ops/s | 26.1581 Ops/s | $\color{#35bf28}+3.85\%$ |
| test_cql_speed[False-backward] | 50.9755ms | 47.5473ms | 21.0317 Ops/s | 20.0433 Ops/s | $\color{#35bf28}+4.93\%$ |
| test_cql_speed[True-None] | 17.6268ms | 16.2345ms | 61.5974 Ops/s | 60.4811 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_cql_speed[True-backward] | 24.4978ms | 23.3365ms | 42.8513 Ops/s | 42.7377 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_cql_speed[reduce-overhead-None] | 18.2119ms | 16.4360ms | 60.8420 Ops/s | 61.0173 Ops/s | $\color{#d91a1a}-0.29\%$ |
| test_cql_speed[reduce-overhead-backward] | 25.5934ms | 23.3644ms | 42.8002 Ops/s | 43.0560 Ops/s | $\color{#d91a1a}-0.59\%$ |
| test_a2c_speed[False-None] | 8.8462ms | 7.3735ms | 135.6217 Ops/s | 136.7484 Ops/s | $\color{#d91a1a}-0.82\%$ |
| test_a2c_speed[False-backward] | 18.0473ms | 14.5802ms | 68.5862 Ops/s | 67.6589 Ops/s | $\color{#35bf28}+1.37\%$ |
| test_a2c_speed[True-None] | 4.1593ms | 3.7394ms | 267.4197 Ops/s | 266.1994 Ops/s | $\color{#35bf28}+0.46\%$ |
| test_a2c_speed[True-backward] | 12.0155ms | 10.4097ms | 96.0641 Ops/s | 97.5654 Ops/s | $\color{#d91a1a}-1.54\%$ |
| test_a2c_speed[reduce-overhead-None] | 6.7768ms | 3.8433ms | 260.1927 Ops/s | 264.6656 Ops/s | $\color{#d91a1a}-1.69\%$ |
| test_a2c_speed[reduce-overhead-backward] | 10.8105ms | 10.3553ms | 96.5689 Ops/s | 95.7858 Ops/s | $\color{#35bf28}+0.82\%$ |
| test_ppo_speed[False-None] | 8.6158ms | 7.5868ms | 131.8084 Ops/s | 130.0210 Ops/s | $\color{#35bf28}+1.37\%$ |
| test_ppo_speed[False-backward] | 18.2752ms | 15.2574ms | 65.5421 Ops/s | 66.4730 Ops/s | $\color{#d91a1a}-1.40\%$ |
| test_ppo_speed[True-None] | 4.6900ms | 4.1541ms | 240.7252 Ops/s | 241.4837 Ops/s | $\color{#d91a1a}-0.31\%$ |
| test_ppo_speed[True-backward] | 11.8424ms | 10.1764ms | 98.2667 Ops/s | 93.4624 Ops/s | $\textbf{\color{#35bf28}+5.14\%}$ |
| test_ppo_speed[reduce-overhead-None] | 5.7776ms | 4.2045ms | 237.8384 Ops/s | 240.1751 Ops/s | $\color{#d91a1a}-0.97\%$ |
| test_ppo_speed[reduce-overhead-backward] | 10.5493ms | 10.0210ms | 99.7904 Ops/s | 98.9369 Ops/s | $\color{#35bf28}+0.86\%$ |
| test_reinforce_speed[False-None] | 7.9309ms | 6.6366ms | 150.6794 Ops/s | 150.3583 Ops/s | $\color{#35bf28}+0.21\%$ |
| test_reinforce_speed[False-backward] | 11.2740ms | 10.0766ms | 99.2401 Ops/s | 100.2697 Ops/s | $\color{#d91a1a}-1.03\%$ |
| test_reinforce_speed[True-None] | 3.7698ms | 3.0984ms | 322.7439 Ops/s | 318.0265 Ops/s | $\color{#35bf28}+1.48\%$ |
| test_reinforce_speed[True-backward] | 11.1703ms | 9.1482ms | 109.3113 Ops/s | 110.8410 Ops/s | $\color{#d91a1a}-1.38\%$ |
| test_reinforce_speed[reduce-overhead-None] | 3.7630ms | 3.1025ms | 322.3239 Ops/s | 318.6102 Ops/s | $\color{#35bf28}+1.17\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 11.1202ms | 9.2785ms | 107.7760 Ops/s | 109.5042 Ops/s | $\color{#d91a1a}-1.58\%$ |
| test_iql_speed[False-None] | 34.6457ms | 32.8542ms | 30.4375 Ops/s | 29.3582 Ops/s | $\color{#35bf28}+3.68\%$ |
| test_iql_speed[False-backward] | 48.5851ms | 46.4745ms | 21.5172 Ops/s | 21.4470 Ops/s | $\color{#35bf28}+0.33\%$ |
| test_iql_speed[True-None] | 12.4411ms | 11.5120ms | 86.8662 Ops/s | 85.0703 Ops/s | $\color{#35bf28}+2.11\%$ |
| test_iql_speed[True-backward] | 24.3134ms | 22.6469ms | 44.1562 Ops/s | 44.3001 Ops/s | $\color{#d91a1a}-0.32\%$ |
| test_iql_speed[reduce-overhead-None] | 13.9896ms | 11.9488ms | 83.6907 Ops/s | 85.9196 Ops/s | $\color{#d91a1a}-2.59\%$ |
| test_iql_speed[reduce-overhead-backward] | 25.3393ms | 22.9522ms | 43.5689 Ops/s | 41.9399 Ops/s | $\color{#35bf28}+3.88\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.5995ms | 4.9845ms | 200.6229 Ops/s | 200.7860 Ops/s | $\color{#d91a1a}-0.08\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7862ms | 0.5142ms | 1.9446 KOps/s | 1.8968 KOps/s | $\color{#35bf28}+2.52\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9524ms | 0.4942ms | 2.0235 KOps/s | 1.9918 KOps/s | $\color{#35bf28}+1.59\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1322ms | 4.7667ms | 209.7900 Ops/s | 202.5686 Ops/s | $\color{#35bf28}+3.56\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.3225ms | 0.5154ms | 1.9404 KOps/s | 1.9234 KOps/s | $\color{#35bf28}+0.88\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7788ms | 0.4874ms | 2.0517 KOps/s | 2.0289 KOps/s | $\color{#35bf28}+1.12\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.5980ms | 1.6769ms | 596.3229 Ops/s | 593.8505 Ops/s | $\color{#35bf28}+0.42\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.3231ms | 1.5873ms | 629.9999 Ops/s | 628.2956 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.2236ms | 4.8793ms | 204.9457 Ops/s | 204.4006 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.5154ms | 0.6583ms | 1.5192 KOps/s | 1.5071 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9078ms | 0.6282ms | 1.5918 KOps/s | 1.5790 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.9325ms | 4.7142ms | 212.1256 Ops/s | 211.6838 Ops/s | $\color{#35bf28}+0.21\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5936s | 1.3108ms | 762.8812 Ops/s | 1.9041 KOps/s | $\textbf{\color{#d91a1a}-59.94\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8294ms | 0.4987ms | 2.0053 KOps/s | 2.0281 KOps/s | $\color{#d91a1a}-1.13\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 7.6129ms | 4.8093ms | 207.9300 Ops/s | 207.0911 Ops/s | $\color{#35bf28}+0.41\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.2417ms | 0.5118ms | 1.9539 KOps/s | 403.8335 Ops/s | $\textbf{\color{#35bf28}+383.85\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8271ms | 0.4921ms | 2.0321 KOps/s | 2.0267 KOps/s | $\color{#35bf28}+0.26\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.3356ms | 5.0547ms | 197.8352 Ops/s | 194.7442 Ops/s | $\color{#35bf28}+1.59\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 4.0075ms | 0.6844ms | 1.4612 KOps/s | 1.4809 KOps/s | $\color{#d91a1a}-1.33\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8970ms | 0.6464ms | 1.5470 KOps/s | 1.5701 KOps/s | $\color{#d91a1a}-1.47\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 11.2274ms | 4.5021ms | 222.1182 Ops/s | 225.5556 Ops/s | $\color{#d91a1a}-1.52\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 7.6154ms | 2.2790ms | 438.7873 Ops/s | 375.9610 Ops/s | $\textbf{\color{#35bf28}+16.71\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 6.1008ms | 1.3985ms | 715.0510 Ops/s | 747.5220 Ops/s | $\color{#d91a1a}-4.34\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.4686s | 13.7192ms | 72.8905 Ops/s | 225.2960 Ops/s | $\textbf{\color{#d91a1a}-67.65\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 7.8518ms | 2.4806ms | 403.1319 Ops/s | 407.7027 Ops/s | $\color{#d91a1a}-1.12\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.4673ms | 1.3024ms | 767.7915 Ops/s | 764.4286 Ops/s | $\color{#35bf28}+0.44\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 5.9747ms | 4.5235ms | 221.0694 Ops/s | 215.0930 Ops/s | $\color{#35bf28}+2.78\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.0061ms | 2.5344ms | 394.5766 Ops/s | 348.1160 Ops/s | $\textbf{\color{#35bf28}+13.35\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.7594ms | 1.4486ms | 690.3306 Ops/s | 480.7767 Ops/s | $\textbf{\color{#35bf28}+43.59\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 14.8795ms | 12.3985ms | 80.6547 Ops/s | 80.9854 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 15.7152ms | 14.6464ms | 68.2759 Ops/s | 68.7574 Ops/s | $\color{#d91a1a}-0.70\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 22.2867ms | 20.8984ms | 47.8505 Ops/s | 46.9268 Ops/s | $\color{#35bf28}+1.97\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 16.9183ms | 14.7401ms | 67.8420 Ops/s | 68.5526 Ops/s | $\color{#d91a1a}-1.04\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 21.8966ms | 20.8922ms | 47.8648 Ops/s | 47.2884 Ops/s | $\color{#35bf28}+1.22\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 17.4914ms | 16.0229ms | 62.4107 Ops/s | 61.6849 Ops/s | $\color{#35bf28}+1.18\%$ |
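
The Change column appears to be the relative difference between the throughput measured on this PR (Ops) and the throughput on repo HEAD. The sketch below recomputes it from the printed values. It assumes that formula, and it assumes a ±5% cutoff for the bolded rows that the summary counts as Improved or Worsened; the cutoff is inferred from the CPU table (5 bold gains, 7 bold losses), not stated by the bot. The helper names `parse_ops`, `relative_change`, and `classify` are hypothetical.

```python
import re

# Multipliers for the two throughput units that appear in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a cell such as '1.9539 KOps/s' into operations per second."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(K?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(pr_ops: str, head_ops: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    pr, head = parse_ops(pr_ops), parse_ops(head_ops)
    return (pr - head) / head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bold rows."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # test_simple row of the CPU table.
    print(f"{relative_change('2.0271 Ops/s', '1.9677 Ops/s'):+.2f}%")  # +3.02%
    # A row that mixes units: 762.8812 Ops/s on the PR vs 1.9041 KOps/s on HEAD.
    print(classify(relative_change("762.8812 Ops/s", "1.9041 KOps/s")))  # worsened
```

Because the table prints rounded throughputs, a recomputed percentage can differ from the reported one in the last digit. Rows where Max is far above Mean, such as the `LazyMemmapStorage` `test_rb_iterate` entries, may reflect a single slow iteration rather than a real change.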

@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}11$.

Expand to view detailed results

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.8589s | 0.7739s | 1.2922 Ops/s | 1.2736 Ops/s | $\color{#35bf28}+1.46\%$ |
| test_transformed | 1.4223s | 1.3375s | 0.7477 Ops/s | 0.7272 Ops/s | $\color{#35bf28}+2.81\%$ |
| test_serial | 2.3132s | 2.2257s | 0.4493 Ops/s | 0.4429 Ops/s | $\color{#35bf28}+1.44\%$ |
| test_parallel | 1.9336s | 1.8336s | 0.5454 Ops/s | 0.5496 Ops/s | $\color{#d91a1a}-0.78\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1909ms | 38.9196μs | 25.6940 KOps/s | 25.3082 KOps/s | $\color{#35bf28}+1.52\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 0.1286ms | 22.6752μs | 44.1011 KOps/s | 44.2237 KOps/s | $\color{#d91a1a}-0.28\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 0.1069ms | 21.6780μs | 46.1297 KOps/s | 45.9984 KOps/s | $\color{#35bf28}+0.29\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 0.1549ms | 12.5405μs | 79.7415 KOps/s | 80.2522 KOps/s | $\color{#d91a1a}-0.64\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 81.8310μs | 41.4154μs | 24.1456 KOps/s | 24.5204 KOps/s | $\color{#d91a1a}-1.53\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 0.1003ms | 24.9328μs | 40.1078 KOps/s | 41.4823 KOps/s | $\color{#d91a1a}-3.31\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 61.8810μs | 24.1024μs | 41.4897 KOps/s | 41.7535 KOps/s | $\color{#d91a1a}-0.63\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 96.3820μs | 14.8907μs | 67.1559 KOps/s | 67.8690 KOps/s | $\color{#d91a1a}-1.05\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 0.1400ms | 43.6603μs | 22.9041 KOps/s | 23.7099 KOps/s | $\color{#d91a1a}-3.40\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 0.1043ms | 26.7277μs | 37.4144 KOps/s | 37.0090 KOps/s | $\color{#35bf28}+1.10\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 55.9110μs | 23.6250μs | 42.3281 KOps/s | 42.1184 KOps/s | $\color{#35bf28}+0.50\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 46.6410μs | 14.7163μs | 67.9520 KOps/s | 68.0914 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 73.9120μs | 45.4983μs | 21.9788 KOps/s | 21.8044 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 60.6310μs | 29.2266μs | 34.2154 KOps/s | 34.6301 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 58.0010μs | 25.8183μs | 38.7323 KOps/s | 40.3684 KOps/s | $\color{#d91a1a}-4.05\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 0.1430ms | 16.8816μs | 59.2361 KOps/s | 59.8438 KOps/s | $\color{#d91a1a}-1.02\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 0.2409ms | 43.0624μs | 23.2221 KOps/s | 23.2071 KOps/s | $\color{#35bf28}+0.06\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 0.2002ms | 26.6465μs | 37.5283 KOps/s | 37.6622 KOps/s | $\color{#d91a1a}-0.36\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 0.2210ms | 27.3708μs | 36.5353 KOps/s | 37.3572 KOps/s | $\color{#d91a1a}-2.20\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 0.1936ms | 16.5106μs | 60.5672 KOps/s | 62.9297 KOps/s | $\color{#d91a1a}-3.75\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 0.2405ms | 45.0992μs | 22.1733 KOps/s | 22.4083 KOps/s | $\color{#d91a1a}-1.05\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 0.2318ms | 28.8159μs | 34.7031 KOps/s | 34.8209 KOps/s | $\color{#d91a1a}-0.34\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.5427ms | 30.4874μs | 32.8004 KOps/s | 34.0188 KOps/s | $\color{#d91a1a}-3.58\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 89.7820μs | 18.8720μs | 52.9886 KOps/s | 54.7058 KOps/s | $\color{#d91a1a}-3.14\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 84.9220μs | 48.1443μs | 20.7709 KOps/s | 21.1995 KOps/s | $\color{#d91a1a}-2.02\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 59.7810μs | 31.7463μs | 31.4997 KOps/s | 32.4965 KOps/s | $\color{#d91a1a}-3.07\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 70.8010μs | 29.6409μs | 33.7371 KOps/s | 33.9443 KOps/s | $\color{#d91a1a}-0.61\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 43.1110μs | 18.5811μs | 53.8181 KOps/s | 53.4826 KOps/s | $\color{#35bf28}+0.63\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 88.5420μs | 49.5671μs | 20.1747 KOps/s | 20.1829 KOps/s | $\color{#d91a1a}-0.04\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 64.7110μs | 33.6426μs | 29.7242 KOps/s | 29.9389 KOps/s | $\color{#d91a1a}-0.72\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 54.7610μs | 31.5874μs | 31.6582 KOps/s | 31.6155 KOps/s | $\color{#35bf28}+0.14\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 47.2810μs | 20.6553μs | 48.4137 KOps/s | 49.0034 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_values[generalized_advantage_estimate-True-True] | 25.0663ms | 24.3786ms | 41.0197 Ops/s | 41.1957 Ops/s | $\color{#d91a1a}-0.43\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 99.6646ms | 2.8885ms | 346.2024 Ops/s | 321.9734 Ops/s | $\textbf{\color{#35bf28}+7.53\%}$ |
| test_values[td0_return_estimate-False-False] | 0.1038ms | 77.9961μs | 12.8212 KOps/s | 12.1522 KOps/s | $\textbf{\color{#35bf28}+5.51\%}$ |
| test_values[td1_return_estimate-False-False] | 54.8520ms | 54.3044ms | 18.4147 Ops/s | 18.4079 Ops/s | $\color{#35bf28}+0.04\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.3523ms | 1.0830ms | 923.3879 Ops/s | 928.5537 Ops/s | $\color{#d91a1a}-0.56\%$ |
| test_values[td_lambda_return_estimate-True-False] | 89.0838ms | 86.4769ms | 11.5638 Ops/s | 11.5286 Ops/s | $\color{#35bf28}+0.31\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3822ms | 1.0738ms | 931.2964 Ops/s | 931.7031 Ops/s | $\color{#d91a1a}-0.04\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.6525ms | 24.3867ms | 41.0060 Ops/s | 41.1765 Ops/s | $\color{#d91a1a}-0.41\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0494ms | 0.7412ms | 1.3491 KOps/s | 1.3401 KOps/s | $\color{#35bf28}+0.68\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7970ms | 0.6607ms | 1.5135 KOps/s | 1.5135 KOps/s | $-0.00\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6314ms | 1.4752ms | 677.8617 Ops/s | 676.4916 Ops/s | $\color{#35bf28}+0.20\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8265ms | 0.6769ms | 1.4774 KOps/s | 1.4769 KOps/s | $\color{#35bf28}+0.03\%$ |
| test_dqn_speed[False-None] | 7.1428ms | 1.5055ms | 664.2347 Ops/s | 657.4170 Ops/s | $\color{#35bf28}+1.04\%$ |
| test_dqn_speed[False-backward] | 2.2781ms | 2.1173ms | 472.3012 Ops/s | 469.7864 Ops/s | $\color{#35bf28}+0.54\%$ |
| test_dqn_speed[True-None] | 0.7352ms | 0.5378ms | 1.8593 KOps/s | 1.8333 KOps/s | $\color{#35bf28}+1.42\%$ |
| test_dqn_speed[True-backward] | 1.2776ms | 1.1890ms | 841.0359 Ops/s | 809.9163 Ops/s | $\color{#35bf28}+3.84\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.7393ms | 0.5589ms | 1.7892 KOps/s | 1.7383 KOps/s | $\color{#35bf28}+2.93\%$ |
| test_dqn_speed[reduce-overhead-backward] | 1.1879ms | 1.0488ms | 953.4256 Ops/s | 914.1135 Ops/s | $\color{#35bf28}+4.30\%$ |
| test_ddpg_speed[False-None] | 3.2002ms | 2.8388ms | 352.2672 Ops/s | 335.1619 Ops/s | $\textbf{\color{#35bf28}+5.10\%}$ |
| test_ddpg_speed[False-backward] | 4.7228ms | 4.2112ms | 237.4633 Ops/s | 229.3555 Ops/s | $\color{#35bf28}+3.54\%$ |
| test_ddpg_speed[True-None] | 1.4911ms | 1.3156ms | 760.0953 Ops/s | 752.6307 Ops/s | $\color{#35bf28}+0.99\%$ |
| test_ddpg_speed[True-backward] | 2.6688ms | 2.5147ms | 397.6612 Ops/s | 391.9904 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.5014ms | 1.3250ms | 754.7290 Ops/s | 755.1085 Ops/s | $\color{#d91a1a}-0.05\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.1231ms | 2.0001ms | 499.9838 Ops/s | 495.0104 Ops/s | $\color{#35bf28}+1.00\%$ |
| test_sac_speed[False-None] | 8.2878ms | 7.9226ms | 126.2213 Ops/s | 123.6137 Ops/s | $\color{#35bf28}+2.11\%$ |
| test_sac_speed[False-backward] | 11.6468ms | 11.0893ms | 90.1774 Ops/s | 88.8182 Ops/s | $\color{#35bf28}+1.53\%$ |
| test_sac_speed[True-None] | 1.9982ms | 1.8114ms | 552.0738 Ops/s | 547.0552 Ops/s | $\color{#35bf28}+0.92\%$ |
| test_sac_speed[True-backward] | 3.9163ms | 3.6910ms | 270.9269 Ops/s | 267.8515 Ops/s | $\color{#35bf28}+1.15\%$ |
| test_sac_speed[reduce-overhead-None] | 17.6891ms | 10.6787ms | 93.6443 Ops/s | 93.0745 Ops/s | $\color{#35bf28}+0.61\%$ |
| test_sac_speed[reduce-overhead-backward] | 1.9246ms | 1.7588ms | 568.5734 Ops/s | 603.0737 Ops/s | $\textbf{\color{#d91a1a}-5.72\%}$ |
| test_redq_speed[False-None] | 7.7872ms | 7.3886ms | 135.3430 Ops/s | 131.7809 Ops/s | $\color{#35bf28}+2.70\%$ |
| test_redq_speed[False-backward] | 12.0442ms | 11.5512ms | 86.5709 Ops/s | 86.4348 Ops/s | $\color{#35bf28}+0.16\%$ |
| test_redq_speed[True-None] | 2.4933ms | 2.2733ms | 439.8884 Ops/s | 434.7323 Ops/s | $\color{#35bf28}+1.19\%$ |
| test_redq_speed[True-backward] | 4.5428ms | 4.1166ms | 242.9181 Ops/s | 238.3505 Ops/s | $\color{#35bf28}+1.92\%$ |
| test_redq_speed[reduce-overhead-None] | 2.4891ms | 2.2929ms | 436.1303 Ops/s | 427.8396 Ops/s | $\color{#35bf28}+1.94\%$ |
| test_redq_speed[reduce-overhead-backward] | 4.5743ms | 4.1418ms | 241.4430 Ops/s | 241.5880 Ops/s | $\color{#d91a1a}-0.06\%$ |
| test_redq_deprec_speed[False-None] | 9.4687ms | 8.9341ms | 111.9310 Ops/s | 107.2102 Ops/s | $\color{#35bf28}+4.40\%$ |
| test_redq_deprec_speed[False-backward] | 12.7226ms | 12.1835ms | 82.0782 Ops/s | 81.1038 Ops/s | $\color{#35bf28}+1.20\%$ |
| test_redq_deprec_speed[True-None] | 2.7599ms | 2.5876ms | 386.4631 Ops/s | 377.6766 Ops/s | $\color{#35bf28}+2.33\%$ |
| test_redq_deprec_speed[True-backward] | 4.8230ms | 4.4067ms | 226.9285 Ops/s | 223.2311 Ops/s | $\color{#35bf28}+1.66\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 2.8548ms | 2.6081ms | 383.4241 Ops/s | 367.8417 Ops/s | $\color{#35bf28}+4.24\%$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 4.5969ms | 4.3831ms | 228.1504 Ops/s | 229.7685 Ops/s | $\color{#d91a1a}-0.70\%$ |
| test_td3_speed[False-None] | 7.8963ms | 7.8643ms | 127.1573 Ops/s | 124.8352 Ops/s | $\color{#35bf28}+1.86\%$ |
| test_td3_speed[False-backward] | 10.9569ms | 10.4428ms | 95.7597 Ops/s | 96.6993 Ops/s | $\color{#d91a1a}-0.97\%$ |
| test_td3_speed[True-None] | 1.7334ms | 1.6510ms | 605.6931 Ops/s | 581.6728 Ops/s | $\color{#35bf28}+4.13\%$ |
| test_td3_speed[True-backward] | 3.4073ms | 3.2732ms | 305.5076 Ops/s | 296.0147 Ops/s | $\color{#35bf28}+3.21\%$ |
| test_td3_speed[reduce-overhead-None] | 68.0986ms | 25.8198ms | 38.7299 Ops/s | 38.8227 Ops/s | $\color{#d91a1a}-0.24\%$ |
| test_td3_speed[reduce-overhead-backward] | 1.5102ms | 1.4698ms | 680.3729 Ops/s | 737.3897 Ops/s | $\textbf{\color{#d91a1a}-7.73\%}$ |
| test_cql_speed[False-None] | 17.2935ms | 16.6064ms | 60.2176 Ops/s | 58.7383 Ops/s | $\color{#35bf28}+2.52\%$ |
| test_cql_speed[False-backward] | 22.5053ms | 22.0304ms | 45.3917 Ops/s | 45.6052 Ops/s | $\color{#d91a1a}-0.47\%$ |
| test_cql_speed[True-None] | 3.4006ms | 3.1939ms | 313.1003 Ops/s | 310.5761 Ops/s | $\color{#35bf28}+0.81\%$ |
| test_cql_speed[True-backward] | 5.9794ms | 5.5538ms | 180.0565 Ops/s | 178.6357 Ops/s | $\color{#35bf28}+0.80\%$ |
| test_cql_speed[reduce-overhead-None] | 19.0245ms | 12.7571ms | 78.3880 Ops/s | 78.3696 Ops/s | $\color{#35bf28}+0.02\%$ |
| test_cql_speed[reduce-overhead-backward] | 2.1283ms | 1.9741ms | 506.5536 Ops/s | 499.8417 Ops/s | $\color{#35bf28}+1.34\%$ |
| test_a2c_speed[False-None] | 3.5493ms | 3.1746ms | 315.0036 Ops/s | 306.7263 Ops/s | $\color{#35bf28}+2.70\%$ |
| test_a2c_speed[False-backward] | 6.8314ms | 6.3013ms | 158.6968 Ops/s | 156.2467 Ops/s | $\color{#35bf28}+1.57\%$ |
| test_a2c_speed[True-None] | 1.4790ms | 1.3320ms | 750.7742 Ops/s | 749.1715 Ops/s | $\color{#35bf28}+0.21\%$ |
| test_a2c_speed[True-backward] | 3.1197ms | 2.9931ms | 334.0981 Ops/s | 342.1603 Ops/s | $\color{#d91a1a}-2.36\%$ |
| test_a2c_speed[reduce-overhead-None] | 14.1599ms | 8.3311ms | 120.0316 Ops/s | 123.0065 Ops/s | $\color{#d91a1a}-2.42\%$ |
| test_a2c_speed[reduce-overhead-backward] | 1.7441ms | 1.5809ms | 632.5605 Ops/s | 690.7179 Ops/s | $\textbf{\color{#d91a1a}-8.42\%}$ |
| test_ppo_speed[False-None] | 3.8648ms | 3.6554ms | 273.5680 Ops/s | 265.7627 Ops/s | $\color{#35bf28}+2.94\%$ |
| test_ppo_speed[False-backward] | 7.3955ms | 7.0423ms | 141.9995 Ops/s | 145.7623 Ops/s | $\color{#d91a1a}-2.58\%$ |
| test_ppo_speed[True-None] | 1.5398ms | 1.3859ms | 721.5550 Ops/s | 709.8527 Ops/s | $\color{#35bf28}+1.65\%$ |
| test_ppo_speed[True-backward] | 3.1991ms | 2.9972ms | 333.6419 Ops/s | 325.5358 Ops/s | $\color{#35bf28}+2.49\%$ |
| test_ppo_speed[reduce-overhead-None] | 1.1131ms | 0.9515ms | 1.0510 KOps/s | 1.0603 KOps/s | $\color{#d91a1a}-0.88\%$ |
| test_ppo_speed[reduce-overhead-backward] | 1.5177ms | 1.3851ms | 721.9569 Ops/s | 696.7763 Ops/s | $\color{#35bf28}+3.61\%$ |
| test_reinforce_speed[False-None] | 2.4161ms | 2.2388ms | 446.6756 Ops/s | 434.6670 Ops/s | $\color{#35bf28}+2.76\%$ |
| test_reinforce_speed[False-backward] | 3.6496ms | 3.2437ms | 308.2861 Ops/s | 301.0513 Ops/s | $\color{#35bf28}+2.40\%$ |
| test_reinforce_speed[True-None] | 1.4151ms | 1.2664ms | 789.6280 Ops/s | 773.2985 Ops/s | $\color{#35bf28}+2.11\%$ |
| test_reinforce_speed[True-backward] | 3.0358ms | 2.8647ms | 349.0715 Ops/s | 332.3117 Ops/s | $\textbf{\color{#35bf28}+5.04\%}$ |
| test_reinforce_speed[reduce-overhead-None] | 15.6941ms | 9.0354ms | 110.6762 Ops/s | 111.5761 Ops/s | $\color{#d91a1a}-0.81\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 1.6033ms | 1.4806ms | 675.3892 Ops/s | 627.6277 Ops/s | $\textbf{\color{#35bf28}+7.61\%}$ |
| test_iql_speed[False-None] | 9.5114ms | 9.0754ms | 110.1877 Ops/s | 107.3792 Ops/s | $\color{#35bf28}+2.62\%$ |
| test_iql_speed[False-backward] | 13.2074ms | 12.6771ms | 78.8823 Ops/s | 75.4564 Ops/s | $\color{#35bf28}+4.54\%$ |
| test_iql_speed[True-None] | 2.8936ms | 2.1913ms | 456.3489 Ops/s | 440.4658 Ops/s | $\color{#35bf28}+3.61\%$ |
| test_iql_speed[True-backward] | 4.8571ms | 4.6325ms | 215.8668 Ops/s | 205.9661 Ops/s | $\color{#35bf28}+4.81\%$ |
| test_iql_speed[reduce-overhead-None] | 0.4776s | 12.3487ms | 80.9802 Ops/s | 97.6139 Ops/s | $\textbf{\color{#d91a1a}-17.04\%}$ |
| test_iql_speed[reduce-overhead-backward] | 2.0425ms | 1.8806ms | 531.7575 Ops/s | 509.7182 Ops/s | $\color{#35bf28}+4.32\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.7402ms | 6.1021ms | 163.8767 Ops/s | 160.4808 Ops/s | $\color{#35bf28}+2.12\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6730ms | 0.3255ms | 3.0722 KOps/s | 3.3819 KOps/s | $\textbf{\color{#d91a1a}-9.16\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5410ms | 0.2904ms | 3.4430 KOps/s | 3.2155 KOps/s | $\textbf{\color{#35bf28}+7.07\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.2148ms | 5.8096ms | 172.1283 Ops/s | 168.4005 Ops/s | $\color{#35bf28}+2.21\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.8276ms | 0.2864ms | 3.4913 KOps/s | 3.3423 KOps/s | $\color{#35bf28}+4.46\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5322ms | 0.2530ms | 3.9529 KOps/s | 3.6568 KOps/s | $\textbf{\color{#35bf28}+8.10\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.4816ms | 1.2598ms | 793.8044 Ops/s | 751.2982 Ops/s | $\textbf{\color{#35bf28}+5.66\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.3793ms | 1.1723ms | 853.0437 Ops/s | 815.0142 Ops/s | $\color{#35bf28}+4.67\%$ |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.2524ms 6.0239ms 166.0042 Ops/s 164.3396 Ops/s $\color{#35bf28}+1.01\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0032ms 0.4235ms 2.3613 KOps/s 2.0243 KOps/s $\textbf{\color{#35bf28}+16.65\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6990ms 0.4802ms 2.0826 KOps/s 2.1433 KOps/s $\color{#d91a1a}-2.83\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1665ms 5.8394ms 171.2511 Ops/s 167.8420 Ops/s $\color{#35bf28}+2.03\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9940ms 0.3557ms 2.8110 KOps/s 2.7122 KOps/s $\color{#35bf28}+3.64\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5415ms 0.3410ms 2.9323 KOps/s 2.7509 KOps/s $\textbf{\color{#35bf28}+6.59\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1095ms 5.7716ms 173.2635 Ops/s 168.2143 Ops/s $\color{#35bf28}+3.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9781ms 0.3124ms 3.2012 KOps/s 3.7373 KOps/s $\textbf{\color{#d91a1a}-14.34\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5974ms 0.2755ms 3.6296 KOps/s 4.0626 KOps/s $\textbf{\color{#d91a1a}-10.66\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1501ms 5.9725ms 167.4332 Ops/s 163.1006 Ops/s $\color{#35bf28}+2.66\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9751ms 0.4123ms 2.4254 KOps/s 1.9054 KOps/s $\textbf{\color{#35bf28}+27.29\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7728ms 0.3900ms 2.5644 KOps/s 2.4701 KOps/s $\color{#35bf28}+3.82\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0124ms 5.4172ms 184.5981 Ops/s 182.2807 Ops/s $\color{#35bf28}+1.27\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.6593ms 2.2241ms 449.6153 Ops/s 430.0342 Ops/s $\color{#35bf28}+4.55\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.4327ms 1.1722ms 853.0828 Ops/s 796.4360 Ops/s $\textbf{\color{#35bf28}+7.11\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4501s 14.4170ms 69.3623 Ops/s 183.3953 Ops/s $\textbf{\color{#d91a1a}-62.18\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.8988ms 1.6742ms 597.3099 Ops/s 437.4779 Ops/s $\textbf{\color{#35bf28}+36.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 9.1507ms 1.2624ms 792.1540 Ops/s 846.9881 Ops/s $\textbf{\color{#d91a1a}-6.47\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.8893ms 5.6298ms 177.6250 Ops/s 30.9301 Ops/s $\textbf{\color{#35bf28}+474.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.3304ms 2.1863ms 457.3872 Ops/s 435.5624 Ops/s $\textbf{\color{#35bf28}+5.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.3636ms 1.4467ms 691.2122 Ops/s 738.7040 Ops/s $\textbf{\color{#d91a1a}-6.43\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.3645ms 12.8579ms 77.7732 Ops/s 74.2884 Ops/s $\color{#35bf28}+4.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 18.5269ms 16.7241ms 59.7939 Ops/s 60.4126 Ops/s $\color{#d91a1a}-1.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.7008ms 17.4739ms 57.2283 Ops/s 55.7278 Ops/s $\color{#35bf28}+2.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.3180ms 16.7335ms 59.7603 Ops/s 60.1253 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.6093ms 17.3401ms 57.6699 Ops/s 56.1885 Ops/s $\color{#35bf28}+2.64\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 0.3924s 25.3469ms 39.4525 Ops/s 55.7714 Ops/s $\textbf{\color{#d91a1a}-29.26\%}$
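
For readers checking the table, the Change column appears to be the relative difference between the "Ops" value for this PR and the "Ops on Repo HEAD" value. The following is a minimal sketch under that assumption; the helper name `relative_change` is hypothetical and is not part of the benchmark tooling. It reproduces the reported +0.80%, -62.18%, and +474.28% for three rows copied from the table above.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to the repo HEAD throughput."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the table above.
rows = {
    "test_cql_speed[True-backward]": (180.0565, 178.6357),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (69.3623, 183.3953),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (177.6250, 30.9301),
}

for name, (ops_pr, ops_head) in rows.items():
    # Format with an explicit sign and two decimals, as in the Change column.
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

Because the change is taken relative to the HEAD value, a slow HEAD measurement inflates the percentage, which is why a 30.9301 Ops/s baseline yields a gain of several hundred percent.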

Collaborator

@vmoens vmoens left a comment

LGTM thanks

@kurtamohler kurtamohler merged commit bd597d0 into gh/kurtamohler/4/base Feb 13, 2025
72 of 75 checks passed
kurtamohler added a commit that referenced this pull request Feb 13, 2025
@kurtamohler kurtamohler deleted the gh/kurtamohler/4/head branch February 13, 2025 17:52