[Feature] Add SGLang backend support to GRPO #3437
Open
vmoens wants to merge 28 commits into gh/vmoens/217/base from
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3437
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 1 Pending (as of commit 58f3bfd with merge base 7a0b1f9).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
vmoens added a commit that referenced this pull request on Feb 2, 2026
- Add inference_model.backend config option ("vllm" or "sglang")
- Refactor get_inference_model() to support both backends
- Refactor make_weight_sync_scheme() to support both backends
- Add _get_sglang_inference_model() for SGLang backend
- Add _make_sglang_weight_sync_scheme() for SGLang weight sync
Users can now run GRPO with either vLLM or SGLang:
inference_model:
  backend: "sglang"  # or "vllm" (default)
Co-authored-by: Cursor <cursoragent@cursor.com>
ghstack-source-id: d4a7d42
Pull-Request: #3437
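
For context, here is a minimal, self-contained sketch of the backend dispatch the bullet points above describe. It is an illustration only, not the actual torchrl code: the config dataclass and the vLLM helper body are invented placeholders, and only the names `get_inference_model` / `_get_sglang_inference_model` echo names mentioned in this PR.

```python
# Hedged sketch only: illustrates the "backend" switch described above.
# InferenceModelConfig and _get_vllm_inference_model are invented placeholders;
# get_inference_model and _get_sglang_inference_model echo names from the PR.
from dataclasses import dataclass


@dataclass
class InferenceModelConfig:
    backend: str = "vllm"           # "vllm" (default) or "sglang"
    model_name: str = "some-model"  # placeholder field, assumed for illustration


def _get_vllm_inference_model(cfg: InferenceModelConfig) -> str:
    # stand-in for the pre-existing vLLM path
    return f"vLLM engine for {cfg.model_name}"


def _get_sglang_inference_model(cfg: InferenceModelConfig) -> str:
    # stand-in for the SGLang path this PR adds
    return f"SGLang engine for {cfg.model_name}"


def get_inference_model(cfg: InferenceModelConfig) -> str:
    # dispatch on the new inference_model.backend option
    if cfg.backend == "vllm":
        return _get_vllm_inference_model(cfg)
    if cfg.backend == "sglang":
        return _get_sglang_inference_model(cfg)
    raise ValueError(f"Unknown inference backend: {cfg.backend!r}")


if __name__ == "__main__":
    print(get_inference_model(InferenceModelConfig(backend="sglang")))
```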
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 79.7279μs | 78.9204μs | 12.6710 KOps/s | 12.4432 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1374ms | 0.1365ms | 7.3257 KOps/s | 6.9896 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 99.3632ms | 99.1173ms | 10.0891 Ops/s | 9.9511 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5267μs | 2.5178μs | 397.1696 KOps/s | 401.3456 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.4448μs | 35.7343μs | 27.9843 KOps/s | 27.8806 KOps/s | |
| test_simple | 0.6465s | 0.5525s | 1.8100 Ops/s | 1.8143 Ops/s | |
| test_transformed | 1.2142s | 1.1210s | 0.8921 Ops/s | 0.8881 Ops/s | |
| test_serial | 1.6147s | 1.6119s | 0.6204 Ops/s | 0.6180 Ops/s | |
| test_parallel | 1.1719s | 1.0882s | 0.9189 Ops/s | 0.8918 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1368ms | 42.8734μs | 23.3245 KOps/s | 22.9722 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 47.2210μs | 24.7991μs | 40.3240 KOps/s | 40.7477 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 53.9010μs | 24.4272μs | 40.9381 KOps/s | 40.6437 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 43.1900μs | 13.3599μs | 74.8509 KOps/s | 73.2902 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 70.9810μs | 45.9791μs | 21.7490 KOps/s | 21.5313 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 54.4210μs | 26.9548μs | 37.0991 KOps/s | 36.4718 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 61.0910μs | 27.0364μs | 36.9871 KOps/s | 36.7787 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 44.9210μs | 16.0130μs | 62.4494 KOps/s | 61.9833 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 84.2210μs | 49.1327μs | 20.3530 KOps/s | 20.0122 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 54.3900μs | 29.5088μs | 33.8882 KOps/s | 33.2746 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 60.7210μs | 26.7829μs | 37.3372 KOps/s | 37.1462 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 39.2710μs | 16.1264μs | 62.0101 KOps/s | 61.6146 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 88.7410μs | 51.5487μs | 19.3991 KOps/s | 19.3300 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 68.2810μs | 32.3322μs | 30.9289 KOps/s | 31.0619 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 65.5610μs | 29.2107μs | 34.2341 KOps/s | 33.7179 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 49.3800μs | 18.6834μs | 53.5235 KOps/s | 53.6520 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 82.0420μs | 48.8105μs | 20.4874 KOps/s | 20.5139 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 57.1710μs | 30.0349μs | 33.2946 KOps/s | 33.2924 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 59.0210μs | 30.6643μs | 32.6112 KOps/s | 32.2843 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 46.3810μs | 18.1216μs | 55.1828 KOps/s | 55.4546 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 2.8492ms | 51.7865μs | 19.3101 KOps/s | 19.4123 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 60.5010μs | 32.8667μs | 30.4260 KOps/s | 30.5900 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 61.6410μs | 32.8341μs | 30.4562 KOps/s | 29.7973 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 53.2510μs | 20.5876μs | 48.5729 KOps/s | 48.7214 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 82.7020μs | 54.3186μs | 18.4099 KOps/s | 18.5001 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 59.1610μs | 34.7519μs | 28.7754 KOps/s | 28.1799 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 64.7010μs | 33.2622μs | 30.0642 KOps/s | 29.4514 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 44.5410μs | 20.5007μs | 48.7788 KOps/s | 48.3979 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 83.6610μs | 56.0269μs | 17.8486 KOps/s | 17.6412 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 67.9410μs | 37.2100μs | 26.8745 KOps/s | 26.3548 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 65.7410μs | 35.3742μs | 28.2692 KOps/s | 28.0871 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 50.8810μs | 22.8865μs | 43.6938 KOps/s | 43.0949 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8370s | 0.7478s | 1.3373 Ops/s | 1.3109 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7068s | 0.6175s | 1.6193 Ops/s | 1.5916 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7035s | 1.6319s | 0.6128 Ops/s | 0.6022 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4923s | 1.4153s | 0.7066 Ops/s | 0.7028 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9455s | 1.8703s | 0.5347 Ops/s | 0.5304 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7316s | 1.6538s | 0.6047 Ops/s | 0.5970 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6215s | 4.5188s | 0.2213 Ops/s | 0.2171 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5503s | 4.3597s | 0.2294 Ops/s | 0.2292 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9836s | 1.9057s | 0.5247 Ops/s | 0.5191 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6933s | 1.6164s | 0.6187 Ops/s | 0.6089 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 9.8502ms | 9.5480ms | 104.7339 Ops/s | 104.0231 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 21.4078ms | 17.5304ms | 57.0439 Ops/s | 91.3283 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2334ms | 0.1237ms | 8.0813 KOps/s | 7.9157 KOps/s | |
| test_values[td1_return_estimate-False-False] | 25.8653ms | 25.4993ms | 39.2168 Ops/s | 39.1256 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 21.0469ms | 17.5566ms | 56.9588 Ops/s | 89.6804 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 41.6441ms | 37.6824ms | 26.5376 Ops/s | 26.5613 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.4335ms | 17.5099ms | 57.1107 Ops/s | 90.8051 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.5783ms | 8.4045ms | 118.9836 Ops/s | 118.7691 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 6.4098ms | 1.4442ms | 692.4358 Ops/s | 689.4044 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4987ms | 0.3976ms | 2.5152 KOps/s | 2.5503 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 34.4149ms | 33.7281ms | 29.6488 Ops/s | 34.3946 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.1372ms | 1.6881ms | 592.3718 Ops/s | 587.3560 Ops/s | |
| test_dqn_speed[False-None] | 1.7749ms | 1.3545ms | 738.2915 Ops/s | 737.1060 Ops/s | |
| test_dqn_speed[False-backward] | 1.9015ms | 1.8504ms | 540.4185 Ops/s | 544.3134 Ops/s | |
| test_dqn_speed[True-None] | 0.6645ms | 0.5323ms | 1.8787 KOps/s | 1.9058 KOps/s | |
| test_dqn_speed[True-backward] | 0.9931ms | 0.9535ms | 1.0488 KOps/s | 925.0190 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6321ms | 0.5101ms | 1.9604 KOps/s | 1.8884 KOps/s | |
| test_ddpg_speed[False-None] | 3.0931ms | 2.7545ms | 363.0358 Ops/s | 360.9001 Ops/s | |
| test_ddpg_speed[False-backward] | 4.0747ms | 3.9342ms | 254.1821 Ops/s | 254.0566 Ops/s | |
| test_ddpg_speed[True-None] | 1.4081ms | 1.3420ms | 745.1509 Ops/s | 735.2747 Ops/s | |
| test_ddpg_speed[True-backward] | 2.3851ms | 2.2854ms | 437.5546 Ops/s | 387.8506 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.5288ms | 1.3470ms | 742.4139 Ops/s | 730.1286 Ops/s | |
| test_sac_speed[False-None] | 8.1077ms | 7.6428ms | 130.8420 Ops/s | 128.8739 Ops/s | |
| test_sac_speed[False-backward] | 11.1555ms | 10.7267ms | 93.2253 Ops/s | 91.8994 Ops/s | |
| test_sac_speed[True-None] | 2.4264ms | 2.0671ms | 483.7794 Ops/s | 480.0232 Ops/s | |
| test_sac_speed[True-backward] | 4.0429ms | 3.8972ms | 256.5935 Ops/s | 229.0440 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.1917ms | 2.0604ms | 485.3498 Ops/s | 475.5584 Ops/s | |
| test_redq_speed[False-None] | 14.6589ms | 10.2299ms | 97.7527 Ops/s | 99.5794 Ops/s | |
| test_redq_speed[False-backward] | 17.9311ms | 17.1798ms | 58.2077 Ops/s | 58.4740 Ops/s | |
| test_redq_speed[True-None] | 4.5471ms | 4.3131ms | 231.8514 Ops/s | 220.9552 Ops/s | |
| test_redq_speed[True-backward] | 9.7482ms | 9.2992ms | 107.5362 Ops/s | 103.4088 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.6539ms | 4.3609ms | 229.3090 Ops/s | 232.9872 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.2520ms | 10.7666ms | 92.8797 Ops/s | 92.9325 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.0982ms | 15.4942ms | 64.5403 Ops/s | 64.5423 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.9697ms | 3.5926ms | 278.3499 Ops/s | 286.5845 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.5646ms | 7.1658ms | 139.5513 Ops/s | 130.6530 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.6587ms | 3.4959ms | 286.0468 Ops/s | 280.5406 Ops/s | |
| test_td3_speed[False-None] | 8.0159ms | 7.7705ms | 128.6923 Ops/s | 127.1730 Ops/s | |
| test_td3_speed[False-backward] | 11.3013ms | 10.6422ms | 93.9656 Ops/s | 94.2590 Ops/s | |
| test_td3_speed[True-None] | 1.8956ms | 1.7932ms | 557.6628 Ops/s | 544.8542 Ops/s | |
| test_td3_speed[True-backward] | 3.7485ms | 3.5146ms | 284.5302 Ops/s | 282.2548 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.7898ms | 1.7332ms | 576.9557 Ops/s | 572.7447 Ops/s | |
| test_cql_speed[False-None] | 27.7961ms | 25.4136ms | 39.3491 Ops/s | 38.6423 Ops/s | |
| test_cql_speed[False-backward] | 35.0699ms | 34.3366ms | 29.1234 Ops/s | 28.7742 Ops/s | |
| test_cql_speed[True-None] | 12.3066ms | 12.0257ms | 83.1552 Ops/s | 81.8364 Ops/s | |
| test_cql_speed[True-backward] | 18.3605ms | 17.7605ms | 56.3048 Ops/s | 55.0787 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.4298ms | 12.0738ms | 82.8241 Ops/s | 80.9881 Ops/s | |
| test_a2c_speed[False-None] | 5.5110ms | 5.2505ms | 190.4574 Ops/s | 187.4382 Ops/s | |
| test_a2c_speed[False-backward] | 11.8828ms | 11.5146ms | 86.8461 Ops/s | 86.0987 Ops/s | |
| test_a2c_speed[True-None] | 3.7840ms | 3.6461ms | 274.2653 Ops/s | 265.6002 Ops/s | |
| test_a2c_speed[True-backward] | 8.6610ms | 8.3994ms | 119.0560 Ops/s | 119.1768 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 3.9213ms | 3.6250ms | 275.8629 Ops/s | 272.1102 Ops/s | |
| test_ppo_speed[False-None] | 6.1742ms | 5.7432ms | 174.1188 Ops/s | 172.3916 Ops/s | |
| test_ppo_speed[False-backward] | 12.5446ms | 12.1308ms | 82.4346 Ops/s | 81.6224 Ops/s | |
| test_ppo_speed[True-None] | 3.7046ms | 3.5538ms | 281.3891 Ops/s | 270.6564 Ops/s | |
| test_ppo_speed[True-backward] | 8.4809ms | 8.1974ms | 121.9896 Ops/s | 120.7615 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.6685ms | 3.5080ms | 285.0648 Ops/s | 281.4205 Ops/s | |
| test_reinforce_speed[False-None] | 4.7041ms | 4.3917ms | 227.7023 Ops/s | 221.8298 Ops/s | |
| test_reinforce_speed[False-backward] | 7.3789ms | 7.1251ms | 140.3480 Ops/s | 138.6210 Ops/s | |
| test_reinforce_speed[True-None] | 2.9248ms | 2.7538ms | 363.1353 Ops/s | 346.7965 Ops/s | |
| test_reinforce_speed[True-backward] | 7.8938ms | 7.5562ms | 132.3416 Ops/s | 117.1584 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.0462ms | 2.7699ms | 361.0254 Ops/s | 358.1144 Ops/s | |
| test_iql_speed[False-None] | 19.7178ms | 19.0446ms | 52.5083 Ops/s | 50.5324 Ops/s | |
| test_iql_speed[False-backward] | 30.4984ms | 29.1289ms | 34.3302 Ops/s | 33.8530 Ops/s | |
| test_iql_speed[True-None] | 8.6892ms | 8.2880ms | 120.6558 Ops/s | 119.5313 Ops/s | |
| test_iql_speed[True-backward] | 16.5068ms | 16.0900ms | 62.1504 Ops/s | 59.4895 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.5973ms | 8.3582ms | 119.6436 Ops/s | 126.5484 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0999ms | 5.9426ms | 168.2758 Ops/s | 167.7115 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.3170ms | 0.3194ms | 3.1307 KOps/s | 2.8762 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5918ms | 0.3372ms | 2.9654 KOps/s | 2.7478 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0114ms | 5.6973ms | 175.5218 Ops/s | 175.3247 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9802ms | 0.3369ms | 2.9680 KOps/s | 3.0051 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6158ms | 0.3205ms | 3.1202 KOps/s | 3.1433 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.7581ms | 1.4243ms | 702.1182 Ops/s | 718.9385 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6861ms | 1.3330ms | 750.2109 Ops/s | 768.5497 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.9526ms | 5.7928ms | 172.6295 Ops/s | 170.0144 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1708ms | 0.4173ms | 2.3962 KOps/s | 2.0570 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6120ms | 0.3994ms | 2.5037 KOps/s | 2.2977 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.7767ms | 5.7005ms | 175.4218 Ops/s | 173.1068 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.5706ms | 0.2774ms | 3.6047 KOps/s | 2.7405 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4614ms | 0.2604ms | 3.8400 KOps/s | 2.8522 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8363ms | 5.6328ms | 177.5324 Ops/s | 177.5115 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5213ms | 0.2711ms | 3.6884 KOps/s | 3.6372 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5707ms | 0.2550ms | 3.9210 KOps/s | 3.2737 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.9158ms | 5.8191ms | 171.8470 Ops/s | 171.9245 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.4712ms | 0.4834ms | 2.0687 KOps/s | 2.0954 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.5830ms | 0.3975ms | 2.5157 KOps/s | 2.0740 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.5723s | 16.3097ms | 61.3131 Ops/s | 55.2560 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 8.9956ms | 1.9075ms | 524.2526 Ops/s | 557.6035 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.4277ms | 1.2658ms | 790.0105 Ops/s | 898.5534 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 7.5058ms | 5.0415ms | 198.3545 Ops/s | 197.4158 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9008ms | 1.6776ms | 596.0773 Ops/s | 569.9939 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.9895ms | 0.8643ms | 1.1570 KOps/s | 779.2133 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 8.2236ms | 5.1933ms | 192.5566 Ops/s | 60.4411 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 3.9358ms | 1.7782ms | 562.3735 Ops/s | 531.9410 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 3.8437ms | 1.0888ms | 918.4467 Ops/s | 969.9553 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 36.6465ms | 34.6241ms | 28.8816 Ops/s | 28.2871 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.3685ms | 17.3547ms | 57.6213 Ops/s | 56.9639 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 38.7418ms | 35.6586ms | 28.0438 Ops/s | 27.0746 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.2608ms | 17.8055ms | 56.1624 Ops/s | 56.1268 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 40.9265ms | 37.8613ms | 26.4122 Ops/s | 26.1740 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 20.5641ms | 19.1334ms | 52.2646 Ops/s | 51.8698 Ops/s | |
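
The Change column above is blank, but the relative change can be recomputed from the two Ops columns. A small example for the `test_dqn_speed[True-backward]` row, using the values listed in the table:

```python
# Relative change for test_dqn_speed[True-backward]:
# this PR reports 1.0488 KOps/s, repo HEAD reports 925.0190 Ops/s.
ops_pr = 1.0488e3    # KOps/s converted to Ops/s
ops_head = 925.0190  # Ops/s
change = (ops_pr - ops_head) / ops_head
print(f"{change:+.2%}")  # about +13.4%
```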
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 81.1952μs | 80.2005μs | 12.4687 KOps/s | 12.6401 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1358ms | 0.1355ms | 7.3801 KOps/s | 7.3580 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1032s | 0.1028s | 9.7312 Ops/s | 9.6883 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.4845μs | 2.4777μs | 403.6064 KOps/s | 400.0835 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 36.2977μs | 36.1419μs | 27.6687 KOps/s | 27.3967 KOps/s | |
| test_simple | 0.8919s | 0.8023s | 1.2464 Ops/s | 1.2432 Ops/s | |
| test_transformed | 1.5131s | 1.4204s | 0.7040 Ops/s | 0.6995 Ops/s | |
| test_serial | 2.2512s | 2.2502s | 0.4444 Ops/s | 0.4359 Ops/s | |
| test_parallel | 2.0882s | 1.9533s | 0.5120 Ops/s | 0.5238 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2642ms | 45.0704μs | 22.1875 KOps/s | 22.6350 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 70.4710μs | 24.9672μs | 40.0526 KOps/s | 40.4373 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 76.0910μs | 24.5331μs | 40.7612 KOps/s | 40.0323 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 45.6510μs | 13.6825μs | 73.0859 KOps/s | 73.0016 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 82.8120μs | 47.5163μs | 21.0454 KOps/s | 21.2754 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 71.0710μs | 27.5182μs | 36.3396 KOps/s | 36.1089 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 56.2610μs | 27.4866μs | 36.3814 KOps/s | 35.9515 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 48.8800μs | 16.5430μs | 60.4486 KOps/s | 60.3102 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 96.1320μs | 50.7562μs | 19.7020 KOps/s | 20.0251 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 61.3310μs | 30.8799μs | 32.3836 KOps/s | 33.0312 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 82.2810μs | 27.5170μs | 36.3411 KOps/s | 35.9762 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 68.7810μs | 16.4358μs | 60.8429 KOps/s | 60.7640 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 97.6710μs | 53.2562μs | 18.7772 KOps/s | 18.9587 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 76.0210μs | 32.8031μs | 30.4849 KOps/s | 30.2361 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 62.9010μs | 30.2380μs | 33.0709 KOps/s | 33.3887 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 58.3300μs | 18.9386μs | 52.8023 KOps/s | 52.0126 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 93.3710μs | 51.2815μs | 19.5002 KOps/s | 20.0639 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 64.6410μs | 30.4457μs | 32.8454 KOps/s | 32.6002 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 84.5620μs | 31.5627μs | 31.6829 KOps/s | 31.5638 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 51.1800μs | 18.0827μs | 55.3013 KOps/s | 54.2888 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 2.7268ms | 52.9011μs | 18.9032 KOps/s | 18.9145 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 63.6810μs | 32.9991μs | 30.3039 KOps/s | 29.8325 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 67.1110μs | 34.4598μs | 29.0193 KOps/s | 29.7402 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 67.0910μs | 20.8060μs | 48.0631 KOps/s | 47.8014 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 0.1124ms | 56.1447μs | 17.8111 KOps/s | 17.8472 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 92.3610μs | 35.8843μs | 27.8673 KOps/s | 27.6309 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 67.2910μs | 33.9388μs | 29.4648 KOps/s | 29.5280 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 52.0910μs | 21.0568μs | 47.4905 KOps/s | 47.7812 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.1114ms | 57.4667μs | 17.4014 KOps/s | 17.2438 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 71.4410μs | 38.6155μs | 25.8963 KOps/s | 25.9883 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 78.5010μs | 36.2437μs | 27.5910 KOps/s | 27.7472 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 54.5410μs | 23.2937μs | 42.9300 KOps/s | 42.8839 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7267s | 0.7252s | 1.3789 Ops/s | 1.3279 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7096s | 0.6176s | 1.6191 Ops/s | 1.6168 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7121s | 1.6385s | 0.6103 Ops/s | 0.6092 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5182s | 1.4474s | 0.6909 Ops/s | 0.7025 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9528s | 1.8777s | 0.5326 Ops/s | 0.5292 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7315s | 1.6567s | 0.6036 Ops/s | 0.5951 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6440s | 4.5523s | 0.2197 Ops/s | 0.2207 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.4729s | 4.4100s | 0.2268 Ops/s | 0.2228 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0023s | 1.9392s | 0.5157 Ops/s | 0.5137 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7167s | 1.6398s | 0.6098 Ops/s | 0.6023 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 20.1281ms | 19.7361ms | 50.6685 Ops/s | 49.0989 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1346s | 3.6051ms | 277.3852 Ops/s | 262.8840 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1049ms | 82.0046μs | 12.1944 KOps/s | 12.1507 KOps/s | |
| test_values[td1_return_estimate-False-False] | 47.1625ms | 46.7645ms | 21.3837 Ops/s | 19.8893 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.2882ms | 1.0792ms | 926.6324 Ops/s | 899.9695 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 76.9499ms | 76.6949ms | 13.0387 Ops/s | 12.0831 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3251ms | 1.0762ms | 929.2141 Ops/s | 923.9489 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 20.4464ms | 20.1688ms | 49.5814 Ops/s | 45.8490 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0493ms | 0.7470ms | 1.3387 KOps/s | 1.3478 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7164ms | 0.6701ms | 1.4922 KOps/s | 1.4398 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5423ms | 1.4805ms | 675.4702 Ops/s | 665.9507 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7292ms | 0.6864ms | 1.4570 KOps/s | 1.3844 KOps/s | |
| test_dqn_speed[False-None] | 1.6091ms | 1.5180ms | 658.7645 Ops/s | 651.1626 Ops/s | |
| test_dqn_speed[False-backward] | 2.3732ms | 2.1617ms | 462.5916 Ops/s | 464.6926 Ops/s | |
| test_dqn_speed[True-None] | 1.3063ms | 0.5433ms | 1.8406 KOps/s | 1.8591 KOps/s | |
| test_dqn_speed[True-backward] | 1.1366ms | 1.0665ms | 937.6164 Ops/s | 946.7913 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.6366ms | 0.5616ms | 1.7807 KOps/s | 1.7491 KOps/s | |
| test_ddpg_speed[False-None] | 3.2515ms | 2.8880ms | 346.2575 Ops/s | 350.8086 Ops/s | |
| test_ddpg_speed[False-backward] | 4.5867ms | 4.1769ms | 239.4138 Ops/s | 240.9976 Ops/s | |
| test_ddpg_speed[True-None] | 1.4013ms | 1.2935ms | 773.0963 Ops/s | 785.2755 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4336ms | 2.3201ms | 431.0144 Ops/s | 433.7582 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.3779ms | 1.2989ms | 769.8961 Ops/s | 762.7022 Ops/s | |
| test_sac_speed[False-None] | 8.8643ms | 8.2986ms | 120.5021 Ops/s | 120.0974 Ops/s | |
| test_sac_speed[False-backward] | 11.6604ms | 11.2444ms | 88.9328 Ops/s | 88.8489 Ops/s | |
| test_sac_speed[True-None] | 2.4305ms | 1.7515ms | 570.9456 Ops/s | 566.9629 Ops/s | |
| test_sac_speed[True-backward] | 3.7985ms | 3.4100ms | 293.2528 Ops/s | 297.3175 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 18.5971ms | 10.5041ms | 95.2012 Ops/s | 96.0955 Ops/s | |
| test_redq_deprec_speed[False-None] | 9.9626ms | 9.2719ms | 107.8533 Ops/s | 108.8695 Ops/s | |
| test_redq_deprec_speed[False-backward] | 12.9939ms | 12.4199ms | 80.5158 Ops/s | 81.4374 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6583ms | 2.4677ms | 405.2395 Ops/s | 408.8220 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.3609ms | 4.0314ms | 248.0512 Ops/s | 247.9315 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 15.2734ms | 9.4127ms | 106.2397 Ops/s | 91.0037 Ops/s | |
| test_td3_speed[False-None] | 8.4144ms | 8.1155ms | 123.2216 Ops/s | 122.4892 Ops/s | |
| test_td3_speed[False-backward] | 10.9761ms | 10.5506ms | 94.7814 Ops/s | 93.7792 Ops/s | |
| test_td3_speed[True-None] | 1.6893ms | 1.5880ms | 629.7283 Ops/s | 637.0996 Ops/s | |
| test_td3_speed[True-backward] | 3.1127ms | 3.0009ms | 333.2356 Ops/s | 312.6792 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 64.4766ms | 22.8810ms | 43.7045 Ops/s | 43.9116 Ops/s | |
| test_cql_speed[False-None] | 17.3865ms | 17.0298ms | 58.7207 Ops/s | 58.3399 Ops/s | |
| test_cql_speed[False-backward] | 23.0446ms | 22.3387ms | 44.7654 Ops/s | 43.8571 Ops/s | |
| test_cql_speed[True-None] | 3.6543ms | 3.1432ms | 318.1469 Ops/s | 310.3157 Ops/s | |
| test_cql_speed[True-backward] | 5.6423ms | 5.1837ms | 192.9139 Ops/s | 185.5129 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 18.6264ms | 11.6129ms | 86.1109 Ops/s | 87.9384 Ops/s | |
| test_a2c_speed[False-None] | 4.3401ms | 3.2011ms | 312.3935 Ops/s | 311.8634 Ops/s | |
| test_a2c_speed[False-backward] | 6.5585ms | 6.1047ms | 163.8095 Ops/s | 157.4093 Ops/s | |
| test_a2c_speed[True-None] | 1.4777ms | 1.3077ms | 764.7167 Ops/s | 760.6908 Ops/s | |
| test_a2c_speed[True-backward] | 3.0030ms | 2.8988ms | 344.9710 Ops/s | 329.2582 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.0104ms | 0.9455ms | 1.0576 KOps/s | 1.0425 KOps/s | |
| test_ppo_speed[False-None] | 3.8611ms | 3.7703ms | 265.2298 Ops/s | 253.3178 Ops/s | |
| test_ppo_speed[False-backward] | 8.9965ms | 7.0583ms | 141.6767 Ops/s | 139.7886 Ops/s | |
| test_ppo_speed[True-None] | 1.5346ms | 1.3824ms | 723.3778 Ops/s | 729.8054 Ops/s | |
| test_ppo_speed[True-backward] | 3.2932ms | 3.1861ms | 313.8644 Ops/s | 312.1835 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.0949ms | 1.0133ms | 986.8782 Ops/s | 973.5070 Ops/s | |
| test_reinforce_speed[False-None] | 2.3406ms | 2.2640ms | 441.6965 Ops/s | 440.3910 Ops/s | |
| test_reinforce_speed[False-backward] | 3.4535ms | 3.3750ms | 296.2925 Ops/s | 290.4463 Ops/s | |
| test_reinforce_speed[True-None] | 1.3003ms | 1.2140ms | 823.6981 Ops/s | 795.3064 Ops/s | |
| test_reinforce_speed[True-backward] | 3.0848ms | 2.9796ms | 335.6124 Ops/s | 335.6344 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 16.3051ms | 9.0220ms | 110.8407 Ops/s | 111.8410 Ops/s | |
| test_iql_speed[False-None] | 10.2584ms | 9.3554ms | 106.8903 Ops/s | 106.5948 Ops/s | |
| test_iql_speed[False-backward] | 13.7967ms | 13.3564ms | 74.8703 Ops/s | 74.5519 Ops/s | |
| test_iql_speed[True-None] | 2.2170ms | 2.1066ms | 474.7057 Ops/s | 470.8957 Ops/s | |
| test_iql_speed[True-backward] | 5.0140ms | 4.6085ms | 216.9885 Ops/s | 213.4915 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 17.4750ms | 10.0351ms | 99.6500 Ops/s | 79.3648 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9514ms | 5.7292ms | 174.5446 Ops/s | 171.6759 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0086ms | 0.3505ms | 2.8527 KOps/s | 3.0677 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5702ms | 0.3301ms | 3.0292 KOps/s | 3.3749 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.7648ms | 5.5279ms | 180.9000 Ops/s | 177.7024 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9880ms | 0.3534ms | 2.8299 KOps/s | 3.3163 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6015ms | 0.3390ms | 2.9499 KOps/s | 3.2713 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6827ms | 1.3675ms | 731.2721 Ops/s | 726.0248 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6121ms | 1.2721ms | 786.0770 Ops/s | 775.0676 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.8221ms | 5.7355ms | 174.3523 Ops/s | 170.5487 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.5198ms | 0.5038ms | 1.9850 KOps/s | 2.2149 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6808ms | 0.5051ms | 1.9797 KOps/s | 2.2778 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9480ms | 5.5731ms | 179.4322 Ops/s | 174.3179 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.3996ms | 0.3467ms | 2.8841 KOps/s | 3.2201 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6540ms | 0.3294ms | 3.0361 KOps/s | 3.0590 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.7347ms | 5.5159ms | 181.2934 Ops/s | 176.3974 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.1704ms | 0.3312ms | 3.0197 KOps/s | 3.0124 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5199ms | 0.3188ms | 3.1366 KOps/s | 3.8084 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.8299ms | 5.7477ms | 173.9823 Ops/s | 172.1649 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3644ms | 0.4293ms | 2.3294 KOps/s | 2.0257 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6283ms | 0.4049ms | 2.4695 KOps/s | 2.1090 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.4882ms | 4.8894ms | 204.5229 Ops/s | 197.2680 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 7.5374ms | 2.1431ms | 466.6097 Ops/s | 514.8798 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 10.4682ms | 1.2998ms | 769.3662 Ops/s | 1.1049 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.5898s | 16.7185ms | 59.8139 Ops/s | 196.2574 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 4.1113ms | 1.7572ms | 569.0771 Ops/s | 513.6457 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 10.0164ms | 1.3204ms | 757.3702 Ops/s | 799.5331 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 7.9302ms | 5.1509ms | 194.1392 Ops/s | 50.2263 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.1215ms | 1.9467ms | 513.6868 Ops/s | 508.8817 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.1498ms | 1.0799ms | 926.0086 Ops/s | 951.4027 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 36.7523ms | 34.8278ms | 28.7127 Ops/s | 28.2432 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.3390ms | 17.8592ms | 55.9936 Ops/s | 55.5439 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 39.8316ms | 36.6825ms | 27.2610 Ops/s | 27.0360 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.8918ms | 18.0604ms | 55.3697 Ops/s | 54.0312 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 40.7305ms | 38.4017ms | 26.0405 Ops/s | 26.1956 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.7594ms | 20.2252ms | 49.4433 Ops/s | 52.1248 Ops/s | |
vmoens added further commits referencing this pull request, each with the same commit message as above:
- Feb 2, 2026 (ghstack-source-id: ce5c928)
- Feb 2, 2026 (ghstack-source-id: 053df08)
- Feb 2, 2026 (ghstack-source-id: 1ccd157)
- Feb 2, 2026 (ghstack-source-id: 5ecce60)
- Feb 2, 2026 (ghstack-source-id: 9607fa3)
- Feb 3, 2026 (ghstack-source-id: ee9c9b3)
- Feb 3, 2026 (ghstack-source-id: da31145)
- Feb 3, 2026 (ghstack-source-id: 13b969e)
- Feb 3, 2026 (ghstack-source-id: a1a2b70)
- Feb 3, 2026 (ghstack-source-id: e4da2ab)
- Feb 3, 2026 (ghstack-source-id: ea2fe66)
- Feb 3, 2026 (ghstack-source-id: 4d20ae4)
- Feb 3, 2026 (ghstack-source-id: 54f5596)
- Feb 3, 2026 (ghstack-source-id: b7639e8)
- Feb 3, 2026 (ghstack-source-id: 09d2fa0)
- Feb 3, 2026 (ghstack-source-id: 6f6c674)
- Feb 3, 2026 (ghstack-source-id: 8c27963)
- Feb 3, 2026 (ghstack-source-id: 9322b29)
- Feb 3, 2026 (ghstack-source-id: 50a8064)
Stack from ghstack (oldest at bottom):
Users can now run GRPO with either vLLM or SGLang:
inference_model:
  backend: "sglang"  # or "vllm" (default)
Co-authored-by: Cursor <cursoragent@cursor.com>
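
A similarly hedged sketch of how a backend-aware `make_weight_sync_scheme()`, as described in the commit message, could dispatch. The scheme classes below are placeholders invented for illustration and are not the actual torchrl classes; only the function name and the backend strings come from this PR.

```python
# Hedged sketch: placeholder scheme classes, not torchrl's real ones.
from dataclasses import dataclass


@dataclass
class VLLMWeightSyncScheme:    # placeholder for the existing vLLM scheme
    master_port: int = 29500


@dataclass
class SGLangWeightSyncScheme:  # placeholder for the SGLang scheme this PR adds
    master_port: int = 29500


def make_weight_sync_scheme(backend: str = "vllm", **kwargs):
    """Return a weight-sync scheme matching inference_model.backend."""
    if backend == "vllm":
        return VLLMWeightSyncScheme(**kwargs)
    if backend == "sglang":
        return SGLangWeightSyncScheme(**kwargs)
    raise ValueError(f"Unsupported backend: {backend!r}")


if __name__ == "__main__":
    print(make_weight_sync_scheme(backend="sglang", master_port=29501))
```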