Tags: danielpalenicek/sbx
Support for MultiDiscrete and MultiBinary action spaces in PPO (araffin#30)
* Added support for MultiDiscrete action space to PPO
* Added support for MultiBinary action spaces as discrete action spaces with two choices
* Added tests for PPO with MultiDiscrete and MultiBinary action spaces
* Moved the padding comment
* Fixed type errors
* Replaced | by Union in type hint to support python < 3.10
* Update ruff
* Rename variables
* Add more comments and pre-compute variables
* Check that actions are not outside action space
* [ci skip] Update version
---------
Co-authored-by: Antonin Raffin
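The entry above treats a MultiBinary space as a MultiDiscrete space with two choices per dimension and mentions padding and an action-range check. The following is a minimal sketch of how such a scheme could look in JAX, not the code from the commit: the helper names (`multibinary_to_nvec`, `pad_logits`, `sample_actions`, `actions_in_space`) and the choice of padding with `-inf` are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp
import numpy as np


def multibinary_to_nvec(n: int) -> np.ndarray:
    # Hypothetical helper: a MultiBinary(n) space behaves like a
    # MultiDiscrete space with two choices in each of its n dimensions.
    return np.full(n, 2, dtype=np.int64)


def pad_logits(flat_logits, nvec):
    # Split the flat logits into one row per action dimension and pad each
    # row to the largest number of choices. Padded entries get -inf so they
    # have zero probability (an assumption, not taken from the commit).
    max_n = int(max(nvec))
    rows, start = [], 0
    for n in nvec:
        n = int(n)
        pad = jnp.full((max_n - n,), -jnp.inf)
        rows.append(jnp.concatenate([flat_logits[start:start + n], pad]))
        start += n
    return jnp.stack(rows)  # shape: (len(nvec), max_n)


def sample_actions(key, flat_logits, nvec):
    # Draw one categorical sample per action dimension.
    return jax.random.categorical(key, pad_logits(flat_logits, nvec), axis=-1)


def actions_in_space(actions, nvec) -> bool:
    # Check that every action index lies in [0, nvec[i]).
    actions = np.asarray(actions)
    return bool(np.all((actions >= 0) & (actions < np.asarray(nvec))))


nvec = [3, 2, 4]
actions = sample_actions(jax.random.PRNGKey(0), jnp.zeros(sum(nvec)), nvec)
print(actions.shape, actions_in_space(actions, nvec))
```

Padding to a rectangular array keeps the sampling step a single vectorized call, at the cost of a few unused logits per dimension.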
Added support for large values for gradient_steps to SAC, TD3, and TQC (araffin#21)
* Added support for large values for gradient_steps to SAC, TD3, and TQC by replacing the unrolled loop with jax.lax.fori_loop
* Add comments
* Hotfix for train signature
* Fixed start index for dynamic_slice_in_dim
* Rename policy delay
* Fix type annotation
* Remove old annotations
* Fix off-by-one and improve type annotation
* Fix typo
* [ci skip] Update README
---------
Co-authored-by: Antonin RAFFIN
Co-authored-by: Antonin Raffin
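The entry above replaces an unrolled Python loop with `jax.lax.fori_loop` and selects each minibatch with `dynamic_slice_in_dim`. A Python `for` loop inside a jitted function is unrolled at trace time, so compile time grows with the number of gradient steps, whereas `fori_loop` traces the body once. The sketch below illustrates the pattern on a toy scalar regression problem; the `train` function, its loss, and its arguments are hypothetical and do not reproduce the SAC, TD3, or TQC update code.

```python
from functools import partial

import jax
import jax.numpy as jnp


@partial(jax.jit, static_argnames=("gradient_steps", "batch_size"))
def train(params, data, targets, gradient_steps, batch_size, lr=0.1):
    # Toy loss (assumption): fit a scalar p so that data * p ~= targets.
    def loss_fn(p, x, y):
        return jnp.mean((x * p - y) ** 2)

    def one_step(i, p):
        # Step i uses the i-th contiguous minibatch of the pre-sampled data.
        start = i * batch_size
        x = jax.lax.dynamic_slice_in_dim(data, start, batch_size)
        y = jax.lax.dynamic_slice_in_dim(targets, start, batch_size)
        return p - lr * jax.grad(loss_fn)(p, x, y)

    # The body is traced once, regardless of how large gradient_steps is.
    return jax.lax.fori_loop(0, gradient_steps, one_step, params)


data = jnp.arange(1, 9, dtype=jnp.float32) / 8.0
targets = 2.0 * data
p = train(jnp.float32(0.0), data, targets, gradient_steps=4, batch_size=2)
print(p)
```

The start index is `i * batch_size`, so `data` must hold at least `gradient_steps * batch_size` samples; `dynamic_slice_in_dim` clamps out-of-range start indices rather than raising, which makes an off-by-one in the start index easy to miss.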
Fix train signature and update type hints (araffin#24)
* Hotfix for train signature
* Fix deprecated type hints
* Fix mypy
* Update optax dep for python 3.8
Fix replay buffer device at load time (araffin#20)
* Fix replay buffer device at load time
* Fix imports
* Update version
* Reformat and add test
* Fix test
* Fix for mypy
Add flatten layer and update dependencies (araffin#18)
* Add flatten layer and update dependencies
* Reformat
Add DDPG and TD3 (araffin#16)
* Update to match SB3
* Update min pytorch version
* Remove pytype
* Add base TD3
* Add DDPG
* Remove unused variables