Add CnnPolicy to PPO #81
Oops, I forgot to send a message here.
Hello,
A quick performance report: I got good results on Breakout but couldn't get it to work on Pong for now. Runs can be found on W&B: https://wandb.ai/openrlbenchmark/sbx
Don't worry! I appreciate you looking at my code.
After sharing the CNN feature extractor between the actor and the critic, the learning curves match the ones from SB3 =) I'll do a few more runs and then merge.
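For readers following along, here is a minimal sketch of what "sharing the CNN feature extractor between the actor and the critic" can look like in plain JAX. It is not the code from this PR: the function names (`init_params`, `extract_features`, `actor_critic`), the Nature-CNN-style layer sizes, the 84x84x4 input shape, and the bias-free layers are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp

# Assumed layout: channels-last images, HWIO kernels.
DIMS = ("NHWC", "HWIO", "NHWC")
# (kernel size, stride, output channels), Nature-CNN-style sizes (assumption).
CONV_LAYERS = ((8, 4, 32), (4, 2, 64), (3, 1, 64))


def init_params(key, n_actions, in_channels=4, flat_dim=3136, hidden=512):
    """Create one set of CNN weights plus two small linear heads (no biases)."""
    keys = jax.random.split(key, len(CONV_LAYERS) + 3)
    convs = []
    c_in = in_channels
    for k, (size, _, c_out) in zip(keys, CONV_LAYERS):
        fan_in = size * size * c_in
        convs.append(jax.random.normal(k, (size, size, c_in, c_out)) / jnp.sqrt(fan_in))
        c_in = c_out
    # flat_dim = 7 * 7 * 64 for an 84x84 input with the layers above.
    dense = jax.random.normal(keys[-3], (flat_dim, hidden)) / jnp.sqrt(flat_dim)
    actor = jax.random.normal(keys[-2], (hidden, n_actions)) * 0.01
    critic = jax.random.normal(keys[-1], (hidden, 1)) / jnp.sqrt(hidden)
    return {"convs": convs, "dense": dense, "actor": actor, "critic": critic}


def extract_features(params, obs):
    """Shared trunk: uint8 images -> flat feature vector."""
    x = obs.astype(jnp.float32) / 255.0
    for kernel, (_, stride, _) in zip(params["convs"], CONV_LAYERS):
        x = jax.lax.conv_general_dilated(
            x, kernel, (stride, stride), "VALID", dimension_numbers=DIMS
        )
        x = jax.nn.relu(x)
    x = x.reshape(x.shape[0], -1)
    return jax.nn.relu(x @ params["dense"])


def actor_critic(params, obs):
    """Both heads read the same features, so both losses update the same CNN."""
    features = extract_features(params, obs)
    logits = features @ params["actor"]  # action logits
    value = (features @ params["critic"]).squeeze(-1)  # state value
    return logits, value


obs = jnp.zeros((2, 84, 84, 4), dtype=jnp.uint8)
params = init_params(jax.random.PRNGKey(0), n_actions=4)
logits, value = actor_critic(params, obs)
```

The alternative is to give the actor and the critic each their own copy of the CNN weights; in the shared version the policy and value gradients both flow into the single `convs` and `dense` parameters.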
Description
closes #80
Performance report: https://wandb.ai/openrlbenchmark/sbx/reports/PPO-CNN-Performance-report--VmlldzoxNDU2OTk2OA
Motivation and Context
Types of changes
Checklist:
- `make format` (required)
- `make check-codestyle` and `make lint` (required)
- `make pytest` and `make type` both pass. (required)
- `make doc` (required)

Note: You can run most of the checks using `make commit-checks`.

Note: we are using a maximum length of 127 characters per line