Add CnnPolicy to PPO #81

paulo101977 · 2025-09-23T20:13:44Z

Description

closes #80

Performance report: https://wandb.ai/openrlbenchmark/sbx/reports/PPO-CNN-Performance-report--VmlldzoxNDU2OTk2OA

Motivation and Context

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist:

I've read the CONTRIBUTION guide (required)
I have updated the changelog accordingly (required).
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.
I have reformatted the code using make format (required)
I have checked the codestyle using make check-codestyle and make lint (required)
I have ensured make pytest and make type both pass. (required)
I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

paulo101977

Oops, I forgot to send a message here.

araffin · 2025-09-26T10:30:39Z

Hello,
thanks for the PR, but the code you provided was actually not used.
I refactored, fixed the code and updated the tests.

araffin · 2025-09-26T14:18:49Z

A note on performance: I got good results on Breakout but couldn't get it to work on Pong for now.
After checking with SB3, this is due to the CNN being shared vs. non-shared between the actor and the critic.

Runs can be found on W&B: https://wandb.ai/openrlbenchmark/sbx

paulo101977 · 2025-09-26T16:06:28Z

Hello, thanks for the PR, but the code you provided was actually not used. I refactored, fixed the code and updated the tests.

Don't worry!

I appreciate you looking at my code.

araffin · 2025-09-29T15:05:00Z

After sharing the CNN features extractor between the actor and the critic, the learning curves match the ones from SB3 =)
https://wandb.ai/openrlbenchmark/sbx/reports/PPO-CNN-Performance-report--VmlldzoxNDU2OTk2OA

I'll do a few more runs and then merge.
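
To illustrate what sharing the features extractor changes structurally, here is a minimal sketch in plain Python that counts parameters for the shared and non-shared setups. It is not the code from this PR. It assumes the standard NatureCNN layout (three unpadded convolutions followed by a 512-unit linear layer) and a 4x84x84 stacked-frame Atari input. The helper names (`conv_out`, `nature_cnn_params`, `ppo_params`) and the single-linear-layer actor and critic heads are hypothetical.

```python
def conv_out(size, kernel, stride):
    """Output width of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def nature_cnn_params(in_channels=4, size=84, features_dim=512):
    """Count weights and biases of a NatureCNN-style features extractor.

    Assumption: input is in_channels x size x size (4 stacked 84x84 frames).
    """
    # (out_channels, kernel, stride) for the three convolution layers
    layers = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]
    total, channels = 0, in_channels
    for out_channels, kernel, stride in layers:
        total += kernel * kernel * channels * out_channels + out_channels
        size = conv_out(size, kernel, stride)
        channels = out_channels
    # Flatten, then one linear layer producing the feature vector
    flat = size * size * channels
    total += flat * features_dim + features_dim
    return total


def ppo_params(n_actions, shared, features_dim=512):
    """Total parameters for actor + critic (hypothetical linear heads)."""
    extractor = nature_cnn_params(features_dim=features_dim)
    actor_head = features_dim * n_actions + n_actions
    critic_head = features_dim * 1 + 1
    # Shared: one CNN feeds both heads. Non-shared: each head has its own CNN.
    n_extractors = 1 if shared else 2
    return n_extractors * extractor + actor_head + critic_head


if __name__ == "__main__":
    print("extractor:", nature_cnn_params())
    print("shared:", ppo_params(4, shared=True))
    print("non-shared:", ppo_params(4, shared=False))
```

With a shared extractor, both the policy loss and the value loss update the same convolutional weights. With separate extractors, each network learns its own features. The parameter count captures only the structural difference, not the effect on the learning curves reported above.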

paulo101977 added 2 commits September 23, 2025 16:19

Add CnnPolicy to PPO

ad399c0

After run tests

12f7e31

paulo101977 commented Sep 23, 2025

Remove unused key in CrossQ

9fcbd7d

araffin requested a review from Copilot September 26, 2025 10:27

araffin added 2 commits September 26, 2025 12:28

Move NatureCNN to Jax layers

6103b1e

Refactor CNN implementation for PPO to actually use the CNN

12f5922

Share features extractor with selective copy

dfa91f2

araffin mentioned this pull request Sep 29, 2025

Adding Feature Extractors #3

Open

Rename method

8086149

araffin approved these changes Sep 29, 2025

araffin added 2 commits September 29, 2025 19:17

Update comments

9ba6571

Update name

ce2ba79

araffin merged commit d792a63 into araffin:master Sep 29, 2025
