[Feature] KL Transform for RLHF#1196

Merged
vmoens merged 31 commits into main from kl_transform
May 30, 2023

@vmoens (Collaborator) commented May 26, 2023

@facebook-github-bot added the CLA Signed label May 26, 2023
@vmoens added the enhancement label May 26, 2023
@tcbegley (Contributor) left a comment

LGTM!

actor, keep_params=False, funs_to_decorate=["forward", "get_dist"]
)
self.functional_actor = deepcopy(actor)
repopulate_module(actor, params)
Contributor:

Why do we repopulate here when below we seem to only call with params=self.frozen_params? Is this so that the caller can continue to use actor having supplied it as an argument to the constructor?

Collaborator (Author):

we don't want users to get an actor without parameters, that's all
there wouldn't be much to train :p
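
To make the exchange above concrete, here is a rough sketch of the same idea in plain PyTorch. It is not the code from this PR: it swaps in `torch.func.functional_call` for the tensordict helpers shown in the diff, and the toy `nn.Linear` actor, the `obs` batch and the squared-output loss are all assumptions made for illustration. The point it tries to show is that a frozen parameter snapshot can drive a copied module while the caller's actor keeps its own trainable parameters.

```python
from copy import deepcopy

import torch
from torch import nn
from torch.func import functional_call

torch.manual_seed(0)

# Hypothetical toy actor; the actor in the PR is a TorchRL module.
actor = nn.Linear(4, 2)

# Frozen snapshot of the parameters, detached from any training graph.
frozen_params = {
    name: p.detach().clone() for name, p in actor.named_parameters()
}

# Copy used only as a shell for functional calls with the frozen snapshot.
functional_actor = deepcopy(actor)

# The caller's actor still owns its parameters, so it can be trained as usual.
optimizer = torch.optim.SGD(actor.parameters(), lr=0.1)
obs = torch.randn(8, 4)
loss = actor(obs).pow(2).mean()  # placeholder loss, for illustration only
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The reference output uses the snapshot and is unaffected by the update above.
with torch.no_grad():
    ref_out = functional_call(functional_actor, frozen_params, (obs,))
    new_out = actor(obs)
```

After the optimizer step, `actor` has moved away from the snapshot, while calls made through `frozen_params` still reproduce the original behaviour. That is roughly the split the reply describes: a frozen reference on one side, and an actor that still has something to train on the other.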

vmoens added 2 commits May 30, 2023 15:47
# Conflicts:
#	torchrl/csrc/utils.h
#	torchrl/modules/distributions/continuous.py
@vmoens merged commit e8f5efe into main May 30, 2023
@vmoens deleted the kl_transform branch May 30, 2023 15:53

4 participants