tanh transform instead of clipping of log std in SAC #4

@keraJLi

Description

See here. Interestingly, the log std is not clipped or otherwise bounded in PPO, but it's also independent of the state there. Maybe investigate the differences.
On a different note, SAC.action_dist should also be made private, since returning a distrax distribution from a jitted function did not work the last time I checked.
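
For the first point, a minimal sketch of the two ways to bound the log std, assuming bounds of -20 and 2 (values common in SAC implementations, not taken from this repository). The function names are hypothetical and the code uses only the standard library; in the actual JAX code the same expressions would presumably use `jnp.clip` and `jnp.tanh`.

```python
import math

# Assumed bounds; illustrative only, not read from this repository.
LOG_STD_MIN = -20.0
LOG_STD_MAX = 2.0


def clip_log_std(raw: float) -> float:
    """Hard clipping: the gradient w.r.t. raw is zero outside the bounds."""
    return min(max(raw, LOG_STD_MIN), LOG_STD_MAX)


def tanh_log_std(raw: float) -> float:
    """Squash raw with tanh, then rescale from (-1, 1) to the bounds."""
    half_range = 0.5 * (LOG_STD_MAX - LOG_STD_MIN)
    return LOG_STD_MIN + half_range * (math.tanh(raw) + 1.0)


def tanh_log_std_grad(raw: float) -> float:
    """Analytic derivative of tanh_log_std w.r.t. raw."""
    half_range = 0.5 * (LOG_STD_MAX - LOG_STD_MIN)
    return half_range * (1.0 - math.tanh(raw) ** 2)
```

The difference that matters for training is the gradient: once the raw network output leaves the clipping range, clipping passes no gradient back, whereas the tanh transform keeps a small nonzero gradient for moderate values of `raw` (it still saturates numerically for very large magnitudes).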
