- Princeton University
- https://hubertstrauss.com
Pinned
- princeton-pli/what-makes-good-rm (Public): [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
- princeton-pli/imperfect-rewards (Public): Code for reproducing experiments from the paper "When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient"
Python 2