The reward on the validation set of the CALX example stays around 0.5 and does not improve. #454

@johnson7788

Description

Why does the reward on the validation set of my CALX example stay around 0.5 and fail to improve? I am using a 0.5B-parameter model. Could the model be too small, or is the reward too sparse? How can I improve it?
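
One way to make the "is the reward too sparse?" part of the question concrete is to look at the distribution of per-sample validation rewards rather than only their mean. The sketch below is a minimal illustration, not part of the CALX example: `summarize_rewards` is a hypothetical helper, the input list is made-up data, and it assumes each reward lies in the range [0, 1].

```python
from statistics import mean, pstdev


def summarize_rewards(rewards, eps=1e-9):
    """Summarize per-sample rewards (hypothetical helper, assumes values in [0, 1])."""
    n = len(rewards)
    if n == 0:
        raise ValueError("rewards must not be empty")
    return {
        "mean": mean(rewards),
        "std": pstdev(rewards),  # population standard deviation
        "frac_zero": sum(r <= eps for r in rewards) / n,     # share of failures
        "frac_one": sum(r >= 1 - eps for r in rewards) / n,  # share of full successes
    }


# Made-up values for illustration only, not results from this issue.
example = [1.0, 0.0, 1.0, 0.0]
stats = summarize_rewards(example)
print(stats)
```

If a mean near 0.5 comes from roughly half zeros and half ones, the reward is effectively binary and the model is solving about half of the validation samples. If most samples instead score close to 0.5 individually, the plateau more likely points at how the reward itself is computed.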

Image

Labels: examples, question
