Popular repositories
-
sudoku_trl_grpo (Public, forked from 828Tina/sudoku_trl_grpo)
GRPO training of a Qwen model with the trl framework, teaching it to solve 4×4 Sudoku puzzles.
Python
-
trl (Public, forked from huggingface/trl)
Train transformer language models with reinforcement learning.
Python