ShareableXue

Shareable (ShareableXue)

3 followers · 3 following

Popular repositories

sudoku_trl_grpo (Public)

Forked from 828Tina/sudoku_trl_grpo

GRPO training of a Qwen model with the trl framework, targeting the 4*4 Sudoku game task.

Python

trl (Public)

Forked from huggingface/trl

Train transformer language models with reinforcement learning.

Python