BaiRiDreamer
  • Southern University of Science and Technology
  • Shenzhen, China


A comprehensive collection of process reward models.

134 3 Updated Oct 4, 2025

AllenAI's post-training codebase

Python 3,548 489 Updated Jan 24, 2026

Textbook on reinforcement learning from human feedback

TeX 1,425 127 Updated Jan 24, 2026

Democratizing Reinforcement Learning for LLMs

Python 5,025 489 Updated Jan 23, 2026

Clash downloads | Ongoing archive of Clash releases | Updated January 2026

915 44 Updated Jan 16, 2026

Updated January 2026: a shared collection of tools for unthrottled Baidu Netdisk (Baidu Cloud) downloads.

JavaScript 1,325 66 Updated Dec 13, 2025

Course-grabbing script for the SUSTech TIS course selection system: simple to use, stable, and efficient.

Python 224 33 Updated Sep 2, 2025

The real story of the hardship and darkness behind the development of the Noah Pangu large model.

11,391 1,343 Updated Jul 9, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,644 3,089 Updated Jan 23, 2026

GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

Python 831 67 Updated Jan 21, 2026

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 66,355 8,071 Updated Jan 20, 2026

Multimodal MM + Chat collection

Python 281 22 Updated Aug 19, 2025
Python 129 9 Updated Jun 6, 2025

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Python 91 13 Updated Dec 3, 2024

🎥 Command line media player

C 33,708 3,205 Updated Jan 23, 2026

Foundations of Large Models: the fundamentals of large models in one article

6,623 555 Updated Dec 18, 2025

Reverse Engineering: Decompiling Binary Code with Large Language Models

Python 6,265 441 Updated Dec 5, 2025

LLM notes covering model inference, Transformer model structure, and LLM framework code analysis.

Python 859 88 Updated Dec 10, 2025

Official Repo for Open-Reasoner-Zero

Python 2,084 118 Updated Jun 2, 2025

Curated list of datasets and tools for post-training.

4,190 346 Updated Nov 10, 2025

A summary of existing representative LLM text datasets.

1,426 140 Updated Oct 11, 2025

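Welcome to LLM-Dojo, an open-source place to learn about large models. It uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩‍🎓👨‍🎓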

Python 924 86 Updated Dec 1, 2025

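AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.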

Jupyter Notebook 16,165 2,318 Updated Sep 3, 2025

This is the repository for the Tool Learning survey.

476 15 Updated Aug 9, 2025

Fully open reproduction of DeepSeek-R1

Python 25,838 2,410 Updated Nov 24, 2025

Synthetic data curation for post-training and structured data extraction

Python 1,609 131 Updated Jan 24, 2026

A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval

JavaScript 12,927 1,271 Updated Jan 18, 2026

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 41,381 4,694 Updated Jan 23, 2026

DataComp for Language Models

HTML 1,409 129 Updated Sep 9, 2025