- Southern University of Science and Technology
- Shenzhen, China
Stars
A comprehensive collection of process reward models.
Textbook on reinforcement learning from human feedback
verl: Volcano Engine Reinforcement Learning for LLMs
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Reverse Engineering: Decompiling Binary Code with Large Language Models
LLM notes, including model inference, transformer model structure, and LLM framework code analysis notes.
Official Repo for Open-Reasoner-Zero
Curated list of datasets and tools for post-training.
A summary of existing representative LLM text datasets.
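Welcome to LLM-Dojo, an open-source learning space for large models that uses concise, readable code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.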
This is the repository for the Tool Learning survey.
Fully open reproduction of DeepSeek-R1
Synthetic data curation for post-training and structured data extraction
A powerful tool for creating datasets for LLM fine-tuning, RAG, and Eval
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.