Stars
PMC-Patients: A Large-scale Dataset of Patient Summaries and Relations for Benchmarking Retrieval-based Clinical Decision Support Systems.
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
A curated collection of papers on portrait style transfer
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
A large-scale (194k) Multiple-Choice Question Answering (MCQA) dataset designed to address real-world medical entrance exam questions.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
verl: Volcano Engine Reinforcement Learning for LLMs
An Awesome List of Agentic Models Trained with Reinforcement Learning
[EMNLP 2025] Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
Chinese translation of "Designing Data-Intensive Applications" (DDIA), first and second editions
PubMedQA: A Dataset for Biomedical Research Question Answering
[EMNLP 2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
Reference implementation for DPO (Direct Preference Optimization)
[arxiv'25] MedAgentGYM: Training LLM Agents for Code-Based Medical Reasoning at Scale
LLM search engine faster than perplexity!
Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c…
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
A Graph RAG System for Evidence-based Medical Information Retrieval [ACL 2025]
Official Implementation of ICML 2025 Paper: "Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models".
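LLM knowledge sharing that everyone can understand: a must-read before spring/autumn campus recruitment interviews for LLM roles, so you can talk confidently with your interviewer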
free and open OpenAI Deep Research
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models