Comprehensive toolkit for Reinforcement Learning from Human Feedback (RLHF) training, featuring instruction fine-tuning, reward model training, and support for PPO and DPO algorithms with various c…

Python 174 19 Updated Mar 18, 2024

UCSC-VLAA / MedReason

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Python 231 19 Updated Jun 19, 2025

ImprintLab / Medical-Graph-RAG

A Graph RAG System for Evidenced-based Medical Information Retrieval [ACL 2025]

Python 621 105 Updated Oct 18, 2025

RManLuo / Awesome-LLM-KG

Awesome papers about unifying LLMs and KGs

2,491 172 Updated May 2, 2025

SNOWTEAM2023 / MedRAG

Python 220 38 Updated Sep 19, 2025

Graph-RAG / GraphRAG

455 21 Updated Mar 30, 2025

RManLuo / graph-constrained-reasoning

Official Implementation of ICML 2025 Paper: "Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models".

Python 167 14 Updated May 20, 2025

luhengshiwo / LLMForEverybody

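LLM knowledge sharing that everyone can understand; a must-read before LLM interviews in the spring/fall campus recruiting seasons, so you can talk confidently with your interviewer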

Jupyter Notebook 4,583 446 Updated Oct 13, 2025

theworldofagents / Agentic-Reasoning

free and open OpenAI Deep Research

Python 685 90 Updated Feb 18, 2025

lucidrains / PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,869 682 Updated Oct 11, 2025

SilenceOverflow / Awesome-SLAM

A curated list of SLAM resources

1,027 157 Updated Oct 13, 2023

MCG-NJU / BIVDiff

[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Python 75 2 Updated Sep 11, 2024

Chia-Hsuan Hsu tongyu0924

Stars

pmc-patients / pmc-patients

NVlabs / DiffusionNFT

neverbiasu / Awesome-Portraits-Style-Transfer

zai-org / GLM-4.5

medmcqa / medmcqa

jind11 / MedQA

QwenLM / Qwen-Image

volcengine / verl

thinkwee / AgentsMeetRL

eth-medical-ai-lab / Med-PRM

unslothai / unsloth

MIT-LCP / mimic-code

Vonng / ddia

pubmedqa / pubmedqa

HKUDS / LightRAG

eric-mitchell / direct-preference-optimization

wshi83 / MedAgentGym

JasonHonKL / spy-search

raghavc / LLM-RLHF-Tuning-with-PPO-and-DPO