r1

Here are 28 public repositories matching this topic...

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

benchmark mcts rl reasoning r1 prm o3 o1 slow-fast system-2 self-improve macro-action

Updated Jun 8, 2025
Python

RUC-NLPIR / Search-o1

🔍 Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025]

math livecode amc reasoning r1 rag qwq aimo o1 gpqa

Updated Aug 21, 2025
Python

turningpoint-ai / VisualThinker-R1-Zero

Explore the Multimodal “Aha Moment” on 2B Model

reinforcement-learning reasoning r1 post-training multimodal deepseek deepseek-r1 grpo deepseek-r1-zero r1-zero multimodal-journey multimodal-r1

Updated Mar 18, 2025
Python

jingyi0000 / R1-VL

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

reinforcement-learning reasoning r1 mllm vision-language-model multimodal-large-language-models

Updated Oct 21, 2025
Python

modelscope / awesome-deep-reasoning

Collect every awesome work about r1!

collection rl reasoning r1 o1 qwen deepseek grpo

Updated May 2, 2025
Python

RyanLiu112 / compute-optimal-tts

Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".

r1 o1 large-language-model process-reward-model test-time-scaling

Updated Feb 19, 2025
Python

Zeyi-Lin / Qwen3-Medical-SFT

Qwen3 Fine-tuning: Medical R1 Style Chat

r1 fine-tuning sft qwen3

Updated May 31, 2025
Python

ritzz-ai / GUI-R1

Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents

deep-reinforcement-learning r1 multimodal o1 multimodal-large-language-models large-multimodal-models gui-agent grpo mllm-reasoning

Updated May 5, 2025
Python

SmallDoges / small-doge

Doge Family of Small Language Models

python nlp natural-language-processing reinforcement-learning deep-learning pytorch transformer chinese webui attention-mechanism r1 attention-is-all-you-need mechine-learning foundation-models small-language-models dynamic-mask-attention cross-domain-mixture-of-experts deepseek-r1

Updated Aug 13, 2025
Python

sun-hailong / TVC

[ACL 2025] The code repository for "Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning" in PyTorch.

reasoning r1 cot forgetting mllms multimodel-large-language-model

Updated May 16, 2025
Python

CJReinforce / PURE

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

reinforcement-learning mathematics rl reasoning r1 o1 llm reinforcement-finetuning

Updated Oct 23, 2025
Python

lll6gg / UI-R1

Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"

reinforcement-learning r1 multimodal-learning multimodal-large-language-models gui-agent efficient-reasoning

Updated May 26, 2025
Python

LLM360 / Reasoning360

A repo for open research on building large reasoning models

rl reasoning r1 llm qwen

Updated Oct 22, 2025
Python

RyanLiu112 / GenPRM

Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".

r1 o1 large-language-model process-reward-model test-time-scaling

Updated Jun 4, 2025
Python

tyler-romero / microR1

Simple repository for training small reasoning models

reasoning r1 deepseek grpo

Updated Feb 6, 2025
Python

sylvain-wei / 24-Game-Reasoning

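A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the 24 Game as an example. It applies zero-RL, SFT, and SFT+RL to elicit the LLM's ability to verify and reflect on its own reasoning. Clean, minimal, accessible reproduction of DeepSeek R1-Zero and DeepSeek R1.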

alignment reasoning r1 post-training cot sft o1 24game llm rlhf deepseek r1-zero verl long-cot

Updated Apr 5, 2025
Python

lachlancresswell / AutoR1

Auto-generate fallback and meter display from existing group info in d&b audiotechnik's R1 and ArrayCalc software.

r1 dbaudio dbaudiotechnik arraycalc

Updated May 4, 2025
Python

The-Swarm-Corporation / AgentGym

A framework that makes it effortless to convert any LLM into a reasoning agent like o1 or DeepSeek's R1

ai rl agents alibaba r1 o1 llms qwen deepseek

Updated Oct 13, 2025
Python

sdiehl / tiny-r1

Recreating the minimal training methods of DeepSeek-R1 for small language models.

reasoning r1 grpo grpotrainer

Updated Feb 10, 2025
Python

BY571 / DistRL-LLM

Distributed Reinforcement Learning for LLM Fine-Tuning with multi-GPU utilization

reinforcement-learning pg r1 multi-gpu-training multi-gpu-inference llm llm-training llm-finetuning llm-fine-tuning grpo reinforcement-learning-fine-tuning

Updated Mar 12, 2025
Python
