Saoyu99


τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment

Python 411 79 Updated Nov 10, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Python 3,186 328 Updated Oct 11, 2025

Geometric-Mean Policy Optimization

Python 90 8 Updated Nov 13, 2025

[ICLR Workshop 2025] Official source code for the paper "GuardReasoner: Towards Reasoning-based LLM Safeguards".

Python 159 18 Updated May 19, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,603 2,520 Updated Nov 14, 2025

Python 378 30 Updated Oct 16, 2025

Train transformer language models with reinforcement learning.

Python 16,292 2,291 Updated Nov 14, 2025

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 12,340 1,134 Updated Sep 26, 2025

Model Context Protocol Servers

TypeScript 72,608 8,751 Updated Nov 11, 2025

[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

689 33 Updated Oct 20, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,357 1,767 Updated Oct 13, 2025

[TMLR 2025] Efficient Reasoning Models: A Survey

Python 276 18 Updated Oct 30, 2025

Code and Data for Tau-Bench

Python 941 148 Updated Aug 28, 2025

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,928 596 Updated Jul 4, 2025

Unleashing the Power of Reinforcement Learning for Math and Code Reasoners

Python 730 44 Updated Jun 6, 2025

Autonomous Agents (LLMs) research papers. Updated Daily.

1,057 78 Updated Nov 7, 2025

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,628 72 Updated May 11, 2025

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 609 63 Updated Jun 24, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)

Python 8,377 812 Updated Nov 9, 2025

Fully open reproduction of DeepSeek-R1

Python 25,640 2,399 Updated Sep 8, 2025

Structured Outputs

Python 12,892 646 Updated Oct 27, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,842 371 Updated Oct 17, 2025

A framework for few-shot evaluation of language models.

Python 10,618 2,851 Updated Nov 11, 2025

Python 29 5 Updated Nov 11, 2024

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Python 442 22 Updated Oct 16, 2024

Function Vectors in Large Language Models (ICLR 2024)

Python 183 40 Updated Apr 17, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,444 7,561 Updated Nov 13, 2025

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,867 115 Updated Jan 21, 2024

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 3,532 283 Updated May 21, 2025

An Efficient "Factory" to Build Multiple LoRA Adapters

Python 355 64 Updated Feb 13, 2025