Yongliang Wu yongliang-wu

🏠

Working from home

A master's student at Southeast University.

55 followers · 34 following

Southeast University
Shanghai
17:02 (UTC +08:00)
https://yongliang-wu.github.io/
https://scholar.google.com/citations?user=NdE8DZ8AAAAJ

Stars

Optimization-AI / DisCO

Discriminative Constrained Optimization for Reinforcing Large Reasoning Models

Python · 43 stars · 2 forks · Updated Oct 28, 2025

TencentCloudADP / youtu-agent

A simple yet powerful agent framework that delivers with open-source models

Python · 3,687 stars · 359 forks · Updated Oct 29, 2025

Shopee-MUG / MUG-V

MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models

Python · 74 stars · 2 forks · Updated Oct 21, 2025

deepseek-ai / DeepSeek-OCR

Contexts Optical Compression

Python · 18,392 stars · 1,213 forks · Updated Oct 25, 2025

yix8 / VisualPlanning

Visual Planning: Let's Think Only with Images

Python · 280 stars · 9 forks · Updated May 20, 2025

sail-sg / oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python · 545 stars · 45 forks · Updated Oct 21, 2025

Espere-1119-Song / VideoNSA

VideoNSA: Native Sparse Attention Scales Video Understanding

Python · 51 stars · 1 fork · Updated Oct 8, 2025

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python · 1,133 stars · 54 forks · Updated Aug 27, 2025

EvolvingLMMs-Lab / LLaVA-OneVision-1.5

Fully Open Framework for Democratized Multimodal Training

Python · 589 stars · 40 forks · Updated Oct 21, 2025

callsys / GMPO

Geometric-Mean Policy Optimization

Python · 88 stars · 8 forks · Updated Oct 18, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python · 356 stars · 42 forks · Updated Oct 4, 2025

NVlabs / NFT

Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasoning"

Python · 44 stars · 4 forks · Updated Sep 8, 2025

baidubce / Qianfan-VL

Qianfan-VL: Domain-Enhanced Universal Vision-Language Models

163 stars · 11 forks · Updated Sep 22, 2025

TsinghuaC3I / Awesome-RL-for-LRMs

A Survey of Reinforcement Learning for Large Reasoning Models

1,925 stars · 108 forks · Updated Oct 29, 2025

chatanywhere / GPT_API_free

Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT-4 API, with support for gpt | deepseek | claude | gemini | grok and other top-ranked, commonly used large models.

Python · 33,862 stars · 2,416 forks · Updated Oct 10, 2025

TsinghuaC3I / Unify-Post-Training

Towards a Unified View of Large Language Model Post-Training

Python · 170 stars · 8 forks · Updated Sep 8, 2025

11cafe / jaaz

The world's first open-source multimodal creative assistant. A privacy-first alternative to Canva and Manus that can be used locally.

TypeScript · 5,051 stars · 437 forks · Updated Sep 24, 2025

lupantech / MathVista

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Jupyter Notebook · 342 stars · 50 forks · Updated Sep 29, 2025

zwhong714 / PSFT

PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, constraining policy drift to stabilize training and improve generalization.

Python · 27 stars · 1 fork · Updated Sep 9, 2025

aiben-ch / LMM-Evaluation-Survey

Official repo for 'Large Multimodal Models Evaluation: A Survey'

86 stars · 4 forks · Updated Oct 20, 2025

ForJadeForest / verl

Forked from volcengine/verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 1 Updated Sep 5, 2025

VlSomers / awesome-computer-vision-conference-deadline

A curated list of Computer Vision related conferences with dates and paper registration deadlines.

42 stars · 5 forks · Updated Sep 22, 2025

stepfun-ai / NextStep-1

Python · 562 stars · 15 forks · Updated Oct 20, 2025

wuhuaijin / QVAE

Official implementation for the paper "QVAE-Mole: The Quantum VAE with Spherical Latent Variable Learning for 3-D Molecule Generation" (NeurIPS 2024).

Python · 8 stars · 2 forks · Updated Jun 4, 2025

Chengsong-Huang / R-Zero

Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)

Python · 657 stars · 64 forks · Updated Oct 3, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python · 3,905 stars · 294 forks · Updated Oct 24, 2025

Lauorie / DFT

A reproduction of the DFT method without using verl. https://arxiv.org/abs/2508.05629

Python · 16 stars · 1 fork · Updated Oct 14, 2025

scenarios / VAE-2

VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild

Python 3 Updated Jan 28, 2021

LiveBench / LiveBench

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Python · 905 stars · 83 forks · Updated Oct 16, 2025

Shredded-Pork / TempFlow-GRPO

TempFlow-GRPO (Temporal Flow GRPO), a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.

Python · 802 stars · 45 forks · Updated Oct 28, 2025