  • Central South University


[ICLR'25] Reconstructive Visual Instruction Tuning

Python 122 6 Updated Apr 9, 2025

This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reas…

Python 720 19 Updated Sep 10, 2025

Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”

Python 5 Updated Oct 17, 2025

Integration of IPython pdb

Python 1,944 150 Updated Jul 28, 2025

Training Large Language Model to Reason in a Continuous Latent Space

Python 1,313 135 Updated Aug 12, 2025

The official implementation of "Grounded Chain-of-Thought for Multimodal Large Language Models"

Python 18 1 Updated Jul 21, 2025

[NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning".

143 1 Updated Sep 12, 2025

[NeurIPS 2025🔥] Main source code of the SRPO framework.

Python 176 18 Updated Sep 21, 2025

[NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains

Python 56 4 Updated Jul 29, 2025

The official implementation of "Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs"

Python 187 7 Updated Oct 9, 2025

Python 7 1 Updated Oct 21, 2025

ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom

Python 1 Updated Mar 27, 2025

Code for "CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models"

Python 19 Updated Mar 10, 2025

Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning

Python 103 2 Updated Oct 16, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,230 405 Updated Oct 23, 2025

Official codebase for the paper Latent Visual Reasoning

Python 27 Updated Oct 22, 2025

Code for Heima

Python 56 4 Updated Apr 21, 2025

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

Python 75 2 Updated Jul 27, 2025

An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"

Python 94 3 Updated Sep 28, 2025

Pixel-Level Reasoning Model trained with RL [NeurIPS25]

Python 243 9 Updated Sep 10, 2025

MCOUT: Multimodal Chain of Continuous Thought for Latent Reasoning

Python 9 1 Updated Oct 4, 2025

The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]

Python 161 5 Updated Jun 5, 2025

Python 891 53 Updated Oct 20, 2025

Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)

Python 185 13 Updated Aug 2, 2025

🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code execution & editing

Python 30 1 Updated Oct 20, 2025

[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression

Python 115 5 Updated Apr 12, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,895 294 Updated Oct 24, 2025

Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"

Python 94 5 Updated Aug 26, 2025

The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"

Python 229 15 Updated Oct 18, 2025