LZU / CUHKSZ / KAUST
Shenzhen (UTC +08:00)
https://01yzzyu.github.io/
https://scholar.google.com/citations?hl=zh-CN&user=x2VGVvcAAAAJ
Stars
[ICML 2025 Oral] The official repository for the paper "Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark"
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
Awesome Unified Multimodal Models
Official code implementation of the paper: QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation
A collection of awesome image inpainting studies.
[CVPR 2025] Video Narration as Vocabulary & Video as Long Document
Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
[NeurIPS 2025 D&B Oral] Official repository of the paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
Unified Multimodal Model for image generation/editing/understanding
OmniGen2: Exploration to Advanced Multimodal Generation.
Janus-Series: Unified Multimodal Understanding and Generation Models
[ICLR & NeurIPS 2025] Repository for the Show-o series: One Single Transformer to Unify Multimodal Understanding and Generation.
Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics
[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
SGLang is a fast serving framework for large language models and vision language models.
📖 A repository organizing papers, code, and other resources related to unified multimodal models.
[ACL 2025] The Role of Visual Modality in Multimodal Mathematical Reasoning: Challenges and Insights
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
A regularly updated paper list on LLMs reasoning in latent space.
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
MedEvalKit: A Unified Medical Evaluation Framework
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning