- Zhejiang University
- California
- horizonwind2004.github.io
- @HorizonWind2004
Stars
A Curated Collection of Frontier Language Model Architectures
UniVideo: Unified Understanding, Generation, and Editing for Videos
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
Code release for "SegLLM: Multi-round Reasoning Segmentation"
The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.
HunyuanVideo-1.5: A leading lightweight video generation model
[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
VideoCoF: Unified Video Editing with Temporal Reasoner
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / U…
⚡ Dynamically generated stats for your github readmes
[NeurIPS 2025 Spotlight] Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
Official inference repo for FLUX.2 models
Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
Code release for "UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity"
The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
Echo: "Constantly Improving Image Models Need Constantly Improving Benchmarks"
Official implementation of "UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing"
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-effective, self-iterative optimization loop.
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
(NeurIPS 2025 D&B Track) OverLayBench: A Benchmark for Layout-to-Image Generation with Dense Overlaps