Stars
[MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."
Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
Official code for the paper "N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models"
EO: Open-source Unified Embodied Foundation Model Series
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
Official implementation of the paper "Transfer between Modalities with MetaQueries"
Code for the paper "Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors"
Official implementation of DepthLM
Official implementation of "C3G: Learning Compact 3D Representations with 2K Gaussians"
NEO Series: Native Vision-Language Models from First Principles
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
Cambrian-S: Towards Spatial Supersensing in Video
HunyuanVideo-1.5: A leading lightweight video generation model
[NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D
PyTorch implementation of JiT (https://arxiv.org/abs/2511.13720)
Wan: Open and Advanced Large-Scale Video Generative Models
[Awesome-Spatial-VLMs] The official, community-maintained resource for the survey paper "Spatial Intelligence in Vision-Language Models: A Comprehensive Survey."
Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos
A high-throughput and memory-efficient inference and serving engine for LLMs
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
[ICCV 2025] SuperDec: 3D Scene Decomposition with Superquadric Primitives