Stars
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
[IROS 2025 Award Finalist] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Janus-Series: Unified Multimodal Understanding and Generation Models
[NeurIPS 2025] Improving Video Generation with Human Feedback
A Next-Generation Training Engine Built for Ultra-Large MoE Models
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
FastPillars: A Deployment-friendly Pillar-based 3D Detector
Code for RFSR: Improving ISR Diffusion Models via Reward Feedback Learning
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
A curated list of awesome knowledge-driven autonomous driving (continually updated)
UniMD: Towards Unifying Moment retrieval and temporal action Detection
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A curated list of awesome LLM/VLM/VLA for Autonomous Driving (LLM4AD) resources (continually updated)
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR 2024
✨✨Latest Advances on Multimodal Large Language Models
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation