Stars
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
A curated list of awesome HD map construction methods
Official implementation for "JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation"
InternRobotics' open platform for building generalized navigation foundation models.
[NeurIPS 2025] CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
A curated list of large VLM-based VLA models for robotic manipulation.
missTL / SeqGrowGraph
Forked from MIV-XJTU/SeqGrowGraph
SeqGrowGraph: Learning Lane Topology as a Chain of Graph Expansions
Code for Streaming 4D Visual Geometry Transformer
[RSS'25] This repository is the implementation of "NaVILA: Legged Robot Vision-Language-Action Model for Navigation"
Code for the paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
[RSS 2024 & RSS 2025] VLN-CE evaluation code of NaVid and Uni-NaVid
[ICCV 2025] Official implementation for "SeqGrowGraph: Learning Lane Topology as a Chain of Graph Expansions"
Vision-and-Language Navigation in Continuous Environments using Habitat
[RSS 2025] Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks.
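📚 This repository collects arXiv papers on VLN, VLA, SLAM, Gaussian Splatting, nonlinear optimization, and related topics. It is updated automatically every day! The issues section lists the 10 latest papers.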
Zhaoyibinn / vggt
Forked from facebookresearch/vggt
[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer
An app for collecting raw RGB-D scans on iOS devices.
Application for camera and sensor data logging (iOS)
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
missTL / FSDrive
Forked from MIV-XJTU/FSDrive
The repository has been moved to https://github.com/MIV-XJTU/FSDrive
[NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving"
We’re looking forward to models based on DINOv3. Rankings include: BetterDepth BRIDGE BriGeS ChronoDepth Depth Any Video Depth Anything Depth Pro DepthCrafter Distill Any Depth FE2E GRIN M2SVid MAS…
Python tools for rendering, viewing and generating metric 3D depth videos. Tools for recovering and exporting camera pose and 3D geometry to popular formats as well as tools for projecting depthvid…
A collection of papers on World Models for Autonomous Driving (and Robotics).
Summary of LLM for Autonomous Driving papers (continuously updated)