Nankai University - Tianjin, China - https://www.nankai.edu.cn/
Stars
This is the official code (based on the PyTorch framework) for the paper "Open-Det: An Efficient Learning Framework for Open-Ended Detection".
[NeurIPS 2025] Official project page of the paper "PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting"
starVLA: A Lego-like Codebase for Vision-Language-Action Model Development
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
Official implementation of "Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-Language-Action Models"
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
DemoGrasp: Universal Dexterous Grasping from a Single Demonstration
MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Democratizing AI scientists with ToolUniverse
Official implementation of "Visual Instruction Pretraining for Domain-Specific Foundation Models"
Tongyi Deep Research, a leading open-source deep research agent
InternVLA-A1: Unifying Understanding, Generation, and Action for Robotic Manipulation
Fully Open Framework for Democratized Multimodal Training
WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
Building General-Purpose Robots Based on Embodied Foundation Model
[NeurIPS 2025] CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
verl: Volcano Engine Reinforcement Learning for LLMs
[TPAMI2025] Improving Generalized Visual Grounding with Instance-aware Joint Learning
The code implementation for the paper "DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation".
A Comprehensive Survey on Continual Learning in Generative Models.