MillX2021

Starred repositories

MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models

Python 74 2 Updated Oct 21, 2025

📖 [IEEE Sensors Journal (JSEN)] SuperVINS: A Real-Time Visual-Inertial SLAM Framework for Challenging Imaging Conditions (integrated deep learning features)

C++ 358 37 Updated Jun 8, 2025

Hardware System for Humanoid Robots

81 19 Updated Sep 5, 2025

Matrix is an advanced simulation platform that integrates MuJoCo, Unreal Engine 5, and CARLA to provide high-fidelity, interactive environments for robotics research.

108 8 Updated Oct 28, 2025

[ICCV'25] Unified Open-World Segmentation with Multi-Modal Prompts

7 Updated Oct 10, 2025

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Python 620 38 Updated Oct 15, 2025

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Python 252 11 Updated Oct 27, 2025

3DGS-to-PC: Convert a 3D Gaussian splatting scene into a dense point cloud or basic mesh with advanced customisation options and high-accuracy rendered point colours

Python 599 42 Updated Oct 16, 2025

InteriorGS: 3D Gaussian Splatting Dataset of Semantically Labeled Indoor Scenes

159 6 Updated Aug 4, 2025

R1-like Video-LLM for Temporal Grounding

Python 123 3 Updated Jun 20, 2025

[NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding

Python 56 2 Updated Oct 20, 2025

Fully Open Framework for Democratized Multimodal Training

Python 589 40 Updated Oct 21, 2025

VGGT-X: When VGGT Meets Dense Novel View Synthesis

Python 129 1 Updated Oct 26, 2025

[NeurIPS 2025] PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer

15 2 Updated Oct 2, 2025

A simple state update rule to enhance length generalization for CUT3R

Python 472 11 Updated Oct 1, 2025
Python 32 Updated Oct 17, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 15,654 1,221 Updated Oct 27, 2025

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python 2,099 118 Updated Oct 24, 2025

The SAIL-VL2 series model developed by the BytedanceDouyinContent Group

70 5 Updated Sep 18, 2025

Legged Open-Vocabulary Object Navigator

Python 56 2 Updated Oct 11, 2025

[CoRL 2025] Repository relating to "TrackVLA: Embodied Visual Tracking in the Wild"

Python 252 17 Updated Oct 16, 2025

This is a pipeline to construct HD Semantic Map and HD Vector Map by IPNL.

Jupyter Notebook 49 15 Updated Jan 22, 2025

Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"

Jupyter Notebook 252 20 Updated Aug 28, 2025

A Large-Scale Indoor-Outdoor Robot Dataset for Multi-Sensor Fusion Navigation and Mapping

CMake 145 8 Updated Sep 5, 2025
Python 689 12 Updated Sep 24, 2025
Python 38 1 Updated Mar 24, 2025

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models

Python 445 13 Updated Sep 22, 2025

(Preprint) ORV: 4D Occupancy-centric Robot Video Generation.

Python 68 1 Updated Sep 3, 2025

OpenFace 3.0 – open-source toolkit for facial landmark detection, action unit detection, eye-gaze estimation, and emotion recognition.

Python 89 12 Updated Jun 10, 2025