StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing

Python 930 82 Updated Jan 22, 2026

Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models

Python 131 5 Updated Jan 4, 2026
Python 139 11 Updated Jul 8, 2025

Open-source unified multimodal model

Python 5,595 491 Updated Oct 27, 2025

[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Python 1,310 78 Updated Jan 6, 2026

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,593 499 Updated Jan 27, 2026

A curated list of recent robot learning papers incorporating diffusion models for robotics tasks.

303 8 Updated Jun 13, 2025

Official repository for "iVideoGPT: Interactive VideoGPTs are Scalable World Models" (NeurIPS 2024), https://arxiv.org/abs/2405.15223

Python 163 17 Updated Sep 23, 2025

CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making

Jupyter Notebook 689 66 Updated Apr 20, 2025

Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence

Python 1,363 93 Updated Jan 31, 2025

This repository implements a Best-of-N (BoN) strategy for inference-aware fine-tuning of large language models. The system supports multiple leading LLM providers and includes comprehensive testing…

Python 3 Updated Dec 26, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 28,452 2,875 Updated Apr 30, 2025

Scalable and memory-optimized training of diffusion models

Python 1,327 142 Updated Jun 4, 2025
Python 425 21 Updated Nov 29, 2025

Evaluating FSD on SimplerEnv

Jupyter Notebook 8 Updated Jul 10, 2025

Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo, and OpenVLA) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)

Jupyter Notebook 261 43 Updated Jun 23, 2025

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,644 1,177 Updated Nov 21, 2025

Official implementation of our paper: "Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing" (ICML 2025)

Python 77 1 Updated May 22, 2025

📚 Collection of awesome generation acceleration resources.

386 13 Updated Jul 7, 2025

FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling.

Python 52 2 Updated Jul 8, 2024

Inference script for Oasis 500M

Python 2,028 170 Updated Nov 8, 2024

RoboTwin 2.0 Official Repo

Python 1,872 257 Updated Jan 24, 2026

High-speed Large Language Model Serving for Local Deployment

C++ 8,603 479 Updated Jan 24, 2026

world modeling challenge for humanoid robots

Python 546 47 Updated Nov 8, 2024

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Python 1,080 140 Updated Dec 18, 2025

Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion

Python 301 11 Updated May 17, 2025

Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers (ICCV 2023)

Python 17 Updated Sep 30, 2023
Python 287 33 Updated Aug 17, 2025