Lists (28)
3D LLM: 3D LLM stuff
3D Reconstruction
Agent
AIGC-2D: text to image, text to video, etc.
AutoRig
Blender: Blender tools, tutorials
Dataset: dataset
DifferentiableRendering: Differentiable rendering, for texture baking, mesh decimation, etc.
DigitalHuman3D: 3D Digital Human
Foundation Models: Vision foundation models, building blocks
FunnyThings: interesting things, learning
Img2Img: Image to Image Generation | Translation
LLM/VLM: Pure-text large language models or visual large language models
Maya
Mesh Decimation & Processing: Mesh decimation (edge-collapse or remesh-based methods)
Mesh Segmentation: algorithms for mesh segmentation
MeshGen & TexGen: Autoregressive mesh generation | Texture Generation
MISC 3DV
MotionCap: WholeBody/Body/Hand/Head 2D Pose Estimation / 3D Pose Estimation / Motion Capture from RGB image / video
Multi-modal 3D Shape Retrieval
NVS
PI
Primitive Fitting, 3D Assembly: Fit a complicated 3D model (mesh / point cloud) with several predefined 3D primitives
Survey
TalkingHead
Text/Image to 3D: Text to 3D / Image to 3D Generation models | SDS | LRM | SOTA
Utilities: Libraries, frameworks, useful tools
VirtualAvatar
Stars
[SIGGRAPH Asia 2025] WorldExplorer: Towards Generating Fully Navigable 3D Scenes
🐧 A complete Clash / Mihomo (Clash Meta) proxy and management panel for Linux
A simple 3D asset retrieval system based on Objaverse. Query any 3D asset using text (CN/EN) or images, inter-modal or cross-modal. Equipped with a backend API service and a front-end Gradio demo.
Miro: Conversational and editable 3D asset generation from text and images
Turn detection for full-duplex dialogue communication
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Maya-ACE: A Reference Client Implementation for NVIDIA ACE Audio2Face Service
Orient Anything V2, NeurIPS 2025 Spotlight
Harness the power of NVIDIA technologies and LangChain to create dynamic avatars from live speech, integrating RIVA ASR and TTS with Audio2Face for real-time, expressive digital interactions.
FantasyWorld: Geometry-Consistent World Modeling via Unified Video and 3D Prediction
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Unity MCP acts as a bridge, allowing AI assistants (like Claude, Cursor) to interact directly with your Unity Editor via a local MCP (Model Context Protocol) Client. Give your LLM tools to manage a…
Official repo of "MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing"
TabletopGen: Instance-Level Interactive 3D Tabletop Scene Generation from Text or Single Image
HY-Motion model for 3D character animation generation.
FishWoWater / TRELLIS.2
Forked from microsoft/TRELLIS.2
Native and Compact Structured Latents for 3D Generation. Adds API, Docker, and Cog (Replicate) support.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
Enable AI assistant clients like Cursor, Windsurf and Claude Desktop to control Unreal Engine through natural language using the Model Context Protocol (MCP).
Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Su…
Sharp Monocular View Synthesis in Less Than a Second