mmdit

Here are 2 public repositories matching this topic...

[ICML 2026] ByteDance's All-in-One Video Generation Model for Human-Object Interaction Video Generation

PyTorch implementation of the paper "M3-TTS: Multi-modal DiT Alignment & Mel-latent for Zero-shot High-fidelity Speech Synthesis"

tts flow-matching mmdit
