Stars
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
Making large AI models cheaper, faster and more accessible
MiniSora: A community that aims to explore the implementation path and future development direction of Sora.
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
PaddlePaddle Code Convert Toolkit. A deep learning code conversion tool for PaddlePaddle (飞桨).
EVA Series: Visual Representation Fantasies from BAAI
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
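A collection of AI painting resources (including platforms available in China and abroad, usage tutorials, parameter tutorials, deployment tutorials, industry news, and more): Stable Diffusion, AnimateDiff, Stable Cascade, Stable SDXL Turbo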
Fast Example-based Image Synthesis and Style Transfer
AUTOMATIC1111 UI extension for creating videos using img2img and ebsynth.
A PyTorch implementation of Dream Fields with modifications.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
Official repo for consistency models.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Stable Diffusion web UI
A latent text-to-image diffusion model
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
PaddleSlim is an open-source library for deep model compression and architecture search.
Object detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple-object tracking, and real-time multi-person keypoint detection.
PaddleFormers is an easy-to-use model zoo of pre-trained large language models based on PaddlePaddle.
A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D object detection models.