News: This paper has been accepted to the WACV 2024 4th Workshop on Image/Video/Audio Quality in Computer Vision and Generative AI.
For more details, visit the Project Page.
Diffusion Prism is a training-free framework that efficiently transforms binary masks into realistic and diverse samples while preserving morphological features. We found that a small amount of artificial noise significantly assists the image-denoising process. To demonstrate this mask-to-image concept, we use nano-dendritic patterns as an example and compare our method with existing controllable diffusion models. We also extend the proposed framework to other biological patterns, highlighting its potential applications across various fields.
- First, download the stable-diffusion-v1-5 model file from https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5/resolve/main/v1-5-pruned.ckpt and place it in the diffusion_prism\models\ldm\stable-diffusion-v1\ folder.
- Run mask_diffuser_demo.py as a demo of the 'perlin_mask' method proposed in the paper.
- Go to exp to check the evaluation-related functions, such as random_forest.py. Run other_dataset_eval.py to produce the FID, CLIP Score, and SSIM results. It also generates annotations for the test folder.
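For readers unfamiliar with SSIM, the following is a minimal sketch of the metric in its simplest single-window (global) form. It is not the code in other_dataset_eval.py, which may well use a windowed implementation from a library; the function name `global_ssim` and the `data_range` default are assumptions made for illustration.

```python
def global_ssim(img_a, img_b, data_range=255.0):
    """Single-window SSIM between two grayscale images given as rows of pixel values.

    Simplified illustration: the statistics are computed over the whole image
    rather than over sliding local windows.
    """
    a = [float(p) for row in img_a for p in row]
    b = [float(p) for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and have the same size")

    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n

    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

Identical images score 1, and the score drops as luminance, contrast, or structure diverge, which is why SSIM is a natural check for whether a synthesized image keeps the structure of its reference.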
We will add more details on request. Please contact us anytime if you have questions.
- Dataset: Download from Google Drive
- Training-Free Diffusion Framework: Generates images from binary skeletons without the need for model training or fine-tuning.
- Diverse Backgrounds: Creates images with varied and realistic backgrounds, enhancing model generalizability.
Diffusion Process:
- Combines masks with controllable noise, processed through a Variational Autoencoder (VAE) to generate latent variables.
- The denoising U-Net refines these variables to produce realistic images guided by text prompts.
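The first step above, combining a mask with controllable noise, can be sketched as follows. This is a toy illustration only: it uses plain uniform noise as a stand-in for the Perlin noise of the 'perlin_mask' method, and the function name `blend_mask_with_noise` and the `noise_strength` parameter are hypothetical. The VAE encoding and U-Net denoising stages are not shown.

```python
import random


def blend_mask_with_noise(mask, noise_strength=0.2, seed=None):
    """Blend a binary mask (rows of 0/1 values) with uniform random noise.

    noise_strength in [0, 1] controls how much noise is mixed in:
    0 returns the mask unchanged, 1 returns pure noise.
    Uniform noise is an assumption here, not the paper's Perlin noise.
    """
    if not 0.0 <= noise_strength <= 1.0:
        raise ValueError("noise_strength must be in [0, 1]")

    rng = random.Random(seed)
    blended = []
    for row in mask:
        blended.append([
            (1.0 - noise_strength) * float(px) + noise_strength * rng.random()
            for px in row
        ])
    return blended


# Example: a small cross-shaped skeleton with a light amount of noise.
skeleton = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
noisy = blend_mask_with_noise(skeleton, noise_strength=0.2, seed=0)
```

With a noise strength below 0.5, thresholding the blended image at 0.5 still recovers the original skeleton, which gives a simple intuition for how a small amount of noise can add variation without erasing the morphology.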
- High Quality: Achieves the lowest FID score among the compared methods, indicating more realistic image styles.
- Consistency: Preserves morphology, so the skeleton shape is retained in the synthesized images.
Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion
@article{wang2025diffusion,
  title={Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion},
  author={Wang, Hao and Chen, Xiwen and Bastola, Ashish and Qin, Jiayou and Razi, Abolfazl},
  journal={arXiv preprint arXiv:2501.00944},
  year={2025}
}