Compared with Siamese-Diffusion, the proposed approach slightly increases training memory usage due to the joint training of a teacher model, which provides stronger prior guidance for the student model, ultimately accelerating its convergence.
Even without using the plug-and-play Online-Augmentation module from Siamese-Diffusion, our method achieves competitive performance on Polyp-PVT.
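The teacher–student coupling described above can be illustrated with a minimal sketch (all names here are illustrative, not the repository's actual API): the student's noise prediction is supervised by the usual diffusion target and additionally pulled toward the jointly trained teacher's prediction, which acts as the stronger prior.

```python
import torch
import torch.nn.functional as F

def distilled_loss(student_eps, teacher_eps, target_eps, distill_weight=0.5):
    """Combine the standard diffusion objective with a distillation term.

    student_eps: noise predicted by the student (ControlNet-conditioned) model
    teacher_eps: noise predicted by the jointly trained teacher model
    target_eps:  ground-truth noise added during the forward process
    """
    # Standard denoising loss against the true noise.
    diffusion_loss = F.mse_loss(student_eps, target_eps)
    # Distillation term: the teacher's prediction serves as a prior for the
    # student. The teacher is detached so no gradients flow into it here.
    distill_loss = F.mse_loss(student_eps, teacher_eps.detach())
    return diffusion_loss + distill_weight * distill_loss

# Toy usage with random tensors shaped like latent noise maps.
eps_true = torch.randn(2, 4, 8, 8)
eps_student = torch.randn(2, 4, 8, 8)
eps_teacher = torch.randn(2, 4, 8, 8)
loss = distilled_loss(eps_student, eps_teacher, eps_true)
```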
The usual installation steps are as follows; they set up the correct CUDA version and all required Python packages:

```shell
conda create -n Siamese-Diffusion python=3.10
conda activate Siamese-Diffusion
conda install pytorch==2.4.0 torchvision==0.19.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
pip install deepspeed
```

We evaluated our method on public datasets: Polyps (as provided by the PraNet project) and Kidney Tumor. Organize each dataset as follows:
```
data
├── images
├── masks
└── prompt.json
```

💡 Note: The core contribution code has been integrated into cldm.py. Corresponding improvements to the model architecture should be made in ldm/modules/diffusionmodules/openaimodel.py, and the configuration must be updated in models/cldm_v15.yaml.
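The exact schema of prompt.json is not spelled out here; ControlNet's tutorial data uses one JSON object per line with source/target/prompt fields, so a plausible sketch for the layout above (the field names and file names are assumptions, not this repository's guaranteed format) is:

```python
import json

# One JSON object per line, ControlNet-style; field names are assumptions:
# "source" points at the conditioning mask, "target" at the real image.
records = [
    {"source": "masks/0001.png", "target": "images/0001.png",
     "prompt": "an endoscopic image of a polyp"},
    {"source": "masks/0002.png", "target": "images/0002.png",
     "prompt": "an endoscopic image of a polyp"},
]

with open("prompt.json", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Reading it back line by line, as a training data loader typically would.
with open("prompt.json") as f:
    loaded = [json.loads(line) for line in f]
```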
A new initialization scheme has been implemented in tool_add_control.py to accommodate changes in the model architecture and generate control_sd15.ckpt.
Inspired by Siamese-Diffusion, the DHI module is integrated as a default component.
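The re-initialization in tool_add_control.py follows ControlNet's zero-convolution idea: newly added control layers start at zero so the pretrained backbone is undisturbed at the first training step. A minimal sketch of that idea (the helper name is hypothetical, not the actual tool_add_control.py code):

```python
import torch
import torch.nn as nn

def make_zero_conv(channels):
    """A 1x1 convolution whose weight and bias start at zero, so the
    control branch initially contributes nothing to the frozen backbone."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

zero_conv = make_zero_conv(4)
x = torch.randn(1, 4, 8, 8)
out = zero_conv(x)
# At initialization the output is exactly zero, regardless of the input,
# so adding it to a backbone feature map leaves that feature map unchanged.
```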
Here are example commands for training:
```shell
# Initialize ControlNet with the pretrained UNet encoder weights from Stable Diffusion,
# then merge them with Stable Diffusion weights and save as: control_sd15.ckpt
python tool_add_control.py

# For multi-GPU setups, ZeRO-2 can be used to train Siamese-Diffusion
# to reduce memory consumption.
python tutorial_train.py
```

Here are example commands for sampling:
```shell
# ZeRO-2 distributed weights are saved under the folder:
# lightning_logs/version_#/checkpoints/epoch/
# Run the following commands to merge the weights:
python zero_to_fp32.py . pytorch_model.bin
python tool_merge_control.py

# Sampling
python tutorial_inference.py
```

This repository is developed based on ControlNet and Siamese-Diffusion. It further integrates several state-of-the-art segmentation models, including nnUNet, SANet, and Polyp-PVT.
If you find our work useful in your research, or if you use parts of this code, please consider citing our papers:
```bibtex
@article{qiu2025adaptively,
  title={Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis},
  author={Qiu, Kunpeng and Zhou, Zhiying and Guo, Yongxin},
  journal={arXiv preprint arXiv:2507.23652},
  year={2025}
}

@article{qiu2025noise,
  title={Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation},
  author={Qiu, Kunpeng and Gao, Zhiqiang and Zhou, Zhiying and Sun, Mingjie and Guo, Yongxin},
  journal={arXiv preprint arXiv:2505.06068},
  year={2025}
}
```