# A Universal Scale-Adaptive Deformable Transformer for Image Restoration across Diverse Artifacts (CVPR 2025)
**Abstract:** Structured artifacts are semi-regular, repetitive patterns that closely intertwine with genuine image content, making their removal highly challenging. In this paper, we introduce the Scale-Adaptive Deformable Transformer, a network architecture specifically designed to eliminate such artifacts from images. The proposed network features two key components: a scale-enhanced deformable convolution module for modeling scale-varying patterns with abundant orientations and potential distortions, and a scale-adaptive deformable attention mechanism for capturing long-range relationships among repetitive patterns with different sizes and non-uniform spatial distributions. Extensive experiments show that our network consistently outperforms state-of-the-art methods in diverse artifact removal tasks, including image deraining, image demoireing, and image debanding.
## Requirements

The code has been tested on 2 NVIDIA RTX 4090 GPUs.

```
torch==2.0.0+cu118
timm==0.9.16
einops
```
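A possible way to install these dependencies (a sketch, assuming the standard PyTorch cu118 wheel index; adapt to your environment):

```bash
# install PyTorch 2.0.0 built against CUDA 11.8, then the remaining packages
pip install torch==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118
pip install timm==0.9.16 einops
```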
## Install

The Scale-Adaptive Deformable Attention (SADA) module is currently implemented in part as a CUDA extension:

- The sub-module for the deformable sampling operation.
- Compiled successfully on Ubuntu 23.04, CUDA 11.8, GCC 11.4.0, Python 3.9.18 (conda).
Build and install the extension:

```bash
$ CUDA_HOME=your_cuda_path python3 setup.py build develop
```

An alternative version of the SADA module, fully implemented in PyTorch and faster than the CUDA version, is located at the project root as `deformAttention_torch.py`. You can try it by copying it into the corresponding model directory and replacing `from .deformAttention import SADAttentionBlock, PConv` with `from .deformAttention_torch import SADAttentionBlock, PConv` in the `SADT_archs.py` file, as shown below.
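Concretely, the one-line import switch in `SADT_archs.py` looks like this:

```diff
- from .deformAttention import SADAttentionBlock, PConv
+ from .deformAttention_torch import SADAttentionBlock, PConv
```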
The current PyTorch version is implemented with the `torch.nn.functional.grid_sample` function. While this function is well suited to flow-based methods, which typically sample a single offset per location, it is not an ideal fit for deformable methods, which inherently require sampling multiple points (e.g., from multiple offset predictions) simultaneously at each position. This architectural mismatch means the current implementation, though functional, may not fully exploit the potential efficiency or expressiveness of deformable attention. I am keen to explore more native or performant implementations; if you have insights or code improvements, feel free to reach out. Discussions and contributions are welcome!
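As an illustration of the usual workaround, here is a minimal sketch (not the repository's implementation; the function name, tensor layout, and normalization convention are assumptions) that folds the K sampling points into the batch dimension so a single `grid_sample` call evaluates all of them:

```python
import torch
import torch.nn.functional as F

def deformable_sample(x, offsets):
    """Sample K offset points per location with one grid_sample call.

    x:       (B, C, H, W) feature map
    offsets: (B, K, 2, H, W) per-location offsets in pixels, ordered (dx, dy)
    returns: (B, K, C, H, W) features sampled at each offset point
    """
    B, C, H, W = x.shape
    K = offsets.shape[1]

    # base sampling grid in pixel coordinates
    ys, xs = torch.meshgrid(
        torch.arange(H, device=x.device, dtype=x.dtype),
        torch.arange(W, device=x.device, dtype=x.dtype),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=0)                 # (2, H, W)

    # add offsets, then normalize to grid_sample's [-1, 1] coordinate range
    pos = base.view(1, 1, 2, H, W) + offsets            # (B, K, 2, H, W)
    pos_x = 2.0 * pos[:, :, 0] / max(W - 1, 1) - 1.0
    pos_y = 2.0 * pos[:, :, 1] / max(H - 1, 1) - 1.0
    grid = torch.stack((pos_x, pos_y), dim=-1)          # (B, K, H, W, 2)

    # fold K into the batch dimension: one grid_sample call for all K points
    x_rep = x.unsqueeze(1).expand(B, K, C, H, W).reshape(B * K, C, H, W)
    grid = grid.reshape(B * K, H, W, 2)
    out = F.grid_sample(x_rep, grid, mode="bilinear", align_corners=True)
    return out.view(B, K, C, H, W)
```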
## Datasets

- Demoireing
  - TIP2018: https://huggingface.co/datasets/zxbsmk/TIP-2018
  - FHDMi: https://drive.google.com/drive/folders/1IJSeBXepXFpNAvL5OyZ2Y1yu4KPvDxN5?usp=sharing or https://pan.baidu.com/s/19LTN7unSBAftSpNVs8x9ZQ (extraction code: jf2d)
  - LCDMoire: https://competitions.codalab.org/competitions/20166 (the download link is currently unavailable.)
- Deraining: https://stevewongv.github.io/derain-project.html (the download link is currently unavailable.)
## Pre-trained Models

| Task | Demoireing | Demoireing | Demoireing | Deraining | Debanding |
|---|---|---|---|---|---|
| Dataset | FHDMi | TIP18 | LCDMoire | SPAD | DID |
| Download Link | Download | Download | Download | Download | Download |
## Testing

### Demoireing

- Go into the sub-folder:

  ```bash
  $ cd Demoireing
  ```

- Download the corresponding pre-trained model and put it into the `out_dir/xxx/exp_light/net_checkpoints` folder.
- `cd model/DSv2` and compile the CUDA extension into a Python-compatible module.
- `cd ../../`
- Download the dataset, open the configuration file `config/xxx_config.yaml`, and set `TRAIN_DATASET` and `TEST_DATASET` to your own data paths (see the sketch after these steps).
- Run the test code:

  ```bash
  $ python test.py --config config/xxx_config.yaml
  ```
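For reference, a minimal illustration of the two dataset fields in `config/xxx_config.yaml` (only the key names `TRAIN_DATASET` and `TEST_DATASET` come from the step above; the example paths are placeholders):

```yaml
# illustrative fragment only: key names are from the step above,
# the paths are placeholders for your local dataset location
TRAIN_DATASET: /path/to/FHDMi/train
TEST_DATASET: /path/to/FHDMi/test
```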
### Deraining

- Go into the sub-folder:

  ```bash
  $ cd Deraining
  ```

- Download the corresponding pre-trained model and put it into the `experiments/Deraining_SADT_spa/models` folder.
- `cd basicsr/models/archs/DSv2` and compile the CUDA extension into a Python-compatible module.
- `cd ../../../../`, download the dataset, and run the test code:

  ```bash
  $ python test.py --input_dir your_input_path --gt_dir your_gt_path --result_dir your_result_dir
  ```
### Debanding

- Go into the sub-folder:

  ```bash
  $ cd Debanding
  ```

- Download the corresponding pre-trained model and put it into the `experiments/SADT_debanding/models` folder.
- Download the dataset. The pristine dataset is not pre-divided; you can divide the image pairs with the same strategy as ours (see the sketch after these steps): "The dataset comprises 1440 pairs of Full High-Definition (FHD) images. Each image is initially divided into 256×256 patches with a step size of 128. After filtering out pairs whose degraded image is devoid of banding artifacts, the remaining pairs are divided into training (60%), validation (20%), and test (20%) sets while ensuring all patches from the same FHD image belong to the same set."
- `cd codes` and run the test code:

  ```bash
  $ python test.py -opt options/test/test_SADT.yml
  ```
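A minimal sketch of the splitting strategy quoted above, assuming images are loaded as numpy-like arrays; the banding check `has_banding` and the function name are placeholders, not the authors' code:

```python
import random

PATCH, STEP = 256, 128  # 256x256 patches, step size 128 (from the quote above)

def split_fhd_pairs(pairs, has_banding, seed=0):
    """pairs: list of (pristine, degraded) FHD image arrays, one tuple per image.
    has_banding(patch) -> bool keeps only degraded patches that show banding.
    Returns train / val / test lists of (pristine_patch, degraded_patch)."""
    per_image = []
    for clean, degraded in pairs:
        kept = []
        h, w = degraded.shape[:2]
        for y in range(0, h - PATCH + 1, STEP):
            for x in range(0, w - PATCH + 1, STEP):
                d = degraded[y:y + PATCH, x:x + PATCH]
                if has_banding(d):  # drop pairs whose degraded patch has no banding
                    kept.append((clean[y:y + PATCH, x:x + PATCH], d))
        per_image.append(kept)

    # split at the FHD-image level so all patches from one image share a set
    rng = random.Random(seed)
    rng.shuffle(per_image)
    n_train = int(0.6 * len(per_image))
    n_val = int(0.2 * len(per_image))
    flatten = lambda groups: [p for g in groups for p in g]
    return (flatten(per_image[:n_train]),
            flatten(per_image[n_train:n_train + n_val]),
            flatten(per_image[n_train + n_val:]))
```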
## Training

Training is similar to testing; please see the previous section.
## Citation

If you are interested in this work, please consider citing:

```bibtex
@inproceedings{he2025universal,
  title={A Universal Scale-Adaptive Deformable Transformer for Image Restoration across Diverse Artifacts},
  author={He, Xuyi and Quan, Yuhui and Xu, Ruotao and Ji, Hui},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
  pages={12731--12741},
  year={2025}
}
```
## Acknowledgements

This code is based on DCNv2, ESDNet, and DRSformer. Thanks for their awesome work.