Du Chen1,2* · Liyi Chen1* · Zhengqiang Zhang1,2 · Lei Zhang1,2†
1The Hong Kong Polytechnic University 2OPPO Research Institute
*Equal contribution †Corresponding author
- 2025-06-25: GSASR is accepted by ICCV 2025. Congratulations! We will update the final version soon.
- 2025-06-05: The online demo with the most powerful HATL-based GSASR is released; click to try it.
- 2025-05-30: The {EDSR, RDN, SWIN, HATL}-based GSASR models are available.
- 2025-01-16: The GSASR paper and project page are released.
This work presents GSASR, which achieves SoTA performance in arbitrary-scale super-resolution (ASR) by representing a given LR image with millions of continuous 2D Gaussians.
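For intuition, such a continuous representation can be queried at any output resolution by evaluating a weighted sum of 2D Gaussians. Below is a minimal, illustrative PyTorch sketch with made-up per-Gaussian parameters (mean, inverse covariance, color); it is not the paper's CUDA rasterizer, which is far more efficient.

```python
import torch

def render_gaussians(means, inv_covs, colors, height, width):
    """Naively splat N 2D Gaussians onto an H x W grid (illustrative only)."""
    ys = torch.linspace(0, 1, height)
    xs = torch.linspace(0, 1, width)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)   # (H, W, 2)
    d = grid[None] - means[:, None, None, :]                            # (N, H, W, 2)
    # Mahalanobis term d^T Sigma^{-1} d for every (Gaussian, pixel) pair
    m = torch.einsum("nhwi,nij,nhwj->nhw", d, inv_covs, d)
    w = torch.exp(-0.5 * m)                                             # (N, H, W)
    return torch.einsum("nhw,nc->hwc", w, colors)                       # (H, W, C)

# Render 3 random Gaussians at an arbitrary output resolution (the arbitrary-scale idea)
img = render_gaussians(torch.rand(3, 2),
                       torch.eye(2).repeat(3, 1, 1) * 400.0,
                       torch.rand(3, 3),
                       height=128, width=192)
print(img.shape)  # torch.Size([128, 192, 3])
```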
| Encoder Backbone | Methods | Version | Training Dataset | DIV2K | LSDIR | Urban100 |
|---|---|---|---|---|---|---|
| EDSR | LIIF | Paper | DIV2K | 30.43/0.8388/0.2662/0.1403 | 26.21/0.7614/0.2978/0.1678 | 26.14/0.7885/0.2271/0.1738 |
| EDSR | GaussianSR | Paper | DIV2K | 30.46/0.8389/0.2684/0.1406 | 26.23/0.7615/0.3007/0.1679 | 26.19/0.7893/0.2283/0.1730 |
| EDSR | CiaoSR | Paper | DIV2K | 30.67/0.8431/0.2585/0.1370 | 26.42/0.7681/0.2865/0.1631 | 26.69/0.8091/0.2078/0.1659 |
| EDSR | GSASR | Paper Reported | DIV2K | 30.89/0.8486/0.2518/0.1301 | 26.65/0.7774/0.2777/0.1554 | 27.01/0.8142/0.1987/0.1552 |
| EDSR | GSASR | Enhanced | DIV2K | 31.01/0.8509/0.2508/0.1306 | 26.78/0.7813/0.2962/0.1543 | 27.34/0.8230/0.1920/0.1515 |
| EDSR | GSASR | Enhanced | DF2K | 31.04/0.8515/0.2512/0.1307 | 26.82/0.7827/0.2751/0.1540 | 27.45/0.8256/0.1902/0.1507 |
| RDN | LIIF | Paper | DIV2K | 30.71/0.8449/0.2566/0.1354 | 26.48/0.7714/0.2838/0.1603 | 26.71/0.8055/0.2062/0.1562 |
| RDN | GaussianSR | Paper | DIV2K | 30.76/0.8457/0.2570/0.1347 | 26.53/0.7727/0.2837/0.1595 | 26.77/0.8064/0.2069/0.1610 |
| RDN | CiaoSR | Paper | DIV2K | 30.91/0.8481/0.2525/0.1327 | 26.66/0.7770/0.2768/0.1563 | 27.10/0.8142/0.1966/0.1559 |
| RDN | GSASR | Paper Reported | DIV2K | 30.96/0.8500/0.2505/0.1288 | 26.73/0.7801/0.2752/0.1533 | 27.15/0.8177/0.1953/0.1515 |
| RDN | GSASR | Enhanced | DIV2K | 31.03/0.8513/0.2499/0.1306 | 26.79/0.7819/0.2740/0.1543 | 27.37/0.8238/0.1898/0.1511 |
| RDN | GSASR | Enhanced | DF2K | 31.10/0.8525/0.2482/0.1296 | 26.88/0.7848/0.2709/0.1527 | 27.58/0.8289/0.1849/0.1500 |
| SWIN | CiaoSR | Paper | DIV2K | 31.05/0.8511/0.2487/0.1316 | 26.80/0.7812/0.2724/0.1552 | 27.40/0.8231/0.1869/0.1535 |
| SWIN | GSASR | Paper (not Reported) | DIV2K | 31.06/0.8521/0.2487/0.1270 | 26.84/0.7837/0.2719/0.1503 | 27.39/0.8247/0.1913/0.1466 |
| SWIN | GSASR | Enhanced | DIV2K | 31.10/0.8530/0.2463/0.1285 | 26.88/0.7849/0.2690/0.1517 | 27.55/0.8280/0.1850/0.1475 |
| SWIN | GSASR | Enhanced | DF2K | 31.17/0.8541/0.2456/0.1288 | 26.96/0.7876/0.2665/0.1513 | 27.81/0.8343/0.1781/0.1465 |
| HATL | GSASR | Ultra Performance | SA1B | 31.31/0.8570/0.2381/0.1268 | 27.17/0.7948/0.2548/0.1470 | 28.44/0.8493/0.1580/0.1394 |
Comparisons with representative/SoTA ASR models. Each cell reports PSNR/SSIM/LPIPS/DISTS at the x4 scaling factor; PSNR/SSIM are computed on the Y channel of the YCbCr space.
We provide three versions of GSASR:
- Paper: the results we reported in our paper. "(not Reported)" means the results are not shown in the paper due to limited pages.
- Enhanced: we introduce Rotary Position Embedding (RoPE) with Flash Attention, and adopt the Automatic Mixed Precision (AMP) strategy during training/inference to reduce memory and time cost (see the sketch after this list).
- Ultra Performance: based on the Enhanced settings, we explore the performance upper bound of GSASR by introducing the HAT-L encoder and the SA1B dataset.
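For reference, the Enhanced models rely on PyTorch 2.x features. The following is a minimal sketch of combining AMP autocast with torch's `scaled_dot_product_attention` (which dispatches to Flash Attention kernels when available); it only illustrates the ingredients and is not GSASR's actual attention module.

```python
import torch
import torch.nn.functional as F

# Dummy query/key/value tensors: (batch, heads, tokens, head_dim)
q = torch.randn(1, 8, 1024, 64, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# AMP autocast + fused SDPA (Flash Attention backend when supported).
# In the Enhanced models, RoPE would be applied to q/k before this call.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    out = F.scaled_dot_product_attention(q, k, v)

print(out.dtype, out.shape)
```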
| Model Backbone | Training Dataset | Download | Version |
|---|---|---|---|
| EDSR | DIV2K | Google Drive, Hugging Face | Enhanced |
| EDSR | DF2K | Google Drive, Hugging Face | Enhanced |
| RDN | DIV2K | Google Drive, Hugging Face | Enhanced |
| RDN | DF2K | Google Drive, Hugging Face | Enhanced |
| SWIN | DIV2K | Google Drive, Hugging Face | Enhanced |
| SWIN | DF2K | Google Drive, Hugging Face | Enhanced |
| HATL | SA1B | Google Drive, Hugging Face | Ultra Performance |
For the results reported in our paper, we do not use these tricks (AMP + RoPE + Flash Attention, or extra training datasets), to ensure a fair comparison.
As for the pretrained models reported in our paper, please refer to Pre-trained Models (Paper Version).
- PyTorch (version must be >= 2.0)
- Anaconda
- CUDA Toolkit (necessary)
Firstly, please make sure you have installed the CUDA Toolkit! Since we provide hand-crafted CUDA operators, you need to compile them before running GSASR.
git clone https://github.com/ChrisDud0257/GSASR
cd GSASR
conda create --name gsasr python=3.10
conda activate gsasr
export CUDA_HOME=${path_to_CUDA} ### specify the path to cuda-11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
python setup_gscuda.py install # gscuda
cd TrainTestGSASR
pip install -r requirements.txt
BASICSR_EXT=True python setup_basicsr.py develop # basicsr

We have tested CUDA versions from 11.0 to 12.4; all of them work.
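After compilation, a quick sanity check may be helpful. The module name `gscuda` below follows the `# gscuda` comment in the build command above and is an assumption; adjust it if your build registers a different name.

```python
# Run inside the gsasr conda environment.
import torch
import gscuda  # assumed module name of the compiled CUDA rasterizer; ImportError means compilation failed

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```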
You need to authenticate with Hugging Face to download our model weights. Once authenticated, our code will handle the download automatically on your first run. You can authenticate by running:
# This will prompt you to enter your Hugging Face credentials.
huggingface-cli login
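Alternatively, you can authenticate from Python with the huggingface_hub library (a minimal sketch; create an access token at https://huggingface.co/settings/tokens):

```python
from huggingface_hub import login

# Prompts for your access token; you may also pass it directly, e.g. login(token="hf_xxx").
login()
```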
You can try GSASR easily by launching the Gradio demo or running it from the command line.

python demo_gr.py

python inference_enhenced.py \
--input_img_path <path_to_img> \
--save_sr_path <path_to_saved_folder> \
--model <{EDSR_DIV2K, EDSR_DF2K, RDN_DIV2K, RDN_DF2K, SWIN_DIV2K,SWIN_DF2K, HATL_SA1B}> \
--scale <scale> [--tile_process] [--AMP_test]

If it fails to access Hugging Face, try manually downloading the pretrained models and specify the local path with --model_path <path_to_model_weight>.
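If you prefer to download manually with the huggingface_hub Python API instead of the browser links above, a sketch would look like the following; the repo id and filename are placeholders, use the actual ones from the model table.

```python
from huggingface_hub import hf_hub_download

# <org_or_user>/<repo_name> and <model_file>.pth are placeholders for the actual weights.
ckpt_path = hf_hub_download(repo_id="<org_or_user>/<repo_name>", filename="<model_file>.pth")
print(ckpt_path)  # pass this path to --model_path
```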
Use --tile_process and --AMP_test if GPU memory is limited.
Please note that, in our paper, we only train GSASR on the DIV2K dataset, without the AMP + RoPE + Flash Attention tricks, for fair comparison. Due to the limited pages of the paper, we do not report the results of the Swin-based model there. Here, besides the {EDSR, RDN}-based GSASR presented in the paper, we further provide the Swin-based GSASR model. The {EDSR, RDN}-based GSASR models provided below should generate exactly the same results as those reported in our paper (Table 1 in the main paper and Tables 1-7 in the supplementary).
Download the models from the following links.
| Encoder Backbone | Training Dataset | Download | Version |
|---|---|---|---|
| EDSR | DIV2K | Google Drive, Hugging Face | Paper Reported |
| RDN | DIV2K | Google Drive, Hugging Face | Paper Reported |
| SWIN | DIV2K | Google Drive, Hugging Face | Paper (not Reported) |
If you have logged in to Hugging Face, directly execute inference_paper.py as follows.
python inference_paper.py \
--input_img_path <path_to_img> \
--save_sr_path <path_to_saved_folder> \
--model <{EDSR, RDN, SWIN}> \
--scale <scale> [--tile_process]

To reproduce the numerical results in Table 2 of the main paper, please download the cropped 720x720 GT images and the corresponding LR images of the DIV2K testing set, which are used in our paper.
The {EDSR, RDN}-based GSASR models provided below should generate exactly the same PSNR/SSIM/LPIPS/DISTS results as those reported in our paper (Table 2 in the main paper and Table 8 in the supplementary).
| Dataset | Download |
|---|---|
| DIV2K_GT720 | Google Drive |
If you want to crop the images by yourself, please follow this instruction to prepare the data that can be used to test the computational cost.
After you download them, please test by the following command.
python inference_paper_benchmark.py \
--input_img_path <path_to_LRx4_folder> \
--save_sr_path <path_to_saved_folder> \
--model <{EDSR, RDN, SWIN}> \
--scale 4 [--tile_process]

Please set "input_img_path" to your downloaded DIV2K testing set (provided by us).
If you want to test GSASR on the standard benchmarks at full size, please use the same commands as above. We also provide the widely used testing benchmarks, including Set5, Set14, DIV2K-val 100, LSDIR-val 250, Urban100, Manga109, BSDS100 and General100, together with each GT image's LR counterparts at different scaling factors, obtained by bicubic downsampling (a minimal downsampling sketch is given after the table below).
| Dataset | Link |
|---|---|
| Testing Benchmarks | Google Drive |
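For reference, here is a minimal sketch of generating an LR counterpart by bicubic downsampling (using PIL, with a hypothetical file path; the exact bicubic implementation used to produce our provided LR images may differ slightly, so prefer the provided files for exact reproduction).

```python
from PIL import Image

scale = 4
gt = Image.open("GT/0801.png")  # hypothetical GT image path
lr = gt.resize((gt.width // scale, gt.height // scale), resample=Image.BICUBIC)
lr.save("LRx4/0801.png")
```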
In inference_paper_benchmark.py, we integrate code to report the inference time (ms) and GPU memory (MB). In our paper, we measure the computational cost on a single NVIDIA A100 GPU, feeding the full-size image into the model without tile_process. The inference time excludes pre-processing and post-processing, and records the cost of the full pipeline, including the encoder, decoder and rendering.
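A minimal sketch of how such statistics can be collected with PyTorch is shown below (illustrative only; `model` and `lr_tensor` are hypothetical names, and the actual bookkeeping lives in the benchmark script).

```python
import time
import torch

torch.cuda.reset_peak_memory_stats()
torch.cuda.synchronize()
start = time.time()

with torch.no_grad():
    sr = model(lr_tensor)  # hypothetical: full pipeline (encoder, decoder, rendering)

torch.cuda.synchronize()
print(f"time: {(time.time() - start) * 1000:.1f} ms")
print(f"peak GPU memory: {torch.cuda.max_memory_allocated() / 1024**2:.1f} MB")
```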
After inference, execute the following code to compute PSNR/SSIM/LPIPS/DISTS.
cd TrainTestGSASR/scripts/metrics/
python calculate_psnr_ssim.py --test_y_channel --gt <path_to_GT_folder> --restored <path_to_SR_folder> --scale <scale> [--suffix <suffix_of_images>]
python calculate_lpips.py --gt <path_to_GT_folder> --restored <path_to_SR_folder> --scale <scale> [--suffix <suffix_of_images>]
python calculate_dists.py --gt <path_to_GT_folder> --restored <path_to_SR_folder> --scale <scale> [--suffix <suffix_of_images>]

Please note that we compute PSNR/SSIM on the Y channel of the YCbCr space with --test_y_channel. When calculating PSNR/SSIM/LPIPS/DISTS, we set crop_border=${scale} if the scaling factor is not larger than 8, otherwise crop_border=8.
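To make the convention concrete, here is a minimal NumPy sketch of Y-channel PSNR with the crop_border rule described above (the provided scripts handle more cases, including SSIM, LPIPS and DISTS).

```python
import numpy as np

def rgb_to_y(img):
    """BT.601 luma from an RGB image with values in [0, 255]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0

def psnr_y(gt, sr, scale):
    crop = int(scale) if scale <= 8 else 8  # crop_border rule above (integer scales assumed in this sketch)
    gt_y = rgb_to_y(gt.astype(np.float64))[crop:-crop, crop:-crop]
    sr_y = rgb_to_y(sr.astype(np.float64))[crop:-crop, crop:-crop]
    mse = np.mean((gt_y - sr_y) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```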
Please follow this instruction to prepare the training and testing datasets.
Please follow this instruction to train GSASR.
Please follow this instruction to test GSASR if you want to evaluate it further.
This project is mainly built upon the excellent BasicSR, HAT and RoPE-ViT codebases. We sincerely appreciate their developers.
We sincerely thank Mr. Zhengqiang Zhang for his support with the CUDA operator for rasterization.
If you find this research helpful, please cite our paper.
@article{chen2025generalized,
title={Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution},
author={Chen, Du and Chen, Liyi and Zhang, Zhengqiang and Zhang, Lei},
journal={arXiv preprint arXiv:2501.06838},
year={2025}
}

If you have any questions or suggestions about this project, please contact me at [email protected].