Official implementation of Mask Image Watermarking.
We provide pre-trained model weights for inference, which you can download from the following link: Download Model Weights.
Specifically, the following variants are available:
- D_32bits, D_64bits, D_128bits – for global watermark embedding with different bits.
- ED_32bits, ED_64bits, ED_128bits – for adaptive local watermark embedding based on the mask with different bits.
We also provide enhanced fine-tuned variants at https://huggingface.co/Runyi-Hu/MaskWM/tree/main that are specifically optimized for robustness under different types of distortion (e.g., VAE, Move & Resize, Crop & Resize). These models are designed to improve watermark extraction performance in more challenging or degraded visual conditions.
After downloading, place the weights into the checkpoints/ directory.
```bash
python3 inference.py \
    --device "cuda:0" \
    --model_name "D_32bits" \
    --image_name "00"
```

This command performs the following steps:

1. **Global Watermark Embedding.** A full-image watermark is embedded into the specified image.
2. **Masked Fusion.** Using a predefined mask, only the masked region retains the watermark, while the rest of the image is replaced by the original image. This creates a fused image that contains both watermarked and clean areas.
3. **Watermark Localization & Extraction.** The model then performs watermark localization and extraction on the fused image.

All results will be saved in the `results/D_32bits` directory.
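The masked-fusion step above can be sketched in a few lines of NumPy. This is a conceptual illustration, not the repository's actual code; the function name, array shapes, and mask convention (1 inside the watermarked region, 0 elsewhere) are assumptions:

```python
import numpy as np

def masked_fusion(original, watermarked, mask):
    """Keep the watermark only inside the mask region (illustrative sketch).

    original, watermarked: H x W x C float arrays.
    mask: H x W array, 1.0 inside the region that keeps the watermark.
    """
    mask = mask[..., None]  # broadcast the single-channel mask over RGB
    # Masked region comes from the watermarked image,
    # everything else is replaced by the clean original.
    return mask * watermarked + (1.0 - mask) * original

# Toy example: the watermarked signal survives only where mask == 1.
orig = np.zeros((4, 4, 3))
wm = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[:2, :] = 1.0  # watermark the top half only
fused = masked_fusion(orig, wm, mask)
```

In the fused image, the bottom half is pixel-identical to the original, which is what allows the extractor to localize the watermarked region afterwards.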
```bash
python3 inference.py \
    --device "cuda:0" \
    --model_name "ED_32bits" \
    --image_name "00"
```

Unlike MaskWM-D, this command enables adaptive local watermark embedding during generation: the watermark is primarily embedded within the mask-selected region, while the rest of the image is designed to contain minimal or no watermark signal.

All results will be saved in the `results/ED_32bits` directory.
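A common way to evaluate extraction quality for the 32/64/128-bit variants is bit accuracy between the embedded and extracted messages. A minimal sketch (the message representation as a 0/1 array is an assumption, not the repository's API):

```python
import numpy as np

def bit_accuracy(embedded_bits, extracted_bits):
    """Fraction of message bits recovered correctly."""
    embedded = np.asarray(embedded_bits)
    extracted = np.asarray(extracted_bits)
    return float((embedded == extracted).mean())

# Example with a 32-bit message and two simulated extraction errors.
rng = np.random.default_rng(0)
msg = rng.integers(0, 2, size=32)
recovered = msg.copy()
recovered[[3, 17]] ^= 1  # flip two bits
acc = bit_accuracy(msg, recovered)  # 30/32 = 0.9375
```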
- Download the COCO 2014 dataset.
- Expected directory structure:

```
data/
└── coco_data/
    ├── annotations/
    ├── train2014/
    └── val2014/
```

- Set dataset path: modify the `dataset_path` field in `configs/train/train.yaml` to point to your local dataset directory. For example: `dataset_path: data/coco_data`
- Select model variant: you can train different models by changing the `model_name` in the config file. Supported options: D_32bits, D_64bits, D_128bits, ED_32bits, ED_64bits, ED_128bits.
- Start training with the following command (D_32bits):

```bash
python3 train.py \
    --model_name "D_32bits" \
    --train_config_path "configs/train/train.yaml" \
    --model_config_path "configs/model/D_32bits.yaml"
```

Logs and checkpoints are saved in the `checkpoints/<model_name>` directory.
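For orientation, the two documented fields of `configs/train/train.yaml` could be set as follows (only `dataset_path` and `model_name` are described above; any other keys in the shipped file are not shown here):

```yaml
# configs/train/train.yaml (fragment; see the shipped file for all keys)
dataset_path: data/coco_data   # local COCO 2014 directory
model_name: D_32bits           # one of the supported variants
```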
In practical scenarios, you might want to enhance the model's robustness against specific distortions based on your application needs. We provide dedicated finetuning scripts to support this.
Specifically, in Step 3 above, you can simply replace `train.yaml` with `finetune_<distortion_name>.yaml` to perform robustness-oriented finetuning.
We provide two example config files for finetuning: `finetune_vae.yaml` and `finetune_crop&resize.yaml`.
Key parameters to be aware of:
- `num_training_steps`: We recommend setting this between 20,000 and 50,000, depending on the scale and difficulty of your task.
- `ED_path`: Path to the pretrained model you want to finetune.
- `ft_noise_layers`: A list of distortion types you want the model to become robust against.
- `full_mask_ft`: Whether to use a global mask during finetuning. For example, for Crop & Resize we enable this option, since the distortion typically preserves only part of the watermarked image, similar to the behavior of a local mask.
This flexible finetuning setup allows for targeted robustness improvements tailored to your deployment environment.
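Putting the parameters above together, a finetuning config might look like the following sketch. The key names come from the list above, but every value here is an assumption for illustration; consult the shipped `finetune_*.yaml` files for the actual defaults and list format:

```yaml
# Illustrative finetuning config (all values are assumptions)
num_training_steps: 30000          # recommended range: 20,000-50,000
ED_path: checkpoints/ED_32bits     # pretrained model to finetune (path assumed)
ft_noise_layers: ["crop&resize"]   # distortions to become robust against
full_mask_ft: true                 # global mask; suits Crop & Resize
```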
If you find this repository useful, please consider giving it a star ⭐ and citing:

```bibtex
@article{hu2025mask,
  title={Mask Image Watermarking},
  author={Hu, Runyi and Zhang, Jie and Zhao, Shiqian and Lukas, Nils and Li, Jiwei and Guo, Qing and Qiu, Han and Zhang, Tianwei},
  journal={arXiv preprint arXiv:2504.12739},
  year={2025}
}
```