In this paper, we introduce SAM3-UNet, a simplified variant of Segment Anything Model 3 (SAM3), designed to adapt SAM3 for downstream tasks at a low cost. Our SAM3-UNet consists of three components: a SAM3 image encoder, a simple adapter for parameter-efficient fine-tuning, and a lightweight U-Net-style decoder. Preliminary experiments on multiple tasks, such as mirror detection and salient object detection, demonstrate that the proposed SAM3-UNet outperforms the prior SAM2-UNet and other state-of-the-art methods, while requiring less than 6 GB of GPU memory during training with a batch size of 12.
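The three-component design described above can be sketched as follows. This is a minimal, illustrative NumPy mock-up, not the actual implementation: the real model uses the SAM3 image encoder with frozen weights, a trainable adapter, and a U-Net-style decoder, and all class names, shapes, and dimensions here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class FrozenEncoder:
    """Stand-in for the SAM3 image encoder (weights kept frozen)."""
    def __init__(self, in_dim=3, feat_dim=64):
        self.W = rng.standard_normal((in_dim, feat_dim)) * 0.1
    def __call__(self, x):            # x: (H, W, in_dim)
        return relu(x @ self.W)       # -> (H, W, feat_dim)

class Adapter:
    """Bottleneck adapter: a small trainable residual module,
    the typical shape of parameter-efficient fine-tuning."""
    def __init__(self, dim=64, bottleneck=8):
        self.W_down = rng.standard_normal((dim, bottleneck)) * 0.1
        self.W_up = rng.standard_normal((bottleneck, dim)) * 0.1
    def __call__(self, f):
        return f + relu(f @ self.W_down) @ self.W_up  # residual update

class UNetDecoder:
    """Stand-in for the lightweight decoder head producing a one-channel mask."""
    def __init__(self, dim=64):
        self.W = rng.standard_normal((dim, 1)) * 0.1
    def __call__(self, f):
        return 1.0 / (1.0 + np.exp(-(f @ self.W)))    # sigmoid mask

encoder, adapter, decoder = FrozenEncoder(), Adapter(), UNetDecoder()
image = rng.random((32, 32, 3))
mask = decoder(adapter(encoder(image)))
print(mask.shape)  # (32, 32, 1)
```

Only the adapter and decoder would be updated during training, which is what keeps the fine-tuning memory footprint small.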
git clone https://github.com/WZH0120/SAM3-UNet.git
cd SAM3-UNet/

You can refer to the following repositories and their papers for the detailed configurations of the corresponding datasets.
Please refer to SAM 3.
If you want to train your own model, please download the pre-trained sam3.pt checkpoint following the official guidelines. After completing these preparations, you can run train.sh to start training.
Our pre-trained models and prediction maps are available on Google Drive. You can also run test.sh to obtain your own predictions.
After obtaining the prediction maps, you can run eval.sh to get the quantitative results. For the evaluation of mirror detection, please refer to eval.py in HetNet to obtain the results.
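As a concrete illustration of the quantitative evaluation step, here is a minimal sketch of the mean absolute error (MAE), a standard metric for salient object detection. This is an assumption for illustration: the exact metrics reported by eval.sh are repository-specific.

```python
import numpy as np

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """MAE between a predicted saliency map and its binary ground-truth
    mask, both assumed to be normalized to [0, 1]."""
    return float(np.mean(np.abs(pred - gt)))

# Toy 2x2 example
pred = np.array([[0.9, 0.1], [0.8, 0.2]])
gt = np.array([[1.0, 0.0], [1.0, 0.0]])
print(mae(pred, gt))  # 0.15
```

Lower MAE is better; in practice the metric is averaged over all images in a test set.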
Please cite the following paper and star this project if you use this repository in your research. Thank you!