Quality-Guided Semi-Supervised Learning for Medical Image Segmentation

Early-accept at MICCAI 2026 (top 9% of 4,601 submissions).

This repository implements a quality-guided approach to semi-supervised medical image segmentation. A dedicated segmentation quality predictor g_φ is trained on variable-quality masks to estimate segmentation quality from image-mask pairs without requiring ground truth. The frozen g_φ then guides semi-supervised training of the segmentation network f_θ on unlabeled data through two complementary mechanisms:

- QAR: a differentiable quality-aware regularization that encourages f_θ to produce masks that g_φ scores as high quality, and
- PL-QW: quality-based pseudolabel reweighting that uses g_φ's score of each pseudolabel as a sample weight during f_θ training.

Both mechanisms are drop-in enhancements to existing semi-supervised learning frameworks.
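
As a rough illustration of how the two mechanisms could plug into a training step, the sketch below uses PyTorch with a hypothetical TinyQualityPredictor standing in for g_φ and a single convolution standing in for f_θ. The loss forms (one minus the predicted quality for QAR, score-weighted binary cross-entropy for PL-QW), the 0.5 pseudolabel threshold, and the use of detached logits as a teacher are assumptions made for illustration; the actual implementations are in training/methods/ours.py and may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyQualityPredictor(nn.Module):
    """Hypothetical stand-in for g_phi: (image, mask) -> score in [0, 1]."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        # Image and mask are concatenated along the channel axis.
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + 1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, 1),
        )

    def forward(self, image, mask):
        return torch.sigmoid(self.net(torch.cat([image, mask], dim=1))).squeeze(1)


def qar_loss(g_phi, image, logits):
    """QAR sketch (assumed form): penalize low predicted quality of the soft mask."""
    soft_mask = torch.sigmoid(logits)  # soft mask keeps the path differentiable
    return (1.0 - g_phi(image, soft_mask)).mean()


def plqw_loss(g_phi, image, student_logits, teacher_logits):
    """PL-QW sketch (assumed form): per-sample BCE weighted by g_phi's score."""
    with torch.no_grad():
        pseudolabel = (torch.sigmoid(teacher_logits) > 0.5).float()
        weight = g_phi(image, pseudolabel)  # one weight per sample, shape (B,)
    per_pixel = F.binary_cross_entropy_with_logits(
        student_logits, pseudolabel, reduction="none"
    )
    per_sample = per_pixel.mean(dim=(1, 2, 3))
    return (weight * per_sample).mean()


torch.manual_seed(0)
g_phi = TinyQualityPredictor().eval()
for p in g_phi.parameters():
    p.requires_grad_(False)  # g_phi stays frozen while f_theta trains

f_theta = nn.Conv2d(3, 1, kernel_size=3, padding=1)  # toy segmentation network
images = torch.rand(4, 3, 32, 32)  # stand-in for an unlabeled batch
logits = f_theta(images)

# Either term could be added on its own to an existing unsupervised loss;
# they are summed here only to exercise both code paths.
loss = qar_loss(g_phi, images, logits) + plqw_loss(g_phi, images, logits, logits.detach())
loss.backward()
```

Freezing the parameters of g_φ does not block gradients with respect to its inputs, so the QAR term still propagates back through the soft mask into f_θ.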

Overview

Repository Structure

Training is organized around two phases:

[Phase 1] Train g_φ : generate Variable Quality Masks and train the quality predictor. Done using train_mq.sh.
[Phase 2] Train f_θ : use the frozen g_φ to guide semi-supervised training of the segmentation network. Done using train_SSL_methods.sh.
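
To make the Phase 1 idea concrete, the following NumPy sketch shows one way variable-quality masks and their regression targets could be produced: a ground-truth mask is corrupted, and the Dice score between the corrupted mask and the ground truth becomes the target that a quality predictor learns to estimate. The shift_mask corruption and the make_vqm_samples helper are hypothetical names chosen for this example; the repository's own corruption operations are in data/corruption_ops.py and are not reproduced here.

```python
import numpy as np


def dice(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    """Dice overlap between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return float((2.0 * np.logical_and(a, b).sum() + eps) / (a.sum() + b.sum() + eps))


def shift_mask(mask: np.ndarray, dy: int, dx: int) -> np.ndarray:
    """Hypothetical corruption: translate the mask, filling vacated pixels with 0."""
    out = np.zeros_like(mask)
    h, w = mask.shape
    # Region of the source mask that remains inside the frame after the shift.
    src = mask[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0], max(0, dx):max(0, dx) + src.shape[1]] = src
    return out


def make_vqm_samples(gt_mask, shifts):
    """Return (corrupted_mask, dice_target) pairs for a quality regressor."""
    corrupted = (shift_mask(gt_mask, dy, dx) for dy, dx in shifts)
    return [(m, dice(m, gt_mask)) for m in corrupted]


gt = np.zeros((32, 32), dtype=np.uint8)
gt[8:24, 8:24] = 1  # a 16x16 square standing in for a lesion mask

samples = make_vqm_samples(gt, shifts=[(0, 0), (4, 0), (8, 8)])
targets = [round(t, 3) for _, t in samples]
print(targets)  # [1.0, 0.75, 0.25]
```

Larger corruptions yield lower Dice targets, which gives the predictor examples spread across the quality range.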

.
├── requirements.txt          # Python dependencies
├── train_mq.sh               # Phase 1: train g_φ and weak models
├── test_mq.sh                # Evaluate trained g_φ
├── train_SSL_methods.sh      # Phase 2: semi-supervised training of f_θ
│
├── prepare_datasets/         # Resize images/masks and generate train/val/test CSVs for labeled datasets
│   ├── utils.py              # Shared resize utility for image-mask pairs
│   ├── prepare_PH2.py        # Prepares PH2: resize and generate split CSVs
│   └── PH2_segs_metadata/    # Precomputed train/val/test split CSVs for PH2
│
├── prepare_unsupervised_images/            # Resize and generate metadata CSVs for unlabeled datasets
│   ├── create_isic2020_subsets.py          # Creates stratified ISIC2020 subsets (1k–30k)
│   ├── create_colon_unlabeled_metadata.py  # Generates metadata CSV for unlabeled colonoscopy images
│   ├── resize_isic2020_images.py           # Resizes ISIC2020 images to 224×224
│   ├── resize_colon_unlabeled_images.py    # Resizes unlabeled colonoscopy images to 224×224
│   ├── colonoscopy_unlabeled_metadata.csv  # Precomputed metadata CSV for unlabeled colonoscopy images
│   └── ISIC2020_train_subsets/             # Precomputed ISIC2020 subset CSVs (1k, 5k, 10k, 20k, 30k)
│
├── data/                          # Phase 1: VQM generation for g_φ training
│   ├── corruption_ops.py          # Synthetic mask corruption operations
│   ├── vqm_generator.py           # Variable Quality Mask (VQM) generator
│   ├── weak_model_corruption.py   # Weak model predictions as corruptions
│   ├── mq_dataset.py              # Dataset for g_φ training
│   └── seg_datasets.py            # D_L and D_U datasets for f_θ training
│
├── models/
│   ├── quality_predictor.py  # g_φ: (image, mask) → predicted Dice score
│   ├── ema.py                # EMA wrapper for f_θ (mean teacher methods)
│   ├── swin_unet.py          # Swin-UNet architecture for f_θ
│   └── swin_unet_utils.py    # Pretrained Swin Transformer weight loading
│
└── training/
    ├── train_mq.py           # Phase 1: g_φ training script
    ├── test_mq.py            # g_φ evaluation (MAE, Pearson correlation)
    ├── train_semisup.py      # Phase 2: semi-supervised training of f_θ
    ├── losses.py             # Loss functions (Dice, CE, combo, SmoothL1)
    ├── metrics.py            # Dice, Jaccard, pixel accuracy, F1
    ├── schedulers.py         # Unsupervised loss weight ramp-up scheduler
    ├── eval_utils.py         # Validation and test evaluation utilities
    └── methods/
        └── ours.py           # QAR and PL-QW implementations
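
The tree above lists training/test_mq.py as reporting MAE and Pearson correlation for g_φ. The short NumPy sketch below shows what these two metrics measure on predicted versus true Dice scores; the function names and the toy numbers are illustrative and not taken from the repository.

```python
import numpy as np


def mae(pred, target) -> float:
    """Mean absolute error between predicted and true quality scores."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    return float(np.abs(pred - target).mean())


def pearson_r(pred, target) -> float:
    """Pearson correlation coefficient between predicted and true scores."""
    pred, target = np.asarray(pred, dtype=float), np.asarray(target, dtype=float)
    pc, tc = pred - pred.mean(), target - target.mean()
    return float((pc * tc).sum() / np.sqrt((pc ** 2).sum() * (tc ** 2).sum()))


true_dice = [0.9, 0.6, 0.3]
pred_dice = [0.8, 0.5, 0.2]  # always 0.1 too low, but perfectly ordered

err = mae(pred_dice, true_dice)
corr = pearson_r(pred_dice, true_dice)
```

In this toy case the MAE is about 0.1 while the correlation is essentially 1, which is why the two are reported together: one captures calibration of the score, the other how well it ranks masks.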

Usage

Before running, update the dataset CSV paths and (if using Swin-UNet) the pretrained weights path at the top of each shell script.

Phase 1: Train and evaluate the segmentation quality predictor (g_φ):

bash train_mq.sh
bash test_mq.sh

Phase 2: Train and evaluate the segmentation models (f_θ) using semi-supervised methods:

bash train_SSL_methods.sh

Citation

If you find our paper useful, please consider citing:

Abhishek, K., Hamarneh, G. (2026). Quality-Guided Semi-Supervised Learning for Medical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2026. MICCAI 2026. Lecture Notes in Computer Science. Springer, Cham.

Or in BibTeX format:

@inproceedings{abhishek2026qgssl,
  title     = {Quality-Guided Semi-Supervised Learning for Medical Image Segmentation},
  author    = {Abhishek, Kumar and Hamarneh, Ghassan},
  booktitle = {Medical Image Computing and Computer Assisted Intervention (MICCAI)},
  year      = {2026},
  publisher = {Springer}
}

Installation

Requirements:

Python 3.10
PyTorch 2.9.0

Dependencies:

pip install -r requirements.txt

Model implementations:

All g_φ backbones are sourced from timm.
f_θ architectures:
- U-Net++ is sourced from segmentation-models-pytorch.
- Attention U-Net is sourced from a yet-to-be-merged PR in segmentation-models-pytorch.
- Swin-UNet is a custom implementation in models/swin_unet.py, with utilities in models/swin_unet_utils.py. I am planning to open a PR to add Swin-UNet to segmentation-models-pytorch and will update the link once the PR is opened.
  - Swin-UNet requires pretrained Swin Transformer weights. Download swin_tiny_patch4_window7_224_22k.pth from the official Swin Transformer repository (URL) and place it at: checkpoints/pretrained/swin_tiny_patch4_window7_224_22k.pth. This path is configured by SWIN_PRETRAINED_PATH in train_SSL_methods.sh.

Datasets

The experiments use the following labeled datasets (D_L) and unlabeled datasets (D_U):

Labeled datasets:

Dermatology:
- PH2: [Download]
- SCD (Skin Cancer Detection): [Download]
- DermoFit: [Download; requires academic license]
Colonoscopy:
- CVC-ColonDB: [Download]
- CVC-ClinicDB: [Download]

Unlabeled datasets:

Dermatology:
- ISIC2020 (training split only; JPEG format): [Download]
Colonoscopy:
- Polyp-Box-Seg: [Download]

Once downloaded, organize and preprocess each dataset into train/val/test CSV files listing image and mask paths. The shell scripts expect these CSVs at paths configured by the variables at the top of each script; e.g., TRAIN_CSV, VAL_CSV, TEST_CSV, UNLABELED_CSV_5K (for 5k samples from ISIC2020, serving as D_U in our dermatology experiments).
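
For datasets without precomputed splits, a minimal standard-library sketch of producing such CSVs might look like the following. The column names image_path and mask_path, the split fractions, and the write_split_csvs helper are all assumptions for illustration; check prepare_datasets/prepare_PH2.py for the layout the training scripts actually expect.

```python
import csv
import random
import tempfile
from pathlib import Path


def write_split_csvs(pairs, out_dir, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle (image, mask) path pairs and write train/val/test CSVs."""
    pairs = sorted(pairs)                  # fixed order before the seeded shuffle
    random.Random(seed).shuffle(pairs)
    n_test = int(len(pairs) * test_frac)
    n_val = int(len(pairs) * val_frac)
    splits = {
        "test": pairs[:n_test],
        "val": pairs[n_test:n_test + n_val],
        "train": pairs[n_test + n_val:],
    }
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, rows in splits.items():
        paths[name] = out_dir / f"{name}.csv"
        with open(paths[name], "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["image_path", "mask_path"])  # assumed column names
            writer.writerows(rows)
    return paths, {name: len(rows) for name, rows in splits.items()}


# Toy example with 20 hypothetical image/mask pairs.
pairs = [(f"images/{i:03d}.png", f"masks/{i:03d}.png") for i in range(20)]
with tempfile.TemporaryDirectory() as tmp:
    paths, sizes = write_split_csvs(pairs, tmp)
    with open(paths["train"], newline="") as f:
        header = next(csv.reader(f))
```

The resulting file paths would then be assigned to TRAIN_CSV, VAL_CSV, and TEST_CSV at the top of the shell scripts.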

See prepare_datasets/ for a worked example using PH2. The prepare_unsupervised_images/ directory contains the scripts and metadata for both ISIC2020 and Polyp-Box-Seg.

Pretrained Weights

Pretrained weights for segmentation models f_θ trained with QAR and PL-QW (see `training/methods/ours.py`) will be uploaded to the Hugging Face Hub soon. I will update the link once the weights are uploaded.
