This repository contains a glacier semantic segmentation implementation based on DeepLab V3+ with a Convolutional Block Attention Module (CBAM) and Test-Time Augmentation (TTA), built for the GlacierHack 2025 competition.
The model performs binary semantic segmentation to distinguish glacier pixels from non-glacier pixels in multispectral satellite imagery using 5 spectral bands: B2 (Blue), B3 (Green), B4 (Red), B6 (SWIR), and B10 (TIR1).
- Backbone: ResNet-34 encoder for feature extraction
- ASPP with CBAM: Atrous Spatial Pyramid Pooling enhanced with Convolutional Block Attention Module
- Decoder: Depthwise separable convolutions for efficient upsampling
- Skip Connections: Low-level feature fusion for better boundary preservation
- Channel Attention: Focuses on important feature channels
- Spatial Attention: Emphasizes relevant spatial locations
- Integration: Applied to each ASPP branch for enhanced feature representation
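The channel and spatial attention steps listed above can be sketched as a small PyTorch module. This is a minimal illustration following the general CBAM design (Woo et al., 2018), not necessarily the code in this repository: the class names, the reduction ratio of 16, and the 7×7 spatial kernel are assumptions.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Reweights channels using avg- and max-pooled descriptors and a shared MLP."""

    def __init__(self, channels: int, reduction: int = 16):  # reduction is an assumed default
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)


class SpatialAttention(nn.Module):
    """Reweights spatial positions using channel-wise avg and max maps."""

    def __init__(self, kernel_size: int = 7):  # kernel size is an assumed default
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = torch.mean(x, dim=1, keepdim=True)
        mx = torch.amax(x, dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn


class CBAM(nn.Module):
    """Channel attention followed by spatial attention; output shape equals input shape."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention(kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.spatial(self.channel(x))
```

Because the block preserves tensor shape, it could in principle be appended to the output of each ASPP branch without changing the rest of the network.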
- Transformations: Original, horizontal flip, vertical flip, both flips
- Ensemble: Averages predictions from all augmented versions
- Robustness: Improves model consistency and performance
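The four-way flip ensemble described above can be sketched in a framework-agnostic way with NumPy. The function name `predict_with_tta` and the `predict_fn` callable are hypothetical; the repository's `TTAWrapper` may be organized differently. The key point is that each prediction is flipped back before averaging.

```python
import numpy as np


def predict_with_tta(predict_fn, image):
    """Average predictions over the original image and its three flips.

    image:      array of shape (H, W, C)
    predict_fn: hypothetical callable mapping (H, W, C) -> (H, W) probabilities
    """
    # Each entry is (forward transform, inverse transform).
    # Flips are their own inverse, so the same slicing undoes them.
    transforms = [
        (lambda a: a, lambda a: a),                          # original
        (lambda a: a[:, ::-1], lambda a: a[:, ::-1]),        # horizontal flip
        (lambda a: a[::-1, :], lambda a: a[::-1, :]),        # vertical flip
        (lambda a: a[::-1, ::-1], lambda a: a[::-1, ::-1]),  # both flips
    ]
    probs = [undo(predict_fn(apply(image))) for apply, undo in transforms]
    return np.mean(probs, axis=0)
```

A binary mask would then be obtained by thresholding the averaged probabilities, for example at 0.5 (the threshold is an assumption, not stated in this README).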
Google Colab notebook for model training
Features:
- Complete training pipeline from data loading to model evaluation
- Data augmentation with albumentations
- Combined BCE + Dice loss for handling class imbalance
- Comprehensive metrics: MCC, IoU, F1-score, R²
- Training visualization and model checkpointing
- Model compression to stay under 200MB limit
- TTA implementation and evaluation
Key Sections:
- Model architecture definition (Attention DeepLab V3+ with CBAM)
- Custom dataset class for .npy file handling
- Training loop with early stopping
- Validation and testing with/without TTA
- Results visualization and model saving
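Early stopping in the training loop can be illustrated with a small helper. This is a sketch under assumptions: the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the choice of a higher-is-better validation score (such as MCC) are illustrative and not taken from the notebook.

```python
class EarlyStopping:
    """Signals a stop when the validation score has not improved for `patience` epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = None
        self.bad_epochs = 0

    def step(self, score: float) -> bool:
        """Record one validation score (higher is better); return True to stop."""
        if self.best is None or score > self.best + self.min_delta:
            self.best = score
            self.bad_epochs = 0  # improvement: reset the counter
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage sketch with made-up validation scores
stopper = EarlyStopping(patience=3)
for epoch, val_score in enumerate([0.70, 0.75, 0.74, 0.75, 0.73]):
    if stopper.step(val_score):
        print(f"Stopping at epoch {epoch}, best score {stopper.best}")
        break
```

In practice the model checkpoint would be saved whenever `best` is updated, so that the weights from the best epoch are the ones kept.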
Competition inference script
Features:
- Follows exact competition requirements
- Loads pre-trained model weights (model.pth)
- Processes multi-band TIFF images
- Implements TTA for robust predictions
- Saves binary masks in required format
Key Functions:
- AttentionDeepLabV3Plus: Complete model architecture
- TTAWrapper: Test-time augmentation implementation
- maskgeration: Main inference function
- load_and_preprocess_image: Image preprocessing pipeline
- Input: slice_x_image_y.npy (512×512×15) → Extract bands [1,2,3,5,9] → (512×512×5)
- Target: slice_x_mask_y.npy (512×512×3) → Compress to (512×512×1)
  - Channel 0: Clean-ice glacier
  - Channel 1: Debris-covered glacier
  - Channel 2: HKH region mask
  - Final target: (clean-ice OR debris-covered) AND hkh_region
- Structure:

      dataset/
        Band1/
          img001.tif
          img002.tif
          ...
        Band2/
          ...
        Band5/
          ...
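The band extraction and mask compression for the .npy training data can be sketched with NumPy. The helper names `select_bands` and `build_target` are hypothetical. The sketch assumes that [1,2,3,5,9] are zero-based indices along the last axis and that any nonzero mask value counts as positive; neither is confirmed by this README.

```python
import numpy as np

# Assumed zero-based positions of B2, B3, B4, B6, B10 in the 15-channel array
BAND_INDICES = [1, 2, 3, 5, 9]


def select_bands(image: np.ndarray) -> np.ndarray:
    """(H, W, 15) -> (H, W, 5) by picking the selected channels."""
    return image[..., BAND_INDICES]


def build_target(mask: np.ndarray) -> np.ndarray:
    """(H, W, 3) -> (H, W, 1): (clean-ice OR debris-covered) AND hkh_region."""
    clean_ice = mask[..., 0] > 0
    debris = mask[..., 1] > 0
    hkh_region = mask[..., 2] > 0
    target = (clean_ice | debris) & hkh_region
    return target[..., np.newaxis].astype(np.float32)


# Usage sketch with synthetic arrays in place of slice_x_image_y.npy / slice_x_mask_y.npy
image = np.zeros((512, 512, 15), dtype=np.float32)
mask = np.zeros((512, 512, 3), dtype=np.uint8)
print(select_bands(image).shape, build_target(mask).shape)
```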
- Setup Environment:

      !pip install torch torchvision tqdm opencv-python pillow scikit-learn numpy tifffile segmentation-models-pytorch
      !pip install albumentations matplotlib seaborn

- Mount Google Drive:

      from google.colab import drive
      drive.mount('/content/drive')

- Update Data Path:

      DATA_DIR = '/content/drive/MyDrive/glacier_data'  # Update this path

- Run Notebook: Execute all cells in glacier_segmentation_training.ipynb
- Install Dependencies:

      pip install torch torchvision tqdm opencv-python pillow scikit-learn numpy tifffile albumentations

- Run Inference:

      python solution.py --data /path/to/test/dataset --masks /path/to/masks --out /path/to/output
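The three command-line arguments used above could be parsed with the standard-library `argparse` module. This is a sketch only: the `parse_args` helper, the help strings, and marking every argument as required are assumptions about how solution.py might be written.

```python
import argparse


def parse_args(argv=None):
    """Parse the --data, --masks, and --out arguments used by the inference command."""
    parser = argparse.ArgumentParser(description="Glacier segmentation inference")
    parser.add_argument("--data", required=True,
                        help="Path to the test dataset (Band1/ ... Band5/ folders)")
    parser.add_argument("--masks", required=True,
                        help="Path to the masks directory")
    parser.add_argument("--out", required=True,
                        help="Directory where predicted binary masks are written")
    return parser.parse_args(argv)


# Usage sketch: pass an explicit list instead of reading sys.argv
args = parse_args(["--data", "/path/to/test/dataset",
                   "--masks", "/path/to/masks",
                   "--out", "/path/to/output"])
print(args.data, args.masks, args.out)
```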
- Primary: Matthews Correlation Coefficient (MCC), which uses all four confusion-matrix entries and so stays informative under class imbalance
- Secondary: IoU, F1-score, R² for comprehensive evaluation
- MCC: >0.85 on validation set
- IoU: >0.80 for glacier segmentation
- F1-score: >0.90 for binary classification
- Model Size: <200MB (competition requirement)
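MCC, IoU, and F1-score can all be computed from the four confusion-matrix counts of a binary mask. The sketch below uses NumPy. The function name `binary_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions; scikit-learn's `matthews_corrcoef` is an alternative for MCC.

```python
import numpy as np


def binary_metrics(pred, target):
    """Compute MCC, IoU, and F1 for two binary masks of equal shape."""
    pred = np.asarray(pred).astype(bool).ravel()
    target = np.asarray(target).astype(bool).ravel()

    tp = float(np.sum(pred & target))    # glacier predicted, glacier in truth
    tn = float(np.sum(~pred & ~target))  # background predicted, background in truth
    fp = float(np.sum(pred & ~target))   # glacier predicted, background in truth
    fn = float(np.sum(~pred & target))   # background predicted, glacier in truth

    mcc_denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom > 0 else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) > 0 else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) > 0 else 0.0
    return {"mcc": float(mcc), "iou": iou, "f1": f1}


# Usage sketch on a tiny made-up example
print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```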
Combined Loss = α × BCE Loss + (1-α) × Dice Loss
- Handles class imbalance effectively
- BCE for pixel-wise classification
- Dice for overlap optimization
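The combined loss formula can be sketched in PyTorch as follows. The function name `combined_loss`, the default α of 0.5, and the smoothing constant are assumptions for illustration; the README does not state the values used in training.

```python
import torch
import torch.nn.functional as F


def combined_loss(logits: torch.Tensor, target: torch.Tensor,
                  alpha: float = 0.5, smooth: float = 1.0) -> torch.Tensor:
    """alpha * BCE + (1 - alpha) * Dice, computed on raw logits.

    logits, target: tensors of shape (N, 1, H, W); target holds 0/1 floats.
    alpha and smooth are assumed defaults, not values from this repository.
    """
    # Pixel-wise classification term
    bce = F.binary_cross_entropy_with_logits(logits, target)

    # Overlap term: soft Dice per sample, averaged over the batch
    probs = torch.sigmoid(logits)
    dims = tuple(range(1, logits.dim()))
    intersection = (probs * target).sum(dim=dims)
    total = probs.sum(dim=dims) + target.sum(dim=dims)
    dice_loss = 1.0 - ((2.0 * intersection + smooth) / (total + smooth)).mean()

    return alpha * bce + (1.0 - alpha) * dice_loss
```

Setting `alpha=1.0` reduces this to plain BCE and `alpha=0.0` to plain Dice loss, which makes it easy to check each term in isolation.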
- Horizontal/Vertical flips
- 90° rotations
- Brightness/Contrast adjustments
- Gaussian noise injection
Selected 5 most informative bands for glacier detection:
- Band 1 (B2): Blue - Snow/ice detection
- Band 2 (B3): Green - Vegetation contrast
- Band 3 (B4): Red - Rock/soil discrimination
- Band 4 (B6): SWIR - Ice/snow separation
- Band 5 (B10): TIR - Thermal signature
- State dict only (no optimizer)
- Post-training quantization if needed
- Weight pruning for size optimization
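Saving only the state dict and checking the resulting file size can be sketched as below. The helper name `save_weights` and the stand-in model are hypothetical, and the size is measured here in units of 1024² bytes, which is an assumption about how the 200MB limit is counted.

```python
import os
import tempfile

import torch
import torch.nn as nn

SIZE_LIMIT_MB = 200  # competition limit stated in this README


def save_weights(model: nn.Module, path: str) -> float:
    """Save only the model's state dict (no optimizer) and return the file size in MB."""
    torch.save(model.state_dict(), path)
    return os.path.getsize(path) / (1024 ** 2)


# Usage sketch with a tiny stand-in model instead of the real network
model = nn.Conv2d(5, 8, kernel_size=3)
with tempfile.TemporaryDirectory() as tmp:
    size_mb = save_weights(model, os.path.join(tmp, "model.pth"))
    print(f"model.pth is {size_mb:.3f} MB (limit {SIZE_LIMIT_MB} MB)")
```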
✅ Script Name: solution.py
✅ Function Name: maskgeration(imagepath, out_dir)
✅ Model Weights: model.pth (<200MB)
✅ Dependencies: Listed in script header
✅ Output Format: Binary TIFF masks
✅ File Naming: Matches input filenames
✅ Arguments: --data, --masks, --out
Input (5 bands) → ResNet-34 Encoder → ASPP + CBAM → Decoder → Binary Mask
↓ ↓ ↓
Low-level feat. High-level feat. Skip Connection
↓ ↓ ↓
Projection Attention Upsampling
↓ ↓ ↓
└─────────→ Concatenate ←────────┘
- DeepLab V3+: Chen, L. C., et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." ECCV 2018.
- CBAM: Woo, S., et al. "CBAM: Convolutional block attention module." ECCV 2018.
- Glacier Extraction: Chu, X., et al. "Glacier extraction based on high spatial resolution remote sensing images using a deep learning approach with attention mechanism." The Cryosphere 2022.
@article{glacier_segmentation_2025,
title={Glacier Semantic Segmentation with Attention DeepLab V3+},
author={Your Name},
journal={GlacierHack 2025},
year={2025}
}

For questions or issues, please contact: [[email protected]]
Note: This implementation is designed specifically for the GlacierHack 2025 competition requirements and follows all submission guidelines.
torch>=1.9.0
torchvision>=0.10.0
numpy>=1.21.0
pillow>=8.3.0
opencv-python>=4.5.0
scikit-learn>=1.0.0
tifffile>=2021.7.2
albumentations>=1.1.0
matplotlib>=3.4.0
seaborn>=0.11.0
tqdm>=4.62.0