An end-to-end computer vision project on image segmentation, specifically semantic segmentation. Although the project was built primarily with the LandCover.ai dataset, the template can be used to train a model on any semantic segmentation dataset and to extract inference outputs from the model in a promptable fashion. This is not promptable AI in the usual sense; the term is used here because the user selects, through a config variable, which classes appear in the predicted masks.
The model can be trained on any or all of the classes present in the semantic segmentation dataset. The model architecture, optimizer, learning rate, and many other parameters can be customized directly from the config file, which gives the project an AutoML-like aspect. At test time, the user passes a prompt, in the form of the config variable 'test_classes', listing the classes that should be present in the masks predicted by the trained model.
For example, suppose the model has been trained on all 30 classes of the Cityscapes dataset and, at inference time, a specific use case requires only the class 'parking' in the predicted mask. The user can then set 'test_classes = ['parking']' in the config file to get the desired output.
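The class selection described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the function name `filter_mask` is hypothetical, plain nested lists stand in for the predicted mask tensor, and mapping deselected pixels to index 0 (background) is an assumption.

```python
def filter_mask(pred_mask, train_classes, test_classes, fill_index=0):
    """Keep only pixels whose class is listed in test_classes.

    pred_mask holds class indices that refer to positions in train_classes.
    Pixels of any other class are replaced by fill_index (assumed here to be
    the background index; the real project may handle this differently).
    """
    keep = {train_classes.index(name) for name in test_classes}
    return [[px if px in keep else fill_index for px in row] for row in pred_mask]


train_classes = ["background", "building", "woodland", "water"]
pred = [[0, 1, 2],
        [3, 2, 1]]

# "woodland" (index 2) is not requested, so its pixels fall back to index 0.
filtered = filter_mask(pred, train_classes, ["background", "building", "water"])
print(filtered)  # [[0, 1, 0], [3, 0, 1]]
```

In this sketch the model output itself is unchanged; only the final mask is filtered, which is why a single trained model can serve different 'test_classes' prompts.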
1. Training the model on LandCover.ai dataset with 'train_classes': ['background', 'building', 'woodland', 'water']...
2. Testing the trained model for all the classes used to train the model, i.e. 'test_classes': ['background', 'building', 'woodland', 'water']...
3. Testing the trained model for selective classes as per user input, i.e. 'test_classes': ['background', 'building', 'water']...
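The three runs above imply a constraint: every entry of 'test_classes' must also be in 'train_classes', which in turn must come from the dataset's full class list. The sketch below checks this before a run. The helper name `validate_classes` is hypothetical, the class lists are taken from the example config in this README, and treating a class's position in 'train_classes' as its channel index is an assumption.

```python
ALL_CLASSES = ["background", "building", "woodland", "water", "road"]
TRAIN_CLASSES = ["background", "building", "woodland", "water"]


def validate_classes(all_classes, train_classes, test_classes):
    """Return {class_name: index in train_classes} for each test class.

    Raises ValueError if a train class is not in the dataset, or if a test
    class was not part of training.
    """
    not_in_dataset = [c for c in train_classes if c not in all_classes]
    if not_in_dataset:
        raise ValueError(f"train_classes not in dataset: {not_in_dataset}")
    not_trained = [c for c in test_classes if c not in train_classes]
    if not_trained:
        raise ValueError(f"test_classes not in train_classes: {not_trained}")
    # Assumption: the index of a class is its position in train_classes.
    return {c: train_classes.index(c) for c in test_classes}


print(validate_classes(ALL_CLASSES, TRAIN_CLASSES, ["background", "building", "water"]))
# {'background': 0, 'building': 1, 'water': 3}
```

Asking for 'road' with this configuration would raise a ValueError, since the model in step 1 was never trained on that class.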
- Dataset prerequisite for training:
Before training a model, download the dataset from LandCover.ai or from kaggle/LandCover.ai, then copy or move the downloaded 'images' and 'masks' directories into the 'train' directory of the project.
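A quick way to confirm the dataset is in place before training is sketched below. The helper name `missing_dataset_dirs` is hypothetical, and the relative path "train" is an assumption: adjust it to wherever the 'train' directory lives in your checkout.

```python
from pathlib import Path


def missing_dataset_dirs(train_root, required=("images", "masks")):
    """Return the required subdirectories that are absent under train_root."""
    root = Path(train_root)
    return [name for name in required if not (root / name).is_dir()]


# Assumed location of the 'train' directory, relative to the working directory.
missing = missing_dataset_dirs("train")
if missing:
    print(f"Missing dataset directories under 'train': {missing}")
else:
    print("Dataset directories found.")
```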
- Python: 3.9 (Docker image uses python:3.9)
- PyTorch: 2.0.1, TorchVision: 0.15.2 (see requirements.txt)
- CUDA (optional, recommended): NVIDIA GPU with a supported CUDA toolkit/driver for PyTorch 2.0.1. CPU is supported but significantly slower.
- OS: Linux/Windows/macOS (Docker recommended for reproducibility)
To install a CUDA-enabled PyTorch that matches your NVIDIA driver, follow the official selector and install command from the PyTorch site. Example (Linux, CUDA 11.8):
pip install --index-url https://download.pytorch.org/whl/cu118 torch==2.0.1 torchvision==0.15.2
If you do not have a compatible GPU/driver, install the CPU wheels instead:
pip install --index-url https://download.pytorch.org/whl/cpu torch==2.0.1 torchvision==0.15.2
Note: This repository pins torch==2.0.1 and torchvision==0.15.2 in requirements.txt.
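To confirm that the installed packages match these pins, something like the following sketch may help. It uses only the standard library; the helper names `check_pins` and `matches` are hypothetical. Wheels from the PyTorch index report a local version suffix such as "+cu118", so the comparison strips it.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in requirements.txt, as stated in this README.
PINS = {"torch": "2.0.1", "torchvision": "0.15.2"}


def check_pins(pins):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


def matches(expected, installed):
    """Compare base versions, ignoring a local suffix such as '+cu118'."""
    return installed is not None and installed.split("+")[0] == expected


for name, (expected, installed) in check_pins(PINS).items():
    status = "ok" if matches(expected, installed) else "mismatch"
    print(f"{name}: expected {expected}, installed {installed} -> {status}")
```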
- The runtime device is controlled via the config at config/config.yaml with key vars.device. Default is:
vars:
  device: "cuda" # set to "cpu" to force CPU
- Scripts use torch.device from this config. If CUDA is available and device: "cuda", training/inference will run on the GPU. Otherwise, set device: "cpu".
- Verify CUDA availability on your machine before running training/testing:
python testcuda.py
Expected output (example):
PyTorch version: 2.0.1
CUDA available: True
Device count: 1
Current device: 0
GPU name: NVIDIA GeForce RTX ...
If CUDA available: False, install the correct CUDA-enabled PyTorch wheel and ensure NVIDIA drivers are installed and compatible.
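For reference, a check that produces output of this shape can be sketched as follows. This is an illustration, not the repository's testcuda.py, which may differ; the function name `cuda_report` is hypothetical. The torch calls used (`torch.cuda.is_available`, `device_count`, `current_device`, `get_device_name`) are standard PyTorch APIs.

```python
def cuda_report():
    """Return diagnostic lines describing the PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        return ["PyTorch is not installed"]

    lines = [
        f"PyTorch version: {torch.__version__}",
        f"CUDA available: {torch.cuda.is_available()}",
    ]
    # Device details can only be queried when a CUDA device is usable.
    if torch.cuda.is_available():
        index = torch.cuda.current_device()
        lines += [
            f"Device count: {torch.cuda.device_count()}",
            f"Current device: {index}",
            f"GPU name: {torch.cuda.get_device_name(index)}",
        ]
    return lines


print("\n".join(cuda_report()))
```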
First, make sure that Docker is installed and working properly on the system.
Check the Dockerfile included in the repository. Following the instructions in that file, comment and uncomment the indicated lines to set up the Docker image and container for either training or testing the model; only one of the two runs at a time.
- Clone the repository:
git clone https://github.com/XaXtric7/Terra_Mask.git
- Change to the project directory:
cd Terra_Mask
- Build the image from the Dockerfile:
docker build -t segment_project_image .
- Run the Docker image in a container:
docker run --name segment_container -d segment_project_image
- Copy the output files from the container to the local project directory after execution is complete:
docker cp segment_container:/segment_project/models ./models
docker cp segment_container:/segment_project/logs ./logs
docker cp segment_container:/segment_project/output ./output
- Tidy up:
docker stop segment_container
docker rm segment_container
docker rmi segment_project_image
If Docker is not installed on the system, follow the steps below to set up and run the project without Docker.
- Clone the repository:
git clone https://github.com/XaXtric7/Terra_Mask.git
- Change to the project directory:
cd Terra_Mask
- Set up the programming environment to run the project:
- If using conda (Anaconda or Miniconda), create and activate a conda environment:
conda create --name <environment-name> python=3.9
conda activate <environment-name>
- If using a standalone Python installation, create and activate a virtual environment (the activation command below is for Windows; on Linux/macOS use source <environment-name>/bin/activate):
python -m venv <environment-name>
<environment-name>\Scripts\activate
- Install the dependencies:
pip install -r requirements.txt
- (Optional) Select CPU or CUDA device in config/config.yaml:
vars:
  device: "cuda" # change to "cpu" if no GPU
Run the model training and testing/inference scripts from the project directory. Training first is not required: a simple trained model is provided, so you can run the test and check the outputs before trying to fine-tune the model.
- Run the model training script:
cd src
python train.py
- Run the model test script (requires images and masks):
cd src
python test.py
- Run the model inference script (requires images only; masks are not needed):
cd src
python inference.py
- Verify CUDA/GPU availability (optional but recommended):
python testcuda.py
If CUDA is working, keep vars.device: "cuda". Otherwise, update it to "cpu" in config/config.yaml.
All key hyperparameters and IO paths are controlled via config/config.yaml. Highlights:
dirs:
  data_dir: data
  train_dir: train
  test_dir: test
  image_dir: images
  mask_dir: masks
  model_dir: models
  output_dir: output
  pred_mask_dir: predicted_masks
  pred_plot_dir: prediction_plots
  log_dir: logs
vars:
  file_type: ".tif"
  patch_size: 256
  batch_size: 4
  model_arch: "Unet" # see: https://smp.readthedocs.io/en/latest/models.html
  encoder: "efficientnet-b0" # see: https://smp.readthedocs.io/en/latest/encoders_timm.html
  encoder_weights: "imagenet"
  activation: "softmax2d" # sigmoid for binary, softmax2d for multi-class
  optimizer_choice: "Adam"
  init_lr: 0.0003
  epochs: 20
  device: "cuda" # set to "cpu" if no GPU
  all_classes: ["background", "building", "woodland", "water", "road"]
  train_classes: ["background", "building", "woodland", "water"]
  test_classes: ["background", "building", "water"]

@misc{XaXtric_7:2025,
author = {Sarthak Dharmik},
title = {Terra Mask},
year = {2025},
howpublished = {\url{https://github.com/XaXtric7/Terra_Mask}},
note = {GitHub repository},
publisher = {GitHub}
}
The project is distributed under the MIT License.
@misc{Iakubovskii:2019,
Author = {Pavel Iakubovskii},
Title = {Segmentation Models Pytorch},
Year = {2019},
Publisher = {GitHub},
Journal = {GitHub repository},
Howpublished = {\url{https://github.com/qubvel/segmentation_models.pytorch}}
}
@misc{Souvik:2023,
Author = {Souvik Majumder},
Title = {Land Cover Semantic Segmentation PyTorch},
Year = {2023},
Publisher = {GitHub},
Journal = {GitHub repository},
Howpublished = {\url{https://github.com/souvikmajumder26/Land-Cover-Semantic-Segmentation-PyTorch}}
}