Nextflow pipeline wrapper for Classpose WSI cell classification.
- Simple CSV samplesheet input (just slide paths)
- Automatic OME-TIFF conversion to OpenSlide-compatible format
- Pre-built Docker container with conic and consep models included
- Support for Docker, Singularity, and Apptainer
- GPU acceleration support
- DRS URI support for downloading files from Gen3/NCI CRDC
flowchart TD
subgraph Input
A[Samplesheet CSV]
end
subgraph "Source Resolution"
A --> B{DRS URI?}
B -->|Yes| C[GEN3_DOWNLOAD]
B -->|No| D[Local File]
C --> E[Downloaded File]
end
subgraph "Format Conversion"
D --> F{OME-TIFF?}
E --> F
F -->|Yes| G[VIPS_CONVERT]
F -->|No| H[Passthrough]
G --> I[Converted TIFF]
end
subgraph "Inference"
H --> J[CLASSPOSE_PREDICT_WSI]
I --> J
end
subgraph Output
J --> K[Cell Contours GeoJSON]
J --> L[Cell Centroids GeoJSON]
J --> M[Tissue Contours GeoJSON]
J --> N[Cell Densities CSV]
J --> O[SpatialData Zarr]
end
# Run with Docker
nextflow run main.nf \
--input samplesheet.csv \
--outdir results \
-profile docker
# Run with GPU support
nextflow run main.nf \
--input samplesheet.csv \
--outdir results \
-profile docker,gpu
- Nextflow (>=23.04.0)
- Docker or Singularity or Apptainer
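The Nextflow version requirement above can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: `version_ge` and `INSTALLED_NEXTFLOW_VERSION` are hypothetical names, and the comparison assumes GNU `sort -V` is available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on `sort -V` (GNU coreutils version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="23.04.0"
# Assumption: set this to the version reported by `nextflow -version`
# on your system; the fallback value here is only a placeholder.
installed="${INSTALLED_NEXTFLOW_VERSION:-23.10.1}"

if version_ge "$installed" "$required"; then
    echo "Nextflow $installed satisfies >= $required"
else
    echo "Nextflow $installed is older than $required" >&2
fi
```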
The pipeline uses multiple containers:
| Container | Description |
|---|---|
| `ghcr.io/adamjtaylor/nf-classpose:latest` | Main classpose inference container |
| `ghcr.io/adamjtaylor/nf-classpose-vips:main` | VIPS for OME-TIFF conversion |
| `ghcr.io/adamjtaylor/nf-classpose-gen3:latest` | Gen3 client for DRS downloads |
Create a CSV file with slide paths:
slide_path
/data/slide1.svs
/data/slide2.ome.tiff
/data/slide3.ndpi
Or with optional custom sample IDs:
id,slide_path
patient_001,/data/slide1.svs
patient_002,/data/slide2.ome.tiff
patient_003,/data/slide3.ndpi
| Column | Required | Description |
|---|---|---|
| `slide_path` | Yes | Path to WSI file (.svs, .tiff, .ndpi, etc.) or DRS URI |
| `id` | No | Custom sample ID (if not provided, derived from filename) |
Sample IDs are automatically derived from the slide filename (e.g., slide1.svs → slide1) unless an id column is provided.
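A samplesheet with explicit IDs can be generated with a few lines of shell. This is a sketch under stated assumptions rather than the pipeline's own logic: `derive_id` and `make_samplesheet` are hypothetical helpers, and treating `.ome.tif`/`.ome.tiff` as a single extension is a guess about how IDs should look for OME-TIFF files.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive a sample ID from a slide path by dropping the
# directory and the final extension (slide1.svs -> slide1).
# Assumption: ".ome.tif" and ".ome.tiff" are stripped as one extension.
derive_id() {
    local name
    name="$(basename "$1")"
    case "$name" in
        *.ome.tiff) name="${name%.ome.tiff}" ;;
        *.ome.tif)  name="${name%.ome.tif}" ;;
        *)          name="${name%.*}" ;;
    esac
    printf '%s\n' "$name"
}

# Print a two-column samplesheet (id,slide_path) for the given slide paths.
make_samplesheet() {
    echo "id,slide_path"
    local slide
    for slide in "$@"; do
        printf '%s,%s\n' "$(derive_id "$slide")" "$slide"
    done
}

make_samplesheet /data/slide1.svs /data/slide2.ome.tiff /data/slide3.ndpi
```

Redirect the output to a file (for example `> samplesheet.csv`) and pass it to `--input`. Supplying the `id` column explicitly avoids relying on any particular derivation rule.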
The pipeline supports DRS (Data Repository Service) URIs for downloading files from Gen3-based repositories like NCI CRDC:
slide_path
drs://nci-crdc.datacommons.io/dg.4DFC/624693b0-7e68-11ee-a75b-033941d3e6da
/data/local_slide.svs
To use DRS URIs, you must provide Gen3 credentials:
nextflow run main.nf \
--input samplesheet.csv \
--gen3_credentials ~/.gen3/credentials.json \
-profile docker
OME-TIFF files (.ome.tif, .ome.tiff) are automatically detected and converted to OpenSlide-compatible pyramidal TIFFs using VIPS. The conversion preserves the physical pixel size (mpp) from the OME-XML metadata.
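The routing described in the flowchart (DRS download, OME-TIFF conversion, or passthrough) can be previewed for a samplesheet entry with a small shell function. This is an illustration only: `classify_slide` and its labels are hypothetical, the case-insensitive match is an assumption, and the pipeline's actual detection logic may differ. A DRS entry is only classified as `drs` here, since its file format is not known until after download.

```bash
#!/usr/bin/env bash
# Illustrative only: label a samplesheet entry by the branch it would
# likely take. Function name and labels are hypothetical.
classify_slide() {
    local lower
    # Assumption: matching is case-insensitive.
    lower="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
        drs://*)              echo "drs" ;;          # download first
        *.ome.tif|*.ome.tiff) echo "ome-tiff" ;;     # convert with VIPS
        *)                    echo "passthrough" ;;  # use as-is
    esac
}

for entry in \
    "drs://nci-crdc.datacommons.io/dg.4DFC/624693b0-7e68-11ee-a75b-033941d3e6da" \
    "/data/slide2.ome.tiff" \
    "/data/local_slide.svs"
do
    printf '%s\t%s\n' "$(classify_slide "$entry")" "$entry"
done
```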
| Parameter | Default | Description |
|---|---|---|
| `--input` | required | Path to samplesheet CSV |
| `--outdir` | `results` | Output directory |
| Parameter | Default | Description |
|---|---|---|
| `--models` | `['conic']` | Model(s) to run. Single model: `--models conic` or multiple: `--models conic,consep` (runs each model on all slides) |
Available bundled models: conic, consep
| Parameter | Default | Description |
|---|---|---|
| `--roi_geojson` | `null` | Path to ROI GeoJSON file (applied to all samples) |
GrandQC models are pre-bundled in the container.
| Parameter | Default | Description |
|---|---|---|
| `--tissue_detection_model_path` | (bundled) | Path to GrandQC tissue model |
| `--artefact_detection_model_path` | (bundled) | Path to GrandQC artefact model |
| `--filter_artefacts` | `false` | Enable artefact detection and filter cells in artefact regions (memory-intensive on large slides) |
| Parameter | Default | Description |
|---|---|---|
| `--batch_size` | `8` | Inference batch size |
| `--device` | `null` | Device (`cuda:0`, `mps`, `cpu`) |
| `--bf16` | `false` | Use bfloat16 inference |
| `--tta` | `false` | Enable test-time augmentation |
| Parameter | Default | Description |
|---|---|---|
| `--tile_size` | `1024` | Tile size in pixels |
| `--overlap` | `64` | Tile overlap in pixels |
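To get a feel for how tile size and overlap affect workload, the tile count for a slide can be estimated with integer arithmetic. This is a rough sketch under an assumption the source does not confirm: that tiles advance by a stride of `tile_size - overlap` pixels along each axis. The helper `tiles_along` and the slide dimensions are hypothetical, and the real tiling in Classpose (for example after tissue detection) may produce fewer tiles.

```bash
#!/usr/bin/env bash
# Hypothetical helper: number of tiles needed to cover one axis of length
# $1 with tile size $2 and overlap $3.
# Assumption: consecutive tiles are offset by (tile size - overlap) pixels.
tiles_along() {
    local dim=$1 tile=$2 overlap=$3
    local stride=$(( tile - overlap ))
    if [ "$dim" -le "$tile" ]; then
        echo 1
    else
        # Ceiling division of the remaining length by the stride.
        echo $(( (dim - overlap + stride - 1) / stride ))
    fi
}

# Example dimensions are made up for illustration.
width=100000
height=80000
cols=$(tiles_along "$width" 1024 64)
rows=$(tiles_along "$height" 1024 64)
echo "Estimated tiles: $cols x $rows = $(( cols * rows ))"
```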
| Parameter | Default | Description |
|---|---|---|
| `--output_type` | `csv spatialdata` | Output formats: `csv` (density stats), `spatialdata` (Zarr) |
| Parameter | Default | Description |
|---|---|---|
| `--vips_compression` | `jpeg` | TIFF compression: `jpeg`, `deflate`, `lzw`, `none` |
| Parameter | Default | Description |
|---|---|---|
| `--gen3_credentials` | `null` | Path to Gen3 credentials JSON file |
| `--gen3_profile` | `htan` | Gen3 profile name |
| `--gen3_api_endpoint` | `https://nci-crdc.datacommons.io` | Gen3 API endpoint |
| Profile | Description |
|---|---|
| `docker` | Run with Docker |
| `singularity` | Run with Singularity |
| `apptainer` | Run with Apptainer |
| `gpu` | Enable GPU support (combine with container profile) |
| `tower` | Resource settings for Seqera Platform (Tower) |
| `tower_gpu` | Tower with GPU acceleration |
| `tower_test` | Test profile for Tower using S3 samplesheet |
| `test` | Run with test configuration |
# Basic run with Docker
nextflow run main.nf --input samples.csv -profile docker
# GPU-accelerated run
nextflow run main.nf --input samples.csv -profile docker,gpu
# Run with DRS URIs from NCI CRDC
nextflow run main.nf \
--input drs_samples.csv \
--gen3_credentials ~/.gen3/credentials.json \
-profile docker,gpu
# Run with consep model
nextflow run main.nf \
--input samples.csv \
--models consep \
-profile singularity,gpu
# Run multiple models (matrix)
nextflow run main.nf \
--input samples.csv \
--models conic,consep \
-profile docker,gpu
# Test profile
nextflow run main.nf -profile test,docker
The pipeline produces the following outputs for each sample and model combination:
Results are organized as: {outdir}/{sample_id}/{model}/
| File | Description |
|---|---|
| `{sample_id}_{model}_cell_contours.geojson` | Cell contour polygons |
| `{sample_id}_{model}_cell_centroids.geojson` | Cell centroid points |
| `{sample_id}_{model}_tissue_contours.geojson` | Tissue contours (if tissue detection enabled) |
| `{sample_id}_{model}_artefact_contours.geojson` | Artefact contours (if artefact detection enabled) |
| `{sample_id}_{model}_cell_densities.csv` | Cell density statistics (if `--output_type csv`) |
| `{sample_id}_{model}_spatialdata.zarr` | SpatialData object (if `--output_type spatialdata`) |
When running multiple models, each model's results are stored in separate subdirectories.
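The `{outdir}/{sample_id}/{model}/` layout means a multi-model run yields one directory per sample and model pair. The sketch below lists the directories to expect for a given run. `expected_dirs` is a hypothetical helper, and the sample IDs are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the expected result directories for an output
# directory ($1), a comma-separated model list ($2), and sample IDs ($3...).
expected_dirs() {
    local outdir=$1 models=$2
    shift 2
    local sample model
    local IFS=','   # split the model list on commas, as in --models conic,consep
    for sample in "$@"; do
        for model in $models; do
            printf '%s/%s/%s\n' "$outdir" "$sample" "$model"
        done
    done
}

expected_dirs results conic,consep slide1 slide2
```

Comparing this list against the contents of the output directory is a quick way to spot sample and model combinations that did not complete.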
# Build main classpose container
docker build -t ghcr.io/adamjtaylor/nf-classpose:latest docker/
# Build VIPS conversion container
docker build -t ghcr.io/adamjtaylor/nf-classpose-vips:main -f docker/Dockerfile.vips docker/
# Build Gen3 client container (requires amd64 for gen3-client binary)
docker build --platform linux/amd64 -t ghcr.io/adamjtaylor/nf-classpose-gen3:latest -f docker/Dockerfile.gen3 docker/
MIT