# Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation (WACV 2026)

This is the official repository of our WACV 2026 paper:
Song, L.*, Bishnoi, H.*, Manne, S.K.R., Ostadabbas, S., Taylor, B.J., Wan, M., "Overcoming Small Data Limitations in Video-Based Infant Respiration Estimation" (*equal contribution). 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). [arXiv link]
Here we provide our model code, training checkpoints, and annotated dataset to support automatic estimation of infant respiration waveforms and respiration rate from natural video footage, with the help of spatiotemporal computer vision models and infant-specific region-of-interest tracking.
*Sample dataset preprocessing (demo figure).*

Contents:
- Requirements & Setup
- Quickstart: Inference
- Annotated Infant Respiration Dataset (AIR-400)
- Reproducing Paper Results
- Citation
- License
## Requirements & Setup

1. Create the conda environment:

   ```bash
   conda env create -f environment.yml
   ```

2. Compile the pyflow library and import it as a module (a quick import check follows these steps):

   ```bash
   git clone https://github.com/pathak22/pyflow.git
   (cd pyflow && python setup.py build_ext -i && mv pyflow.cpython-*.so ..)
   ```
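To verify that the compiled module works, here is a minimal smoke test; it is not part of this repo, and the parameter values simply mirror those used in pyflow's own `demo.py`:

```python
# Minimal pyflow smoke test (not part of this repo). Parameter values mirror
# the defaults used in pyflow's demo.py.
import numpy as np
import pyflow

# Two random grayscale "frames": float64 in [0, 1], C-contiguous, with a
# trailing channel axis (colType=1 below selects grayscale input).
im1 = np.ascontiguousarray(np.random.rand(64, 64, 1))
im2 = np.ascontiguousarray(np.random.rand(64, 64, 1))

u, v, im2_warped = pyflow.coarse2fine_flow(
    im1, im2,
    0.012,  # alpha: regularization weight
    0.75,   # ratio: pyramid downsampling ratio
    20,     # minWidth: width of the coarsest pyramid level
    7,      # nOuterFPIterations
    1,      # nInnerFPIterations
    30,     # nSORIterations
    1,      # colType: 0 = RGB, 1 = grayscale
)
print("flow field shape:", u.shape, v.shape)  # (64, 64) each
```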
## Quickstart: Inference

*Sample inference output (demo figure).*

1. Download a trained model and the ROI detector files. Download our demo video, or provide your own video as input.
2. Fill in the `DATA_PATH` fields of a config YAML in the `configs/inference` folder:
   - Set the path for the output directory.
   - Set valid detector paths (YOLO weights) if ROI cropping is enabled; otherwise, set `DO_CROP_INFANT_REGION: False`.
   - Set the input video file or video folder path.
   ```yaml
   DATA_PATH:
     OUTPUT_DIR: /absolute/path/to/output_dir/
     BODY_DETECTOR_PATH: /absolute/path/to/yolov8m.pt
     FACE_DETECTOR_PATH: /absolute/path/to/yolov8n-face.pt
     # Provide exactly one of the following:
     VIDEO_FILE: /absolute/path/to/video.mp4
     # VIDEO_DIR: /absolute/path/to/videos/
   ```

3. Use `run_infer.sh` to preprocess the input video(s) and run a trained model for respiration rate estimation. Specify the required config YAML file path and model checkpoint file path in `run_infer.sh`.
Example run:
```bash
./run_infer.sh
```

Outputs:

- A per-video directory under `OUTPUT_DIR/inference/{video}_{datetime}` containing the prediction result JSON file and generated artifacts (HDF5 time series and PNG waveform plots).
- A summary JSON across all processed videos (`summary_{datetime}.json`).
- Logs saved under `OUTPUT_DIR/logs/`.
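For programmatic inspection of these artifacts, a rough sketch like the following may help; the JSON schema and HDF5 dataset names are not documented here, so it enumerates whatever it finds, and the 30 Hz sampling rate used in the rate estimate is purely an assumption:

```python
# Hedged sketch, not part of the repo: inspect one per-video output folder.
# The JSON schema and HDF5 dataset names are not documented here, so this
# enumerates whatever it finds rather than assuming field names.
import json
import pathlib

import h5py
import numpy as np

inference_dir = pathlib.Path("/absolute/path/to/output_dir/inference")
run_dir = next(p for p in inference_dir.iterdir() if p.is_dir())  # {video}_{datetime}

for json_path in run_dir.glob("*.json"):
    print(json_path.name, json.loads(json_path.read_text()))

for h5_path in run_dir.glob("*.hdf5"):
    with h5py.File(h5_path, "r") as f:
        for name, obj in f.items():
            if not isinstance(obj, h5py.Dataset):
                continue
            sig = np.asarray(obj).squeeze()
            print(f"{h5_path.name}/{name}: shape={sig.shape}")
            if sig.ndim == 1 and sig.size > 1:
                # Rough rate estimate from the dominant FFT frequency,
                # ASSUMING (hypothetically) a 30 Hz waveform sampling rate.
                fs = 30.0
                freqs = np.fft.rfftfreq(sig.size, d=1.0 / fs)
                mag = np.abs(np.fft.rfft(sig - sig.mean()))
                f_peak = freqs[np.argmax(mag[1:]) + 1]  # skip the DC bin
                print(f"  dominant frequency: {f_peak:.2f} Hz"
                      f" = {60.0 * f_peak:.1f} breaths/min")
```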
## Annotated Infant Respiration Dataset (AIR-400)

The AIR-400 dataset consists of two parts:

- **AIR-125**: the original dataset (125 videos from 8 subjects, labeled S01 through S08, with S06, S07, and S08 provided as public web links)
- **AIR-400**: the expanded dataset (275 videos from 10 additional subjects from the same study, labeled S01 through S10; despite the shared labels, these are not the same subjects as in AIR-125)
Each subject directory contains synchronized video files (.mp4) and breathing signal annotations (.hdf5).
In the AIR_125 folder, each subject directory (S01, S02, ... S08) includes paired video and annotation files:
```
AIR_125/
├── S01/
│   ├── 001.mp4
│   ├── 001.hdf5
│   ├── 002.mp4
│   ├── 002.hdf5
│   │   ...
│   ├── n.mp4
│   └── n.hdf5
│
├── S02/
│   ├── 001.mp4
│   ├── 001.hdf5
│   │   ...
└── ...
```
In the AIR_400 folder, annotation files are stored separately inside each subject's out/ directory:
```
AIR_400/
├── S01/
│   ├── 001.mp4
│   ├── 002.mp4
│   ├── 003.mp4
│   │   ...
│   ├── n.mp4
│   │
│   └── out/
│       ├── 001.hdf5
│       ├── 002.hdf5
│       ├── 003.hdf5
│       │   ...
│       └── n.hdf5
│
├── S02/
│   ├── 001.mp4
│   ├── ...
│   └── out/
│       ├── 001.hdf5
│       │   ...
└── ...
```
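As a quick sanity check on a downloaded sample, something along these lines can pair a video with its annotation; the paths are illustrative, and since the internal HDF5 layout is not documented above, the snippet simply lists the datasets each file contains:

```python
# Hedged example with illustrative paths: pair an AIR_400 video with its
# annotation file. (For AIR_125, the .hdf5 sits next to the .mp4 instead of
# under out/.)
import cv2   # pip install opencv-python
import h5py

video_path = "/path/to/AIR_400/S01/001.mp4"
annot_path = "/path/to/AIR_400/S01/out/001.hdf5"

cap = cv2.VideoCapture(video_path)
n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
cap.release()
print(f"video: {n_frames} frames at {fps:.1f} fps")

with h5py.File(annot_path, "r") as f:
    # Print every dataset's path and shape.
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))
```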
## Reproducing Paper Results

1. Set up Weights & Biases logging, and set `USE_WANDB: True` in the YAML file:

   ```bash
   export WANDB_API_KEY=<your_api_key>
   wandb login
   ```

2. Download the AIR-400 dataset and ROI detector files.
3. Fill in the `DATA_PATH` fields of the config YAML:
   ```yaml
   DATA_PATH:
     AIR_125: [air-125-dir-path]
     AIR_400: [air-400-dir-path]
     COHFACE: [cohface-dir-path]
     CACHE_DIR: [your-cache-dir]
     OUTPUT_DIR: [your-output-dir]
     BODY_DETECTOR_PATH: [yolov8-path]
     FACE_DETECTOR_PATH: [yolov8-face-path]
   ```

4. Specify the required config YAML file path in `run.sh`, then uncomment `--preprocess` after `python main.py --config "$CONFIG"` to enable preprocess-only mode. Run this mode first to make sure the dataset is preprocessed correctly before moving on to training and testing:
   ```bash
   ./run.sh
   ```

5. Comment out `--preprocess` after `python main.py --config "$CONFIG"` in `run.sh` to start the training and testing process:
   ```bash
   ./run.sh
   ```

## Citation

```bibtex
@inproceedings{song_bishnoi_overcoming_2026,
  booktitle = {2026 {IEEE}/{CVF} {Winter} {Conference} on {Applications} of {Computer} {Vision} ({WACV})},
  publisher = {IEEE},
  title = {Overcoming {Small} {Data} {Limitations} in {Video}-{Based} {Infant} {Respiration} {Estimation}},
  author = {Song, Liyang and Bishnoi, Hardik and Manne, Sai Kumar Reddy and Ostadabbas, Sarah and Taylor, Brianna J. and Wan, Michael},
  year = {2026},
}
```

## License

This project is licensed under the MIT License. See the LICENSE file for details.