The config files include the data path, optimizer, scheduler, etc.
In each config file:
- `stages/data_params/root`: path to the folder that stores the image data.
- `image_size`: the size of the input images.
Note:
You do not need to change `train_csv` and `valid_csv`, because they are overridden by the bash scripts below.
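The path-style keys above suggest a nested YAML config; a minimal sketch of inspecting one, assuming YAML format and a hypothetical file name:

```python
# Minimal sketch: assumes the configs are YAML files with nested keys such as
# stages/data_params/root. The file name below is hypothetical.
import yaml

with open("config/train_3w.yml") as f:
    cfg = yaml.safe_load(f)

# Folder that stores the image data (point this at your local path).
print(cfg["stages"]["data_params"]["root"])
```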
The following data are used for the different models:
- 3 windows (3w) data: `python src/preprocessing.py extract-images --inputdir <kaggle_input_dir> --outputdir <output_folder>`
- 3 windows (3w) with crop data: `python src/preprocessing_3w.py extract-images --inputdir <kaggle_input_dir> --outputdir <output_folder>`
- 3d data: `python src/preprocessing2.py`
- Start docker:
  `make run`
  `make exec`
  `cd /kaggle-rsna/`
- Train resnet18, resnet34, resnet50, alexnet with the 3 windows (3w) setting: `bash bin/train_bac_3w.sh`
  Note: `normalize=True`
- Train resnet50 with the 3d setting: `bash bin/train_bac_3d.sh`
  Note: `normalize=False`
- Train densenet169 with the 3 windows and crop setting: `bash bin/train_toan.sh` (and `bash bin/train_toan_resume.sh` to resume)
  Note: `normalize=True`
where:
- `CUDA_VISIBLE_DEVICES`: the GPU IDs used for training.
- `LOGDIR`: the output folder that stores the checkpoints, logs, etc.
- `model_name`: the name of the model to be trained. The script supports the model names listed here.
- It is better to create a wandb account; it will help you track your logs, back up the code, and store the checkpoints in the cloud in real time. If you do not want to use wandb, set `WANDB=0`.
An example launch is shown below.
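How these variables reach the training run depends on the scripts in `bin/` (they may be set at the top of each `.sh` file rather than read from the environment). Purely as an illustration, assuming they are picked up from the environment:

```python
# Illustrative only: assumes bin/train_bac_3w.sh reads these variables from the
# environment; if they are hard-coded in the script, edit the script instead.
import os
import subprocess

env = dict(
    os.environ,
    CUDA_VISIBLE_DEVICES="0,1",    # GPUs to train on (example value)
    LOGDIR="./logs/resnet50_3w",   # where checkpoints and logs are written (example value)
    WANDB="0",                     # disable Weights & Biases logging
)
subprocess.run(["bash", "bin/train_bac_3w.sh"], env=env, check=True)
```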
Output:
The best checkpoint is saved at `${LOGDIR}/${log_name}/checkpoints/best.pth`.
Run `python src/inference.py`. Check the function predict_test_tta_ckp for more information; you may want to change the checkpoint path, the model name, and the output path (see the sketch below).
For the 3d setting use `normalization=False`, otherwise use `normalization=True`.
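The exact edits live inside predict_test_tta_ckp; as a rough sketch, these are the three values you typically point at your own run (all names and paths below are hypothetical):

```python
# Illustrative values only; the actual variable names inside src/inference.py differ.
import torch

checkpoint_path = "logs/resnet50_3w/checkpoints/best.pth"  # i.e. ${LOGDIR}/${log_name}/checkpoints/best.pth
model_name = "resnet50"                                    # must match the model that was trained
output_path = "preds/resnet50_3w_test.csv"                 # where the test predictions will be written

ckpt = torch.load(checkpoint_path, map_location="cpu")
print(sorted(ckpt.keys()))  # inspect the checkpoint contents before wiring these into inference.py
```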
In `src/ensemble.py`, change the prediction path for each fold of the model and the name of the output ensemble file, then run:
`python src/ensemble.py`
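For reference, fold averaging amounts to something like the following sketch, assuming each fold's prediction file is a Kaggle-style submission CSV with ID and Label columns (file names are hypothetical):

```python
# Minimal sketch of averaging fold predictions; assumes every CSV has the same
# rows in the same order, with "ID" and "Label" columns (submission format).
import pandas as pd

fold_paths = [
    "preds/densenet169_fold0.csv",
    "preds/densenet169_fold1.csv",
    "preds/densenet169_fold2.csv",
]

preds = [pd.read_csv(p) for p in fold_paths]
ensemble = preds[0][["ID"]].copy()
ensemble["Label"] = sum(df["Label"] for df in preds) / len(preds)  # simple mean over folds
ensemble.to_csv("preds/densenet169_ensemble.csv", index=False)
```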