VOCSeg performs semantic segmentation on the VOC2012 dataset using an FCN-based model.
The VOCSeg model started as a clone of the KittiSeg model, which we modified for multi-class segmentation.
KittiSeg achieved first place on the Kitti Road Detection Benchmark at submission time. Check out their paper for a detailed model description.
The repository contains code for training, evaluating and visualizing semantic segmentation in TensorFlow. It is built to be compatible with the TensorVision back end, which allows you to organize experiments in a very clean way.
The code requires TensorFlow 1.0 as well as the following Python libraries:
- matplotlib
- numpy
- Pillow
- scipy
- commentjson
Those modules can be installed using: pip install numpy scipy pillow matplotlib commentjson or pip install -r requirements.txt.
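To see which of the libraries above are already available before installing anything, a small check like the following can help. This is a minimal sketch and not part of the repository; the helper name `missing_modules` is hypothetical. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util

# Import names of the libraries listed above. Pillow is imported as "PIL".
REQUIRED = ["matplotlib", "numpy", "PIL", "scipy", "commentjson"]


def missing_modules(names=REQUIRED):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

The check only looks at whether a module can be found; it does not verify versions, so the TensorFlow 1.0 requirement still has to be checked separately.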
- Clone this repository: git clone https://github.com/lxh-123/VOCSeg.git
- Initialize all submodules: git submodule update --init --recursive
- [Optional] Download VOC2012 Data:
  - Retrieve the VOC data URL here: baidu.yun
  - Extract it. The extracted files and folders will be: JPEGImages folder, SegmentationClass folder, train.lst, val.lst, test.lst
  - You can download VOC2012.rar by running python download_data.py --voc_url URL_YOU_RETRIEVED, with URL_YOU_RETRIEVED being the URL you retrieved.
The training set contains only 2913 images, and the validation set contains 2906 images.
Step 3 is only required if you want to train your own model using train.py or benchmark a model against the official evaluation score using evaluate.py. Also note that I recommend using download_data.py instead of downloading the data yourself. The script will also extract and prepare the data. See Section Manage data storage if you would like to control where the data is stored.
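If you did extract the data by hand, the layout and image counts described above can be sanity-checked with a short script. This is only a sketch: the function name `check_voc_layout` is hypothetical, and it assumes that each non-empty line of a .lst file describes one sample, which the text above does not state.

```python
from pathlib import Path

# Entries expected after extraction, as listed above.
EXPECTED_DIRS = ["JPEGImages", "SegmentationClass"]
EXPECTED_LISTS = ["train.lst", "val.lst", "test.lst"]


def check_voc_layout(root):
    """Return (missing, counts) for an extracted VOC2012 folder.

    missing: expected folders/files that are absent.
    counts:  number of non-empty lines in each list file that exists
             (assumed to be one sample per line).
    """
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    counts = {}
    for name in EXPECTED_LISTS:
        path = root / name
        if not path.is_file():
            missing.append(name)
            continue
        with path.open() as f:
            counts[name] = sum(1 for line in f if line.strip())
    return missing, counts
```

Under the one-sample-per-line assumption, `counts["train.lst"]` and `counts["val.lst"]` can then be compared with the image numbers given above.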
- Pull all patches: git pull
- Update all submodules: git submodule update --init --recursive
If you forget the second step you might end up with an inconsistent repository state.
Run: python evaluate.py to evaluate a trained model.
Run: python train.py --hypes hypes/VOCSeg.json to train a model using voc2012 Data.
VOCSeg allows you to separate data storage from code. This is very useful in many server environments. By default, the data is stored in the folder VOCSeg/DATA and the output of runs in VOCSeg/RUNS. This behaviour can be changed by setting the bash environment variables $TV_DIR_DATA and $TV_DIR_RUNS.
Include export TV_DIR_DATA="/MY/LARGE/HDD/DATA" in your .profile and all data will be downloaded to /MY/LARGE/HDD/DATA/data_road. Include export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS" in your .profile and all runs will be saved to /MY/LARGE/HDD/RUNS/VOCSeg.
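The lookup described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the TensorVision implementation: the function name `resolve_dirs` is hypothetical, and the subfolders appended below the meta directories (such as VOCSeg under $TV_DIR_RUNS) are not modeled.

```python
import os


def resolve_dirs(env=None, base="VOCSeg"):
    """Return (data_dir, runs_dir), honouring $TV_DIR_DATA and $TV_DIR_RUNS.

    Falls back to <base>/DATA and <base>/RUNS when a variable is unset.
    """
    env = os.environ if env is None else env
    data_dir = env.get("TV_DIR_DATA", os.path.join(base, "DATA"))
    runs_dir = env.get("TV_DIR_RUNS", os.path.join(base, "RUNS"))
    return data_dir, runs_dir


if __name__ == "__main__":
    data_dir, runs_dir = resolve_dirs()
    print("data:", data_dir)
    print("runs:", runs_dir)
```

Running the script before and after editing your .profile is a quick way to confirm that the variables are actually exported in the shell you train from.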
VOCSeg helps you to organize a large number of experiments. To do so, the output of each run is stored in its own rundir. Each rundir contains:
- output.log: a copy of the training output which was printed to your screen
- tensorflow events: tensorboard can be run in rundir
- tensorflow checkpoints: the trained model can be loaded from rundir
- [dir] images: a folder containing example output images. image_iter controls how often the whole validation set is dumped
- [dir] model_files: a copy of all source code needed to build the model. This can be very useful if you have many versions of the model.
To keep track of all the experiments, you can give each rundir a unique name with the --name flag. The --project flag will store the run in a separate subfolder, allowing you to run different series of experiments. As an example, python train.py --project batch_size_bench --name size_5 will use the following dir as rundir: $TV_DIR_RUNS/VOCSeg/batch_size_bench/size_5_VOCSeg_2017_02_08_13.12.
The --nosave flag is very useful for debug runs, since it keeps them from cluttering your rundir.
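The rundir name in the example above appears to be composed of the run name, the hypes name and a timestamp. The following sketch reproduces that pattern; it is inferred from the single example path only, and the function name `build_rundir` and its parameters are hypothetical, not TensorVision's API.

```python
import posixpath
from datetime import datetime


def build_rundir(runs_dir, project, name, hypes_name="VOCSeg", now=None):
    """Compose a rundir path following the pattern of the example above.

    Pattern (an assumption based on that one example):
    <runs_dir>/VOCSeg/<project>/<name>_<hypes_name>_<YYYY_MM_DD_HH.MM>
    """
    now = datetime.now() if now is None else now
    stamp = now.strftime("%Y_%m_%d_%H.%M")
    run_name = "{}_{}_{}".format(name, hypes_name, stamp)
    return posixpath.join(runs_dir, "VOCSeg", project, run_name)


if __name__ == "__main__":
    print(build_rundir("$TV_DIR_RUNS", "batch_size_bench", "size_5",
                       now=datetime(2017, 2, 8, 13, 12)))
```

Because the timestamp is part of the name, repeating a run with the same --project and --name still yields a distinct rundir as long as the runs start in different minutes.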
The model is controlled by the file hypes/VOCSeg.json. Modifying this file should be enough to train the model on your own data and adjust the architecture according to your needs. A description of the expected input format can be found here.
For advanced modifications, the code is controlled by 5 different modules, which are specified in hypes/VOCSeg.json.
"model": {
    "input_file": "../inputs/voc_seg_input.py",
    "architecture_file": "../encoder/fcn8_vgg.py",
    "objective_file": "../decoder/voc_multiloss.py",
    "optimizer_file": "../optimizer/generic_optimizer.py",
    "evaluator_file": "../evals/voc_eval.py"
},
Those modules operate independently. This allows easy experiments with different datasets (input_file), encoder networks (architecture_file), etc. Also see TensorVision for a specification of each of those files.
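To check which files a given hypes file points to, the "model" entries can be resolved to concrete paths. The sketch below assumes that the relative paths are interpreted relative to the folder containing the hypes file; this is an assumption, so consult TensorVision for the authoritative behaviour. The function name `resolve_model_files` is hypothetical, and plain `json` is used here, so a hypes file that contains comments would need `commentjson.load` instead.

```python
import json
import os


def resolve_model_files(hypes_path):
    """Load a hypes file and resolve the paths of its five model modules.

    Assumes paths in "model" are relative to the hypes file's folder.
    """
    with open(hypes_path) as f:
        hypes = json.load(f)  # use commentjson.load if the file has comments
    base = os.path.dirname(os.path.abspath(hypes_path))
    return {
        key: os.path.normpath(os.path.join(base, rel_path))
        for key, rel_path in hypes["model"].items()
    }
```

Swapping the encoder, for example, then amounts to pointing architecture_file at a different file and confirming with this helper that the resolved path exists.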
VOCSeg is built on top of the TensorVision backend. TensorVision modularizes computer vision training and helps organize experiments.
To utilize the entire TensorVision functionality, install it using:
$ cd VOCSeg/submodules/TensorVision
$ python setup.py install
Now you can use the TensorVision command line tools, which include:
tv-train --hypes hypes/VOCSeg.json trains a json model.
tv-continue --logdir PATH/TO/RUNDIR trains the model in RUNDIR, starting from the last saved checkpoint. Can be used for fine tuning by increasing max_steps in model_files/hypes.json .
tv-analyze --logdir PATH/TO/RUNDIR evaluates the model in RUNDIR
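For fine tuning with tv-continue, max_steps in model_files/hypes.json has to be increased first. The sketch below does this programmatically. Since the text above does not say where max_steps sits inside the file, the helper searches the nested structure for that key; both function names are hypothetical. It uses plain `json`, so any comments in the file would be lost on rewrite.

```python
import json


def set_max_steps(hypes, new_max_steps):
    """Set every "max_steps" entry found in a nested hypes dict.

    Returns the number of entries that were updated.
    """
    updated = 0
    for key, value in hypes.items():
        if key == "max_steps":
            hypes[key] = new_max_steps
            updated += 1
        elif isinstance(value, dict):
            updated += set_max_steps(value, new_max_steps)
    return updated


def raise_max_steps(hypes_path, new_max_steps):
    """Rewrite the hypes file at `hypes_path` with a new max_steps value."""
    with open(hypes_path) as f:
        hypes = json.load(f)
    updated = set_max_steps(hypes, new_max_steps)
    if updated == 0:
        raise KeyError("no max_steps entry found in " + hypes_path)
    with open(hypes_path, "w") as f:
        json.dump(hypes, f, indent=4)
    return updated
```

After calling `raise_max_steps("PATH/TO/RUNDIR/model_files/hypes.json", ...)` with a larger value, tv-continue --logdir PATH/TO/RUNDIR resumes from the last checkpoint as described above.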
Here are some flags which will be useful when working with VOCSeg and TensorVision. All flags are available across all scripts.
--hypes : specify which hype-file to use
--logdir : specify which logdir to use
--gpus : specify on which GPUs to run the code
--name : assign a name to the run
--project : assign a project to the run
--nosave : debug run, logdir will be set to debug
In addition, the following TensorVision environment variables will be useful:
$TV_DIR_DATA: specify meta directory for data
$TV_DIR_RUNS: specify meta directory for output
$TV_USE_GPUS: specify default GPU behaviour.
On a cluster it is useful to set $TV_USE_GPUS=force. This makes the --gpus flag mandatory and ensures that each run is executed on the right GPU.
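The effect of $TV_USE_GPUS=force can be pictured with the following sketch. It is an illustration under assumptions, not the TensorVision source: the function name `select_gpus` is hypothetical, and restricting the visible devices through the CUDA environment variable CUDA_VISIBLE_DEVICES is an assumed mechanism.

```python
import os


def select_gpus(gpus_flag, env=None):
    """Apply the --gpus value, refusing to run if it is required but missing.

    gpus_flag: value of --gpus (e.g. "0" or "0,1"), or None if not given.
    """
    env = os.environ if env is None else env
    if gpus_flag is None:
        if env.get("TV_USE_GPUS") == "force":
            raise ValueError("TV_USE_GPUS=force: please pass --gpus explicitly.")
        return None  # leave GPU selection to the framework default
    # Assumed mechanism: restrict the devices visible to CUDA.
    env["CUDA_VISIBLE_DEVICES"] = gpus_flag
    return gpus_flag
```

With force set, a job submitted without --gpus fails immediately instead of silently occupying whichever GPU the framework picks by default.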
Please have a look into the FAQ. Also feel free to open an issue to discuss any questions not covered so far.
If you benefit from this code, please cite the original paper:
@article{teichmann2016multinet,
title={MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving},
author={Teichmann, Marvin and Weber, Michael and Zoellner, Marius and Cipolla, Roberto and Urtasun, Raquel},
journal={arXiv preprint arXiv:1612.07695},
year={2016}
}