This is a comprehensive Python benchmark suite for running performance measurements across the supported backends. The following backends are supported:
- Torch
- Torch-TensorRT [Torchscript]
- Torch-TensorRT [Dynamo]
- Torch-TensorRT [torch_compile]
- TensorRT
Note: For ONNX models, convert the ONNX model to a serialized TensorRT engine first, then use this package.
The benchmark scripts depend on the following Python packages, in addition to the packages listed in requirements.txt:
- Torch-TensorRT
- Torch
- TensorRT
./
├── models
├── perf_run.py
├── hub.py
├── custom_models.py
├── requirements.txt
├── benchmark.sh
└── README.md
- models: Model directory
- perf_run.py: Performance benchmarking script which supports the torch, ts_trt, torch_compile, dynamo, and tensorrt backends
- hub.py: Script to download torchscript models for VGG16, Resnet50, EfficientNet-B0, VIT, HF-BERT
- custom_models.py: Script which includes custom models other than torchvision and timm (e.g. HF BERT)
- utils.py: Utility functions script
- benchmark.sh: Used for internal performance testing of VGG16, Resnet50, EfficientNet-B0, VIT, HF-BERT
Here is the list of CompileSpec options that can be provided directly to compile the PyTorch module:
- --backends: Comma separated string of backends. Eg: torch, torch_compile, dynamo, tensorrt
- --model: Name of the model file (can be a torchscript module or a TensorRT engine ending in the .plan extension). If the backend is dynamo or torch_compile, the input should be a PyTorch module (instead of a torchscript module).
- --model_torch: Name of the PyTorch model file (optional, only necessary if dynamo or torch_compile is a chosen backend)
- --inputs: List of input shapes and dtypes. Eg: (1, 3, 224, 224)@fp32 for Resnet or (1, 128)@int32;(1, 128)@int32 for BERT
- --batch_size: Batch size
- --precision: Comma separated list of precisions used to build the TensorRT engine. Eg: fp32,fp16
- --device: Device ID
- --truncate: Truncate long and double weights in the network in Torch-TensorRT
- --is_trt_engine: Boolean flag to be enabled if the model file provided is a TensorRT engine.
- --report: Path of the output file where the performance summary is written.
Eg:
python perf_run.py --model ${MODELS_DIR}/vgg16_scripted.jit.pt \
--model_torch ${MODELS_DIR}/vgg16_torch.pt \
--precision fp32,fp16 --inputs="(1, 3, 224, 224)@fp32" \
--batch_size 1 \
--backends torch,ts_trt,dynamo,torch_compile,tensorrt \
--report "vgg_perf_bs1.txt"
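The --inputs format used above (shapes and dtypes joined by "@", with multiple inputs separated by ";") can be illustrated with a small parser. This is only a sketch of how such a string might be split; parse_inputs is a hypothetical helper written for this README and is not necessarily how perf_run.py handles the argument.

```python
import ast


def parse_inputs(spec):
    """Split an --inputs style string into a list of (shape, dtype) pairs.

    Hypothetical helper for illustration only; perf_run.py may parse
    this argument differently.
    """
    parsed = []
    for item in spec.split(";"):
        shape_str, sep, dtype = item.strip().partition("@")
        if not sep or not dtype.strip():
            raise ValueError(f"Expected '<shape>@<dtype>', got: {item!r}")
        # literal_eval safely turns "(1, 3, 224, 224)" into a tuple of ints.
        shape = ast.literal_eval(shape_str.strip())
        if isinstance(shape, int):
            # "(1)" evaluates to a bare int, so wrap it into a 1-tuple.
            shape = (shape,)
        parsed.append((tuple(shape), dtype.strip()))
    return parsed


if __name__ == "__main__":
    print(parse_inputs("(1, 3, 224, 224)@fp32"))
    print(parse_inputs("(1, 128)@int32;(1, 128)@int32"))
```

Quoting the whole value on the command line, as in the example above, keeps the shell from interpreting the parentheses and the ";" separator.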
Note:
- Measuring INT8 performance is only supported via a calibration cache file or QAT mode for the torch_tensorrt backend.
- The TensorRT engine filename should end with .plan; otherwise it will be treated as a Torchscript module.
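The filename rule in the note can be pictured as a simple extension check. The sketch below assumes the decision depends only on the .plan suffix; looks_like_trt_engine is a hypothetical name, and the actual check inside perf_run.py may differ.

```python
from pathlib import Path


def looks_like_trt_engine(model_path):
    """Return True when the file name ends with the .plan extension.

    Hypothetical helper for illustration; any other file name would be
    treated as a Torchscript module under the rule described above.
    """
    return Path(model_path).suffix == ".plan"


if __name__ == "__main__":
    for name in ("vgg16.plan", "vgg16_scripted.jit.pt"):
        kind = "TensorRT engine" if looks_like_trt_engine(name) else "Torchscript module"
        print(f"{name}: {kind}")
```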
This tool can benchmark any PyTorch model or torchscript module. As an example, hub.py provides the VGG16, Resnet50, EfficientNet-B0, VIT, and HF-BERT models that we test internally for performance.
The torchscript modules for these models can be generated by running:
python hub.py
Refer to benchmark.sh to see how we run and benchmark these models.