@@ -9,12 +9,10 @@ Here we provide a sample script that can:
1. Convert a TensorFlow SavedModel to a Frozen Graph.
2. Load a Frozen Graph for inference.
3. Time inference loops using the native TensorFlow graph.
- 4. Time inference loops using FP32, FP16, or INT8<sup>1</sup> precision modes from TensorRT.
+ 4. Time inference loops using FP32, FP16, or INT8 precision modes from TensorRT.

We provide some results below, as well as instructions for running this script.

- <sup>1</sup> INT8 mode is a work in progress; please see [INT8 Mode is the Bleeding Edge](#int8-mode-is-the-bleeding-edge) below.
-
## How to Run This Script

### Step 1: Install Prerequisites
@@ -63,41 +61,46 @@ you would run:
```
python tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb \
- --image_file=image.jpg --native --fp32 --fp16 --output_dir=/my/output
+ --image_file=image.jpg --native --fp32 --fp16 --int8 --output_dir=/my/output
```

This will print the predictions for each of the precision modes that were run
(native, which is the native precision of the model passed in, as well
- as the TensorRT version of the graph at precisions of fp32 and fp16):
+ as the TensorRT version of the graph at precisions of fp32, fp16 and int8):
```
INFO:tensorflow:Starting timing.
INFO:tensorflow:Timing loop done!
Predictions:
Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus']
Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
- Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'lakeside, lakeshore', u'sandbar, sand bar', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty']
+ Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
+ Precision: INT8 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus', u'lakeside, lakeshore']
```
The script will generate or append to a file in the output_dir, `log.txt`,
which includes the timing information for each of the models:
```
==========================
- network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
- fps median: 1041.4, mean: 1056.6, uncertainty: 2.8, jitter: 6.1
- latency median: 0.12292, mean: 0.12123, 99th_p: 0.13151, 99th_uncertainty: 0.00024
+ network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+ fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
+ latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027

==========================
- network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
- fps median: 1253.0, mean: 1250.8, uncertainty: 3.4, jitter: 17.3
- latency median: 0.10215, mean: 0.10241, 99th_p: 0.11482, 99th_uncertainty: 0.01109
+ network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+ fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
+ latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083

==========================
- network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
- fps median: 2280.2, mean: 2312.8, uncertainty: 10.3, jitter: 100.1
- latency median: 0.05614, mean: 0.05546, 99th_p: 0.06103, 99th_uncertainty: 0.00781
+ network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+ fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
+ latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019

+ ==========================
+ network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+ fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
+ latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
```
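
One way to read these timings is to compare each mode's median throughput against the native graph. The sketch below is not part of `tensorrt.py`; it assumes only the record layout shown in the log excerpt above, and the names `SAMPLE_LOG`, `RECORD_RE`, `parse_log`, and `speedups` are hypothetical helpers introduced for illustration. The sample values are the updated (`+`) records from this change.

```python
import re

# Records copied from the log.txt excerpt above (updated values only).
SAMPLE_LOG = """\
==========================
network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027

==========================
network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083

==========================
network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019

==========================
network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
"""

# Assumed layout: a "network:" line followed directly by an "fps median:" line.
RECORD_RE = re.compile(
    r"network:\s*(?P<network>\S+),\s*batchsize\s+\d+,\s*steps\s+\d+\s*\n"
    r"\s*fps\s+median:\s*(?P<fps>[\d.]+)"
)


def parse_log(text):
    """Return {network_name: median_fps} for every record found in text."""
    return {m.group("network"): float(m.group("fps"))
            for m in RECORD_RE.finditer(text)}


def speedups(fps_by_network, baseline_prefix="native_"):
    """Return each network's median fps divided by the native graph's."""
    baseline = next(fps for name, fps in fps_by_network.items()
                    if name.startswith(baseline_prefix))
    return {name: fps / baseline for name, fps in fps_by_network.items()}


if __name__ == "__main__":
    for name, ratio in sorted(speedups(parse_log(SAMPLE_LOG)).items()):
        print("{}: {:.2f}x".format(name, ratio))
```

On the sample records this gives roughly 1.34x for both FP32 and FP16 and about 2.91x for INT8 relative to native. These ratios describe only the run logged above; results will differ by GPU and model. To use it on a real run, read the contents of `log.txt` from the output directory and pass the text to `parse_log`.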
The script will also output the GraphDefs used for each of the modes run,
@@ -106,22 +109,14 @@ for future use and inspection:
```
ls /my/output
log.txt
- tftrt_fp16_imagenet_frozen_graph.pb
- tftrt_fp32_imagenet_frozen_graph.pb
+ tftrt_fp16_resnetv2_imagenet_frozen_graph.pb
+ tftrt_fp32_resnetv2_imagenet_frozen_graph.pb
+ tftrt_int8_calib_resnetv2_imagenet_frozen_graph.pb
+ tftrt_int8_resnetv2_imagenet_frozen_graph.pb
```
## Troubleshooting and Notes

- ### INT8 Mode is the Bleeding Edge
-
- Note that currently, INT8 mode results in a segfault using the models provided.
- We are working on it.
-
- ```
- E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: Network.cpp::addScale::118, condition: shift.count == 0 || shift.count == weightCount
- Segmentation fault (core dumped)
- ```
-
### GPU/Precision Compatibility
Not all GPUs support the ops required for all precisions. For example, P100s