Commit f7f2bf4

aaroey authored and karmel committed
Fix TensorRT test output and add int8 result. (tensorflow#4004)
* Fix TensorRT test output and add int8 result.
* Fix comments
* Removing footnote declaration as well
1 parent e8863ed commit f7f2bf4


1 file changed: +22 -27 lines changed


research/tensorrt/README.md

Lines changed: 22 additions & 27 deletions
@@ -9,12 +9,10 @@ Here we provide a sample script that can:
 1. Convert a TensorFlow SavedModel to a Frozen Graph.
 2. Load a Frozen Graph for inference.
 3. Time inference loops using the native TensorFlow graph.
-4. Time inference loops using FP32, FP16, or INT8<sup>1</sup> precision modes from TensorRT.
+4. Time inference loops using FP32, FP16, or INT8 precision modes from TensorRT.
 
 We provide some results below, as well as instructions for running this script.
 
-<sup>1</sup> INT8 mode is a work in progress; please see [INT8 Mode is the Bleeding Edge](#int8-mode-is-the-bleeding-edge) below.
-
 ## How to Run This Script
 
 ### Step 1: Install Prerequisites
@@ -63,41 +61,46 @@ you would run:
 
 ```
 python tensorrt.py --frozen_graph=resnetv2_imagenet_frozen_graph.pb \
---image_file=image.jpg --native --fp32 --fp16 --output_dir=/my/output
+--image_file=image.jpg --native --fp32 --fp16 --int8 --output_dir=/my/output
 ```
 
 This will print the predictions for each of the precision modes that were run
 (native, which is the native precision of the model passed in, as well
-as the TensorRT version of the graph at precisions of fp32 and fp16):
+as the TensorRT version of the graph at precisions of fp32, fp16 and int8):
 
 ```
 INFO:tensorflow:Starting timing.
 INFO:tensorflow:Timing loop done!
 Predictions:
 Precision: native [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus']
 Precision: FP32 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
-Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'lakeside, lakeshore', u'sandbar, sand bar', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty']
+Precision: FP16 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'lakeside, lakeshore', u'sandbar, sand bar']
+Precision: INT8 [u'seashore, coast, seacoast, sea-coast', u'promontory, headland, head, foreland', u'breakwater, groin, groyne, mole, bulwark, seawall, jetty', u'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus', u'lakeside, lakeshore']
 ```
 
 The script will generate or append to a file in the output_dir, `log.txt`,
 which includes the timing information for each of the models:
 
 ```
 ==========================
-network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
-fps median: 1041.4, mean: 1056.6, uncertainty: 2.8, jitter: 6.1
-latency median: 0.12292, mean: 0.12123, 99th_p: 0.13151, 99th_uncertainty: 0.00024
+network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
+latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027
 
 ==========================
-network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
-fps median: 1253.0, mean: 1250.8, uncertainty: 3.4, jitter: 17.3
-latency median: 0.10215, mean: 0.10241, 99th_p: 0.11482, 99th_uncertainty: 0.01109
+network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
+latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083
 
 ==========================
-network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
-fps median: 2280.2, mean: 2312.8, uncertainty: 10.3, jitter: 100.1
-latency median: 0.05614, mean: 0.05546, 99th_p: 0.06103, 99th_uncertainty: 0.00781
+network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
+latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019
 
+==========================
+network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
+fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
+latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
 ```
 
 The script will also output the GraphDefs used for each of the modes run,
@@ -106,22 +109,14 @@ for future use and inspection:
 ```
 ls /my/output
 log.txt
-tftrt_fp16_imagenet_frozen_graph.pb
-tftrt_fp32_imagenet_frozen_graph.pb
+tftrt_fp16_resnetv2_imagenet_frozen_graph.pb
+tftrt_fp32_resnetv2_imagenet_frozen_graph.pb
+tftrt_int8_calib_resnetv2_imagenet_frozen_graph.pb
+tftrt_int8_resnetv2_imagenet_frozen_graph.pb
 ```
 
 ## Troubleshooting and Notes
 
-### INT8 Mode is the Bleeding Edge
-
-Note that currently, INT8 mode results in a segfault using the models provided.
-We are working on it.
-
-```
-E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: Network.cpp::addScale::118, condition: shift.count == 0 || shift.count == weightCount
-Segmentation fault (core dumped)
-```
-
 ### GPU/Precision Compatibility
 
 Not all GPUs support the ops required for all precisions. For example, P100s
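
The timing blocks that this commit writes into the README's `log.txt` excerpt can be compared programmatically. The following is a minimal sketch in Python, assuming the log keeps the layout shown in the diff (a `network:` line followed by an `fps median:` line). `parse_fps` and `speedups` are hypothetical helper names introduced here for illustration; they are not part of `tensorrt.py`.

```python
import re

# Sample text copied from the updated log.txt excerpt in the diff.
LOG = """\
==========================
network: native_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 468.2, mean: 469.0, uncertainty: 0.3, jitter: 1.6
latency median: 0.27336, mean: 0.27290, 99th_p: 0.27475, 99th_uncertainty: 0.00027

==========================
network: tftrt_fp32_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 627.7, mean: 628.9, uncertainty: 0.5, jitter: 3.6
latency median: 0.20392, mean: 0.20354, 99th_p: 0.20608, 99th_uncertainty: 0.00083

==========================
network: tftrt_fp16_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 626.8, mean: 628.8, uncertainty: 0.5, jitter: 3.1
latency median: 0.20421, mean: 0.20359, 99th_p: 0.20555, 99th_uncertainty: 0.00019

==========================
network: tftrt_int8_resnetv2_imagenet_frozen_graph.pb, batchsize 128, steps 100
fps median: 1362.4, mean: 1368.1, uncertainty: 2.2, jitter: 14.4
latency median: 0.09396, mean: 0.09359, 99th_p: 0.09546, 99th_uncertainty: 0.00021
"""

# Assumed layout: "network: <name>, batchsize <n>, steps <n>" followed by
# "fps median: <value>, ...". Leading whitespace on either line is tolerated.
BLOCK_RE = re.compile(
    r"network:\s*(?P<network>\S+),\s*batchsize\s+\d+,\s*steps\s+\d+\s*"
    r"fps\s+median:\s*(?P<fps>[\d.]+)"
)


def parse_fps(log_text):
    """Return {network name: median fps} for every timing block found."""
    return {
        m.group("network"): float(m.group("fps"))
        for m in BLOCK_RE.finditer(log_text)
    }


def speedups(fps_by_network, baseline_prefix="native_"):
    """Return each network's median fps divided by the native baseline's."""
    baseline = next(
        fps for name, fps in fps_by_network.items()
        if name.startswith(baseline_prefix)
    )
    return {name: fps / baseline for name, fps in fps_by_network.items()}


if __name__ == "__main__":
    for name, ratio in speedups(parse_fps(LOG)).items():
        print(f"{name}: {ratio:.2f}x native median fps")
```

On the numbers in this diff, INT8 comes out at about 2.9x the native median throughput, while FP32 and FP16 both land near 1.3x. Because the README says the script appends to `log.txt`, a real log may contain several runs of the same network; this sketch keeps only the last block seen for each name.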
