Commit c2575d1

mingfeima and Svetlana Karslioglu authored
Update memory_format_tutorial.py (#1925)
* Update memory_format_tutorial.py
  Add perf gain data for CPU Channels last part.
* Update memory_format_tutorial.py
  specify that channels last applies to both cpu and gpu.

Co-authored-by: Svetlana Karslioglu <[email protected]>
1 parent bc11e8f commit c2575d1

1 file changed

Lines changed: 7 additions & 1 deletion

intermediate_source/memory_format_tutorial.py

@@ -151,7 +151,8 @@
 ######################################################################
 # Performance Gains
 # --------------------------------------------------------------------
-# The most significant performance gains are observed on NVidia's
+# Channels last memory format optimizations are available on both GPU and CPU.
+# On GPU, the most significant performance gains are observed on NVidia's
 # hardware with Tensor Cores support running on reduced precision
 # (``torch.float16``).
 # We were able to archive over 22% perf gains with channels last
@@ -240,6 +241,11 @@
 # ``alexnet``, ``mnasnet0_5``, ``mnasnet0_75``, ``mnasnet1_0``, ``mnasnet1_3``, ``mobilenet_v2``, ``resnet101``, ``resnet152``, ``resnet18``, ``resnet34``, ``resnet50``, ``resnext50_32x4d``, ``shufflenet_v2_x0_5``, ``shufflenet_v2_x1_0``, ``shufflenet_v2_x1_5``, ``shufflenet_v2_x2_0``, ``squeezenet1_0``, ``squeezenet1_1``, ``vgg11``, ``vgg11_bn``, ``vgg13``, ``vgg13_bn``, ``vgg16``, ``vgg16_bn``, ``vgg19``, ``vgg19_bn``, ``wide_resnet101_2``, ``wide_resnet50_2``
 #
 
+######################################################################
+# The following list of models has the full support of Channels last and showing 26%-76% perf gains on Intel(R) Xeon(R) Ice Lake (or newer) CPUs:
+# ``alexnet``, ``densenet121``, ``densenet161``, ``densenet169``, ``googlenet``, ``inception_v3``, ``mnasnet0_5``, ``mnasnet1_0``, ``resnet101``, ``resnet152``, ``resnet18``, ``resnet34``, ``resnet50``, ``resnext101_32x8d``, ``resnext50_32x4d``, ``shufflenet_v2_x0_5``, ``shufflenet_v2_x1_0``, ``squeezenet1_0``, ``squeezenet1_1``, ``vgg11``, ``vgg11_bn``, ``vgg13``, ``vgg13_bn``, ``vgg16``, ``vgg16_bn``, ``vgg19``, ``vgg19_bn``, ``wide_resnet101_2``, ``wide_resnet50_2``
+#
+
 ######################################################################
 # Converting existing models
 # --------------------------
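
As a reminder of what the channels last format referred to in this change actually alters, here is a minimal pure-Python sketch of how element strides differ between a contiguous NCHW tensor and a channels last (NHWC-ordered) one. It is not part of the commit and does not use PyTorch; the helper names `contiguous_strides`, `channels_last_strides`, and `offset` are hypothetical and exist only for illustration.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of PyTorch.

def contiguous_strides(shape):
    """Strides (in elements) for a 4D tensor stored in plain NCHW order."""
    n, c, h, w = shape
    return (c * h * w, h * w, w, 1)


def channels_last_strides(shape):
    """Strides (in elements) for the same logical NCHW shape stored as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Linear memory offset of a logical (n, c, h, w) index."""
    return sum(i * s for i, s in zip(index, strides))


shape = (10, 3, 32, 32)
print(contiguous_strides(shape))     # (3072, 1024, 32, 1)
print(channels_last_strides(shape))  # (3072, 1, 96, 3)

# Distance in memory between two channels of the same pixel:
same_pixel_next_channel = (0, 1, 0, 0)
print(offset(same_pixel_next_channel, contiguous_strides(shape)))     # 1024
print(offset(same_pixel_next_channel, channels_last_strides(shape)))  # 1
```

In the channels last layout all channels of one pixel sit next to each other in memory, which is the access pattern that the GPU and CPU kernels mentioned in the diff are meant to exploit.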
