README.md: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ users as quickly as possible.
## 1. Mixed Precision

-### amp: Automatic Mixed Precision
+### Amp: Automatic Mixed Precision

`apex.amp` is a tool to enable mixed precision training by changing only 3 lines of your script.
Users can easily experiment with different pure and mixed precision training modes by supplying
@@ -27,7 +27,7 @@ different flags to `amp.initialize`.
[DCGAN example coming soon...](https://github.com/NVIDIA/apex/tree/master/examples/dcgan)

-[Moving to the new Amp API] (for users of the deprecated tools formerly called "Amp" and "FP16_Optimizer")
+[Moving to the new Amp API](https://nvidia.github.io/apex/amp.html#transition-guide-for-old-api-users) (for users of the deprecated tools formerly called "Amp" and "FP16_Optimizer")
Amp allows users to easily experiment with different pure and mixed precision modes, including
-pure FP16 training and pure FP32 training. Commonly-used default modes are chosen by
-selecting an "optimization level" or ``opt_level``; each ``opt_level`` establishes a set of
-properties that govern Amp's implementation of pure or mixed precision training.
-Finer-grained control of how a given ``opt_level`` behaves can be achieved by passing values for
-particular properties directly to ``amp.initialize``. These manually specified values will
-override the defaults established by the ``opt_level``.
-
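The override rule described in the paragraph above can be sketched in plain Python. This is only an illustration under stated assumptions: `OPT_LEVEL_DEFAULTS` and `resolve_properties` are hypothetical names invented for this example, and the default values shown for the level are illustrative rather than copied from Amp. Only the property names and the rule that manually specified values win come from the text.

```python
# Hypothetical defaults for one optimization level. The property names come
# from the documentation; the values are illustrative assumptions.
OPT_LEVEL_DEFAULTS = {
    "O2": {
        "cast_model_type": "float16",
        "patch_torch_functions": False,
        "keep_batchnorm_fp32": True,
        "master_weights": True,
        "loss_scale": "dynamic",
    },
}


def resolve_properties(opt_level, **overrides):
    """Return the defaults for opt_level with explicit overrides applied."""
    defaults = OPT_LEVEL_DEFAULTS[opt_level]
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise ValueError(f"unknown properties: {unknown}")
    # Later keys win, so manually specified values replace the defaults.
    return {**defaults, **overrides}


# Keep every default of the level except the loss scale, which is fixed by hand.
props = resolve_properties("O2", loss_scale=128.0)
print(props["loss_scale"], props["master_weights"])
```

With Amp itself, the equivalent step would be to pass the same keyword alongside ``opt_level`` in the call to ``amp.initialize``.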
Properties
**********

Currently, the under-the-hood properties that govern pure or mixed precision training are the following:

- ``cast_model_type``: Casts your model's parameters and buffers to the desired type.
- ``patch_torch_functions``: Patch all Torch functions and Tensor methods to perform Tensor Core-friendly ops like GEMMs and convolutions in FP16, and any ops that benefit from FP32 precision in FP32.
-- ``keep_batchnorm_fp32``: To enhance precision and enable cudnn batchnorm (which improves performance), it's often beneficial to keep batchnorms in particular in FP32 even if the rest of the model is FP16.
+- ``keep_batchnorm_fp32``: To enhance precision and enable cudnn batchnorm (which improves performance), it's often beneficial to keep batchnorm weights in FP32 even if the rest of the model is FP16.
- ``master_weights``: Maintain FP32 master weights to accompany any FP16 model weights. FP32 master weights are stepped by the optimizer to enhance precision and capture small gradients.
-- ``loss_scale``: If ``loss_scale`` is a float value, use this value as the static (fixed) loss scale. If ``loss_scale`` is the string ``"dynamic"``, adapatively adjust the loss scale over time. Dynamic loss scale adjustments are performed by Amp automatically.
+- ``loss_scale``: If ``loss_scale`` is a float value, use this value as the static (fixed) loss scale. If ``loss_scale`` is the string ``"dynamic"``, adaptively adjust the loss scale over time. Dynamic loss scale adjustments are performed by Amp automatically.
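
To make the ``loss_scale`` property more concrete, the following standard-library sketch shows why a loss scale helps in FP16 and one common way a dynamic scale can be adjusted. It is not Amp's implementation: `to_fp16` and `DynamicLossScaler` are hypothetical names, and the initial scale, growth factor, and window length are assumed values chosen for illustration.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# A very small gradient underflows to zero in FP16, but survives if the loss
# (and therefore the gradient) is multiplied by a scale factor first.
tiny_grad = 1e-8
unscaled = to_fp16(tiny_grad)
scaled = to_fp16(tiny_grad * 1024.0)


class DynamicLossScaler:
    """Assumed policy: shrink the scale on overflow, grow it after a clean run."""

    def __init__(self, init_scale=2.0 ** 16, factor=2.0, window=2000):
        self.scale = init_scale
        self.factor = factor
        self.window = window
        self.good_steps = 0

    def update(self, grads):
        """Return True if the optimizer step should be applied."""
        if any(not math.isfinite(g) for g in grads):
            # Overflow: skip this step and retry later with a smaller scale.
            self.scale /= self.factor
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps >= self.window:
            # A full window without overflow: try a larger scale.
            self.scale *= self.factor
            self.good_steps = 0
        return True


scaler = DynamicLossScaler(init_scale=8.0, window=2)
history = [scaler.update(g) for g in ([float("inf")], [0.5], [0.25])]
print(unscaled, scaled != 0.0, history, scaler.scale)
```

A static loss scale corresponds to keeping `scale` fixed at a user-supplied float; the dynamic variant trades a few skipped steps for not having to pick that value by hand.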
Again, you often don't need to specify these properties by hand. Instead, select an ``opt_level``,
which will set them up for you. After selecting an ``opt_level``, you can optionally pass property
@@ -85,7 +90,7 @@ Your incoming model should be FP32 already, so this is likely a no-op.