.. note:: Quantization in PyTorch 2 export is still a work in progress.
Prerequisites:
^^^^^^^^^^^^^^^^

Required:

- `Quantization concepts in PyTorch <https://pytorch.org/docs/master/quantization.html#quantization-api-summary>`__
- `(prototype) PyTorch 2 Export Post Training Quantization <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html>`__

Optional:

Introduction
^^^^^^^^^^^^^

`(prototype) PyTorch 2 Export Post Training Quantization <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html>`__ introduced the overall API for PyTorch 2 export quantization. The main API difference from FX graph mode quantization is that we made it explicit that quantization targets a specific backend. To use the new flow, a backend needs to implement a ``Quantizer`` class that encodes:

(1). What quantized operators or patterns are supported in the backend.

(2). How users can express the way they want their floating point model to be quantized, for example, quantize the whole model with int8 symmetric quantization, or quantize only linear layers, etc.

Please see `here <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html#motivation-of-pytorch-2-export-quantization>`__ for the motivation behind the new API and ``Quantizer``.
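
To make these two responsibilities concrete, below is a minimal sketch of such a backend ``Quantizer``. It assumes the prototype ``torch.ao.quantization.quantizer`` API; the ``BackendQuantizer`` name and the linear-only pattern matching are illustrative, not a shipped backend::

    import torch
    from torch.ao.quantization.observer import HistogramObserver
    from torch.ao.quantization.quantizer import (
        QuantizationAnnotation,
        QuantizationSpec,
        Quantizer,
    )

    class BackendQuantizer(Quantizer):
        """Illustrative quantizer: annotate ``aten.linear`` for static int8."""

        def annotate(self, model: torch.fx.GraphModule) -> torch.fx.GraphModule:
            # One symmetric int8 spec reused for activation and weight here;
            # a real backend would pick separate, carefully tuned specs.
            int8_spec = QuantizationSpec(
                dtype=torch.int8,
                quant_min=-128,
                quant_max=127,
                qscheme=torch.per_tensor_symmetric,
                is_dynamic=False,
                observer_or_fake_quant_ctr=HistogramObserver.with_args(eps=2**-12),
            )
            for node in model.graph.nodes:
                # (1) the supported pattern: a single functional linear op
                if node.op == "call_function" and node.target == torch.ops.aten.linear.default:
                    input_node, weight_node = node.args[0], node.args[1]
                    # (2) how the pattern should be quantized, recorded on the node
                    node.meta["quantization_annotation"] = QuantizationAnnotation(
                        input_qspec_map={input_node: int8_spec, weight_node: int8_spec},
                        output_qspec=int8_spec,
                        _annotated=True,
                    )
            return model

        def validate(self, model: torch.fx.GraphModule) -> None:
            # Optionally verify that the annotated graph is one the backend can lower.
            pass

        @classmethod
        def get_supported_operators(cls) -> list:
            # Advertise supported operator configs; empty for this sketch.
            return []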

An existing quantizer object defined for ``XNNPACK`` is in `XNNPACKQuantizer <https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/quantizer/xnnpack_quantizer.py>`_.

Conclusion
^^^^^^^^^^^^^^^^^^^

With this tutorial, we introduced the new quantization path in PyTorch 2. Users can learn how to define a ``BackendQuantizer`` with the ``QuantizationAnnotation API`` and integrate it into the PyTorch 2 Export Quantization flow.
Examples of ``QuantizationSpec``, ``SharedQuantizationSpec``, ``FixedQParamsQuantizationSpec``, and ``DerivedQuantizationSpec``
are given for specific annotation use cases. You can use `XNNPACKQuantizer <https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/quantizer/xnnpack_quantizer.py>`_ as an example to start implementing your own ``Quantizer``. After that, please follow `this tutorial <https://pytorch.org/tutorials/prototype/pt2e_quant_ptq.html>`_ to actually quantize your model.
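
For orientation, here is a compressed sketch of how a custom quantizer plugs into that flow. ``BackendQuantizer`` is the illustrative class sketched earlier, and ``capture_pre_autograd_graph`` is the prototype-era export entry point, which may have moved in newer releases::

    import torch
    from torch._export import capture_pre_autograd_graph
    from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

    model = torch.nn.Sequential(torch.nn.Linear(16, 8)).eval()
    example_inputs = (torch.randn(1, 16),)

    # Export to an ATen-level graph, then insert observers according to the
    # quantizer's annotations.
    exported = capture_pre_autograd_graph(model, example_inputs)
    prepared = prepare_pt2e(exported, BackendQuantizer())

    # Calibrate with representative data, then lower observers to
    # quantize/dequantize ops.
    prepared(*example_inputs)
    quantized = convert_pt2e(prepared)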