Commit cec1e9d

gpshead, tiran, and mdickinson authored
[3.9] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96502)
* Correctly pre-check for int-to-str conversion (#96537)

Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:

```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:

$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that

$$\frac{M}{3L} < \frac{s-1}{10}$$

hence that

$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$

So

$$2^{L(s-1)} > 10^M.$$

But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

Co-authored-by: Gregory P. Smith [Google LLC] <[email protected]>
Co-authored-by: Christian Heimes <[email protected]>
Co-authored-by: Mark Dickinson <[email protected]>
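
The inequality argument in the commit message can be spot-checked numerically. The following is a minimal Python sketch, not the C implementation: the function names are hypothetical, `PYLONG_SHIFT = 30` is an assumption about a typical 64-bit build, and the sketch only covers `size_a >= 11`, where Python's floor division agrees with C's truncating division.

```python
PYLONG_SHIFT = 30  # assumption: bits per internal digit on a typical 64-bit build


def precheck_rejects(max_str_digits, size_a, shift=PYLONG_SHIFT):
    """Mirror the C expression for size_a >= 11, where // matches C division."""
    if size_a < 11:
        raise ValueError("sketch only covers size_a >= 11")
    return max_str_digits // (3 * shift) <= (size_a - 11) // 10


def smallest_magnitude(size_a, shift=PYLONG_SHIFT):
    """Smallest |a| that occupies size_a internal digits: 2**(shift*(size_a-1))."""
    return 1 << (shift * (size_a - 1))


def never_rejects_below_limit(max_str_digits, sizes, shift=PYLONG_SHIFT):
    """True if every size the pre-check rejects implies |a| > 10**max_str_digits."""
    bound = 10 ** max_str_digits
    return all(
        smallest_magnitude(size_a, shift) > bound
        for size_a in sizes
        if precheck_rejects(max_str_digits, size_a, shift)
    )
```

With the default limit of 4300 and 30-bit digits, the pre-check in this sketch starts rejecting at `size_a == 481`; smaller values are left to the exact check that already existed.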
1 parent d348afa commit cec1e9d

27 files changed: +886 -19 lines changed

Doc/data/python3.9.abi

+4-1
Original file line number | Diff line number | Diff line change
@@ -5653,7 +5653,7 @@
56535653
<var-decl name='id' type-id='type-id-238' visibility='default' filepath='./Include/cpython/pystate.h' line='137' column='1'/>
56545654
</data-member>
56555655
</class-decl>
5656-
<class-decl name='_is' size-in-bits='45184' is-struct='yes' visibility='default' filepath='./Include/internal/pycore_interp.h' line='71' column='1' id='type-id-313'>
5656+
<class-decl name='_is' size-in-bits='45248' is-struct='yes' visibility='default' filepath='./Include/internal/pycore_interp.h' line='71' column='1' id='type-id-313'>
56575657
<data-member access='public' layout-offset-in-bits='0'>
56585658
<var-decl name='next' type-id='type-id-314' visibility='default' filepath='./Include/internal/pycore_interp.h' line='73' column='1'/>
56595659
</data-member>
@@ -5774,6 +5774,9 @@
57745774
<data-member access='public' layout-offset-in-bits='28416'>
57755775
<var-decl name='small_ints' type-id='type-id-326' visibility='default' filepath='./Include/internal/pycore_interp.h' line='155' column='1'/>
57765776
</data-member>
5777+
<data-member access='public' layout-offset-in-bits='45184'>
5778+
<var-decl name='int_max_str_digits' type-id='type-id-8' visibility='default' filepath='./Include/internal/pycore_interp.h' line='158' column='1'/>
5779+
</data-member>
57775780
</class-decl>
57785781
<pointer-type-def type-id='type-id-313' size-in-bits='64' id='type-id-314'/>
57795782
<class-decl name='pyruntimestate' size-in-bits='5248' is-struct='yes' visibility='default' filepath='./Include/internal/pycore_runtime.h' line='52' column='1' id='type-id-327'>

Doc/library/functions.rst

+8
@@ -844,6 +844,14 @@ are always available. They are listed here in alphabetical order.
844844
.. versionchanged:: 3.8
845845
Falls back to :meth:`__index__` if :meth:`__int__` is not defined.
846846

847+
.. versionchanged:: 3.9.14
848+
:class:`int` string inputs and string representations can be limited to
849+
help avoid denial of service attacks. A :exc:`ValueError` is raised when
850+
the limit is exceeded while converting a string *x* to an :class:`int` or
851+
when converting an :class:`int` into a string would exceed the limit.
852+
See the :ref:`integer string conversion length limitation
853+
<int_max_str_digits>` documentation.
854+
847855

848856
.. function:: isinstance(object, classinfo)
849857

Doc/library/json.rst

+11
@@ -18,6 +18,11 @@ is a lightweight data interchange format inspired by
1818
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
1919
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).
2020

21+
.. warning::
22+
Be cautious when parsing JSON data from untrusted sources. A malicious
23+
JSON string may cause the decoder to consume considerable CPU and memory
24+
resources. Limiting the size of data to be parsed is recommended.
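
One way to follow the size recommendation in this warning is to reject oversized payloads before the decoder sees them. The sketch below is illustrative only: `loads_bounded` and the `MAX_JSON_BYTES` budget are hypothetical names and values, not part of the :mod:`json` API.

```python
import json

MAX_JSON_BYTES = 1_000_000  # hypothetical budget; choose one that fits your application


def loads_bounded(data, max_bytes=MAX_JSON_BYTES):
    """Decode JSON only if the encoded payload fits within max_bytes."""
    raw = data.encode("utf-8") if isinstance(data, str) else bytes(data)
    if len(raw) > max_bytes:
        raise ValueError(f"JSON payload of {len(raw)} bytes exceeds {max_bytes}")
    return json.loads(raw)
```

For example, `loads_bounded('{"a": 1}')` decodes normally, while a payload larger than the budget raises `ValueError` without being parsed.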
25+
2126
:mod:`json` exposes an API familiar to users of the standard library
2227
:mod:`marshal` and :mod:`pickle` modules.
2328

@@ -248,6 +253,12 @@ Basic Usage
248253
be used to use another datatype or parser for JSON integers
249254
(e.g. :class:`float`).
250255

256+
.. versionchanged:: 3.9.14
257+
The default *parse_int* of :func:`int` now limits the maximum length of
258+
the integer string via the interpreter's :ref:`integer string
259+
conversion length limitation <int_max_str_digits>` to help avoid denial
260+
of service attacks.
261+
251262
*parse_constant*, if specified, will be called with one of the following
252263
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
253264
This can be used to raise an exception if invalid JSON numbers

Doc/library/stdtypes.rst

+159
@@ -5244,6 +5244,165 @@ types, where they are relevant. Some of these are not reported by the
52445244
[<class 'bool'>]
52455245

52465246

5247+
.. _int_max_str_digits:
5248+
5249+
Integer string conversion length limitation
5250+
===========================================
5251+
5252+
CPython has a global limit for converting between :class:`int` and :class:`str`
5253+
to mitigate denial of service attacks. This limit *only* applies to decimal or
5254+
other non-power-of-two number bases. Hexadecimal, octal, and binary conversions
5255+
are unlimited. The limit can be configured.
5256+
5257+
The :class:`int` type in CPython is an arbitrary length number stored in binary
5258+
form (commonly known as a "bignum"). There exists no algorithm that can convert
5259+
a string to a binary integer or a binary integer to a string in linear time,
5260+
*unless* the base is a power of 2. Even the best known algorithms for base 10
5261+
have sub-quadratic complexity. Converting a large value such as ``int('1' *
5262+
500_000)`` can take over a second on a fast CPU.
5263+
5264+
Limiting conversion size offers a practical way to avoid `CVE-2020-10735
5265+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
5266+
5267+
The limit is applied to the number of digit characters in the input or output
5268+
string when a non-linear conversion algorithm would be involved. Underscores
5269+
and the sign are not counted towards the limit.
5270+
5271+
When an operation would exceed the limit, a :exc:`ValueError` is raised:
5272+
5273+
.. doctest::
5274+
5275+
>>> import sys
5276+
>>> sys.set_int_max_str_digits(4300) # Illustrative, this is the default.
5277+
>>> _ = int('2' * 5432)
5278+
Traceback (most recent call last):
5279+
...
5280+
ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
5281+
>>> i = int('2' * 4300)
5282+
>>> len(str(i))
5283+
4300
5284+
>>> i_squared = i*i
5285+
>>> len(str(i_squared))
5286+
Traceback (most recent call last):
5287+
...
5288+
ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
5289+
>>> len(hex(i_squared))
5290+
7144
5291+
>>> assert int(hex(i_squared), base=16) == i*i # Hexadecimal is unlimited.
5292+
5293+
The default limit is 4300 digits as provided in
5294+
:data:`sys.int_info.default_max_str_digits <sys.int_info>`.
5295+
The lowest limit that can be configured is 640 digits as provided in
5296+
:data:`sys.int_info.str_digits_check_threshold <sys.int_info>`.
5297+
5298+
Verification:
5299+
5300+
.. doctest::
5301+
5302+
>>> import sys
5303+
>>> assert sys.int_info.default_max_str_digits == 4300, sys.int_info
5304+
>>> assert sys.int_info.str_digits_check_threshold == 640, sys.int_info
5305+
>>> msg = int('578966293710682886880994035146873798396722250538762761564'
5306+
... '9252925514383915483333812743580549779436104706260696366600'
5307+
... '571186405732').to_bytes(53, 'big')
5308+
...
5309+
5310+
.. versionadded:: 3.9.14
5311+
5312+
Affected APIs
5313+
-------------
5314+
5315+
The limitation only applies to potentially slow conversions between :class:`int`
5316+
and :class:`str` or :class:`bytes`:
5317+
5318+
* ``int(string)`` with default base 10.
5319+
* ``int(string, base)`` for all bases that are not a power of 2.
5320+
* ``str(integer)``.
5321+
* ``repr(integer)``.
5322+
* any other string conversion to base 10, for example ``f"{integer}"``,
5323+
``"{}".format(integer)``, or ``b"%d" % integer``.
5324+
5325+
The limitations do not apply to functions with a linear algorithm:
5326+
5327+
* ``int(string, base)`` with base 2, 4, 8, 16, or 32.
5328+
* :func:`int.from_bytes` and :func:`int.to_bytes`.
5329+
* :func:`hex`, :func:`oct`, :func:`bin`.
5330+
* :ref:`formatspec` for hex, octal, and binary numbers.
5331+
* :class:`str` to :class:`float`.
5332+
* :class:`str` to :class:`decimal.Decimal`.
5333+
5334+
Configuring the limit
5335+
---------------------
5336+
5337+
Before Python starts up you can use an environment variable or an interpreter
5338+
command line flag to configure the limit:
5339+
5340+
* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
5341+
``PYTHONINTMAXSTRDIGITS=640 python3`` to set the limit to 640 or
5342+
``PYTHONINTMAXSTRDIGITS=0 python3`` to disable the limitation.
5343+
* :option:`-X int_max_str_digits <-X>`, e.g.
5344+
``python3 -X int_max_str_digits=640``
5345+
* :data:`sys.flags.int_max_str_digits` contains the value of
5346+
:envvar:`PYTHONINTMAXSTRDIGITS` or :option:`-X int_max_str_digits <-X>`.
5347+
If both the env var and the ``-X`` option are set, the ``-X`` option takes
5348+
precedence. A value of *-1* indicates that both were unset, thus a value of
5349+
:data:`sys.int_info.default_max_str_digits` was used during initialization.
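
The startup options can be observed by launching a child interpreter. This is a rough sketch: `probe` is a hypothetical helper, and it assumes `sys.executable` points at an interpreter that supports the limit. It returns the flag value and the effective limit as seen by the child process.

```python
import os
import subprocess
import sys

PROBE = "import sys; print(sys.flags.int_max_str_digits, sys.get_int_max_str_digits())"


def probe(x_option=None, env_value=None):
    """Run a child interpreter and return (flag value, effective limit)."""
    # Start from a copy of the environment without the variable, then add it back if asked.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONINTMAXSTRDIGITS"}
    if env_value is not None:
        env["PYTHONINTMAXSTRDIGITS"] = env_value
    cmd = [sys.executable]
    if x_option is not None:
        cmd += ["-X", f"int_max_str_digits={x_option}"]
    cmd += ["-c", PROBE]
    out = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True).stdout
    flag, limit = out.split()
    return int(flag), int(limit)
```

Calling `probe(x_option="640", env_value="700")` exercises the precedence rule described above: the ``-X`` option wins over the environment variable.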
5350+
5351+
From code, you can inspect the current limit and set a new one using these
5352+
:mod:`sys` APIs:
5353+
5354+
* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
5355+
a getter and setter for the interpreter-wide limit. Subinterpreters have
5356+
their own limit.
5357+
5358+
Information about the default and minimum can be found in :attr:`sys.int_info`:
5359+
5360+
* :data:`sys.int_info.default_max_str_digits <sys.int_info>` is the compiled-in
5361+
default limit.
5362+
* :data:`sys.int_info.str_digits_check_threshold <sys.int_info>` is the lowest
5363+
accepted value for the limit (other than 0 which disables it).
5364+
5365+
.. versionadded:: 3.9.14
5366+
5367+
.. caution::
5368+
5369+
Setting a low limit *can* lead to problems. While rare, code exists that
5370+
contains integer constants in decimal in their source that exceed the
5371+
minimum threshold. A consequence of setting the limit is that Python source
5372+
code containing decimal integer literals longer than the limit will
5373+
encounter an error during parsing, usually at startup time or import time or
5374+
even at installation time - anytime an up to date ``.pyc`` does not already
5375+
exist for the code. A workaround for source that contains such large
5376+
constants is to convert them to ``0x`` hexadecimal form as it has no limit.
5377+
5378+
Test your application thoroughly if you use a low limit. Ensure your tests
5379+
run with the limit set early via the environment or flag so that it applies
5380+
during startup and even during any installation step that may invoke Python
5381+
to precompile ``.py`` sources to ``.pyc`` files.
5382+
5383+
Recommended configuration
5384+
-------------------------
5385+
5386+
The default :data:`sys.int_info.default_max_str_digits` is expected to be
5387+
reasonable for most applications. If your application requires a different
5388+
limit, set it from your main entry point using Python version agnostic code as
5389+
these APIs were added in security patch releases in versions before 3.11.
5390+
5391+
Example::
5392+
5393+
>>> import sys
5394+
>>> if hasattr(sys, "set_int_max_str_digits"):
5395+
... upper_bound = 68000
5396+
... lower_bound = 4004
5397+
... current_limit = sys.get_int_max_str_digits()
5398+
... if current_limit == 0 or current_limit > upper_bound:
5399+
... sys.set_int_max_str_digits(upper_bound)
5400+
... elif current_limit < lower_bound:
5401+
... sys.set_int_max_str_digits(lower_bound)
5402+
5403+
If you need to disable it entirely, set it to ``0``.
5404+
5405+
52475406
.. rubric:: Footnotes
52485407

52495408
.. [1] Additional information on these special methods may be found in the Python

Doc/library/sys.rst

+46-13
@@ -443,9 +443,9 @@ always available.
443443
The :term:`named tuple` *flags* exposes the status of command line
444444
flags. The attributes are read only.
445445

446-
============================= ================================================================
446+
============================= ==============================================================================================================
447447
attribute flag
448-
============================= ================================================================
448+
============================= ==============================================================================================================
449449
:const:`debug` :option:`-d`
450450
:const:`inspect` :option:`-i`
451451
:const:`interactive` :option:`-i`
@@ -461,7 +461,8 @@ always available.
461461
:const:`hash_randomization` :option:`-R`
462462
:const:`dev_mode` :option:`-X dev <-X>` (:ref:`Python Development Mode <devmode>`)
463463
:const:`utf8_mode` :option:`-X utf8 <-X>`
464-
============================= ================================================================
464+
:const:`int_max_str_digits` :option:`-X int_max_str_digits <-X>` (:ref:`integer string conversion length limitation <int_max_str_digits>`)
465+
============================= ==============================================================================================================
465466

466467
.. versionchanged:: 3.2
467468
Added ``quiet`` attribute for the new :option:`-q` flag.
@@ -480,6 +481,9 @@ always available.
480481
Mode <devmode>` and the ``utf8_mode`` attribute for the new :option:`-X`
481482
``utf8`` flag.
482483

484+
.. versionchanged:: 3.9.14
485+
Added the ``int_max_str_digits`` attribute.
486+
483487

484488
.. data:: float_info
485489

@@ -658,6 +662,15 @@ always available.
658662

659663
.. versionadded:: 3.6
660664

665+
666+
.. function:: get_int_max_str_digits()
667+
668+
Returns the current value for the :ref:`integer string conversion length
669+
limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.
670+
671+
.. versionadded:: 3.9.14
672+
673+
661674
.. function:: getrefcount(object)
662675

663676
Return the reference count of the *object*. The count returned is generally one
@@ -931,19 +944,31 @@ always available.
931944

932945
.. tabularcolumns:: |l|L|
933946

934-
+-------------------------+----------------------------------------------+
935-
| Attribute | Explanation |
936-
+=========================+==============================================+
937-
| :const:`bits_per_digit` | number of bits held in each digit. Python |
938-
| | integers are stored internally in base |
939-
| | ``2**int_info.bits_per_digit`` |
940-
+-------------------------+----------------------------------------------+
941-
| :const:`sizeof_digit` | size in bytes of the C type used to |
942-
| | represent a digit |
943-
+-------------------------+----------------------------------------------+
947+
+----------------------------------------+-----------------------------------------------+
948+
| Attribute | Explanation |
949+
+========================================+===============================================+
950+
| :const:`bits_per_digit` | number of bits held in each digit. Python |
951+
| | integers are stored internally in base |
952+
| | ``2**int_info.bits_per_digit`` |
953+
+----------------------------------------+-----------------------------------------------+
954+
| :const:`sizeof_digit` | size in bytes of the C type used to |
955+
| | represent a digit |
956+
+----------------------------------------+-----------------------------------------------+
957+
| :const:`default_max_str_digits` | default value for |
958+
| | :func:`sys.get_int_max_str_digits` when it |
959+
| | is not otherwise explicitly configured. |
960+
+----------------------------------------+-----------------------------------------------+
961+
| :const:`str_digits_check_threshold` | minimum non-zero value for |
962+
| | :func:`sys.set_int_max_str_digits`, |
963+
| | :envvar:`PYTHONINTMAXSTRDIGITS`, or |
964+
| | :option:`-X int_max_str_digits <-X>`. |
965+
+----------------------------------------+-----------------------------------------------+
944966

945967
.. versionadded:: 3.1
946968

969+
.. versionchanged:: 3.9.14
970+
Added ``default_max_str_digits`` and ``str_digits_check_threshold``.
971+
947972

948973
.. data:: __interactivehook__
949974

@@ -1221,6 +1246,14 @@ always available.
12211246

12221247
.. availability:: Unix.
12231248

1249+
.. function:: set_int_max_str_digits(n)
1250+
1251+
Set the :ref:`integer string conversion length limitation
1252+
<int_max_str_digits>` used by this interpreter. See also
1253+
:func:`get_int_max_str_digits`.
1254+
1255+
.. versionadded:: 3.9.14
1256+
12241257
.. function:: setprofile(profilefunc)
12251258

12261259
.. index::

Doc/library/test.rst

+10
@@ -1302,6 +1302,16 @@ The :mod:`test.support` module defines the following functions:
13021302
.. versionadded:: 3.6
13031303

13041304

1305+
.. function:: adjust_int_max_str_digits(max_digits)
1306+
1307+
This function returns a context manager that will change the global
1308+
:func:`sys.set_int_max_str_digits` setting for the duration of the
1309+
context to allow execution of test code that needs a different limit
1310+
on the number of digits when converting between an integer and string.
1311+
1312+
.. versionadded:: 3.9.14
1313+
1314+
13051315
The :mod:`test.support` module defines the following classes:
13061316

13071317
.. class:: TransientResource(exc, **kwargs)

Doc/using/cmdline.rst

+13
@@ -436,6 +436,9 @@ Miscellaneous options
436436
stored in a traceback of a trace. Use ``-X tracemalloc=NFRAME`` to start
437437
tracing with a traceback limit of *NFRAME* frames. See the
438438
:func:`tracemalloc.start` for more information.
439+
* ``-X int_max_str_digits`` configures the :ref:`integer string conversion
440+
length limitation <int_max_str_digits>`. See also
441+
:envvar:`PYTHONINTMAXSTRDIGITS`.
439442
* ``-X importtime`` to show how long each import takes. It shows module
440443
name, cumulative time (including nested imports) and self time (excluding
441444
nested imports). Note that its output may be broken in multi-threaded
@@ -480,6 +483,9 @@ Miscellaneous options
480483

481484
The ``-X showalloccount`` option has been removed.
482485

486+
.. versionadded:: 3.9.14
487+
The ``-X int_max_str_digits`` option.
488+
483489
.. deprecated-removed:: 3.9 3.10
484490
The ``-X oldparser`` option.
485491

@@ -659,6 +665,13 @@ conflict.
659665

660666
.. versionadded:: 3.2.3
661667

668+
.. envvar:: PYTHONINTMAXSTRDIGITS
669+
670+
If this variable is set to an integer, it is used to configure the
671+
interpreter's global :ref:`integer string conversion length limitation
672+
<int_max_str_digits>`.
673+
674+
.. versionadded:: 3.9.14
662675

663676
.. envvar:: PYTHONIOENCODING
664677

Doc/whatsnew/3.9.rst

+14
@@ -1587,3 +1587,17 @@ URL by the parser in :mod:`urllib.parse` preventing such attacks. The removal
15871587
characters are controlled by a new module level variable
15881588
``urllib.parse._UNSAFE_URL_BYTES_TO_REMOVE``. (See :issue:`43882`)
15891589

1590+
Notable security feature in 3.9.14
1591+
==================================
1592+
1593+
Converting between :class:`int` and :class:`str` in bases other than 2
1594+
(binary), 4, 8 (octal), 16 (hexadecimal), or 32 such as base 10 (decimal)
1595+
now raises a :exc:`ValueError` if the number of digits in string form is
1596+
above a limit to avoid potential denial of service attacks due to the
1597+
algorithmic complexity. This is a mitigation for `CVE-2020-10735
1598+
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
1599+
This limit can be configured or disabled by environment variable, command
1600+
line flag, or :mod:`sys` APIs. See the :ref:`integer string conversion
1601+
length limitation <int_max_str_digits>` documentation. The default limit
1602+
is 4300 digits in string form.
1603+
