From 4c18db7d5e38d7094948c8f1136544161c1e9dec Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Marta=20G=C3=B3mez=20Mac=C3=ADas?= Date: Wed, 24 May 2023 00:19:01 +0200 Subject: [PATCH 1/7] Add changes related to PEP 701 in 3.12 What's New docs --- Doc/whatsnew/3.12.rst | 61 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst index 5e07a4caeb9ebe..3b526648f01e6c 100644 --- a/Doc/whatsnew/3.12.rst +++ b/Doc/whatsnew/3.12.rst @@ -177,6 +177,50 @@ Inlining does result in a few visible behavior changes: Contributed by Carl Meyer and Vladimir Matveev in :pep:`709`. +PEP 701: Syntactic formalization of f-strings +--------------------------------------------- + +:pep:`701` lifts some restrictions in the usage of f-strings. The expressions +inside f-strings can now be any Python expression: including backslashes, +unicode characters, multi-line expressions, comments and strings reusing the +same quote. Let's see some examples: + +* Quote reuse: in Python 3.11, reusing the same quotes as the f-string raised a + :exc:`SyntaxError`, forcing the user to either use single quotes or escape the quotes + inside the expression. In Python 3.12, you can now do things like this: + + >>> things = ['a', 'b', 'c'] + >>> f"These are the things: {", ".join(things)}" + 'These are the things: a, b, c' + +* Multi-line expressions and comments: In Python 3.11, f-strings expressions + must be defined in a single line, making them harder to read. In Python 3.12 + you can now define expressions in multiple lines and include comments on them: + + >>> f"These are the things: {", ".join([ + ... 'a', # A + ... 'b', # B + ... 'c' # C + ... ])}" + 'These are the things: a, b, c' + +* Backslashes and unicode characters: f-string expressions couldn't contain + any \ character (except for escaping quotes). This also affected unicode + characters (such as `\N{snowman}`). Now, you can define expressions like this: + + >>> print(f"These are the things: {"\n".join(things)}") + These are the things: a + b + c + >>> print(f"These are the things: {"\N{snowman}".join(things)}") + These are the things: a☃b☃c + +See :pep:`701` for more details. + +(Contributed by Pablo Galindo, Batuhan Taskaya, Lysandros Nikolaou, Cristián +Maureira-Fredes and Marta Gómez in :gh:`102856`. PEP written by Pablo Galindo, +Batuhan Taskaya and Lysandros Nikolaou) + PEP 688: Making the buffer protocol accessible in Python -------------------------------------------------------- @@ -298,6 +342,12 @@ array * The :class:`array.array` class now supports subscripting, making it a :term:`generic type`. (Contributed by Jelle Zijlstra in :gh:`98658`.) +tokenize +-------- + +* The :mod:`tokenize` module includes the changes introduced in :pep:`701`. ( + Contributed by Marta Gómez Macías and Pablo Galindo in :gh:`102856`.) + asyncio ------- @@ -687,6 +737,10 @@ Optimizations * Speed up :class:`asyncio.Task` creation by deferring expensive string formatting. (Contributed by Itamar O in :gh:`103793`.) +* The :func:`tokenize.tokenize` and :func:`tokenize.generate_tokens` calls are + now 64% faster due to the changes introduced in :pep:`701`. (Contributed by + Marta Gómez Macías and Pablo Galindo in :gh:`102856`.) + CPython bytecode changes ======================== @@ -1201,6 +1255,13 @@ Changes in the Python API that may be surprising or dangerous. See :ref:`tarfile-extraction-filter` for details. 
+* The behaviour of :func:`tokenize.tokenize` and
+  :func:`tokenize.generate_tokens` is now changed due to the changes introduced
+  in :pep:`701`. In addition to that, final ``DEDENT`` tokens are now within
+  the file bounds. This means that for a file containing 3 lines, the old
+  tokenizer returned a ``DEDENT`` token in line 4 whilst the new tokenizer
+  returns it in line 3.
+
 Build Changes
 =============
 

From 1182d56b2307d0b3e1f4c30bedd9b60c6c54172e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Marta=20G=C3=B3mez=20Mac=C3=ADas?=
Date: Wed, 24 May 2023 01:13:48 +0200
Subject: [PATCH 2/7] Fix compilation issue and improve text

---
 Doc/whatsnew/3.12.rst | 194 ++++++++++++++++++++++++++----------------
 1 file changed, 119 insertions(+), 75 deletions(-)

diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst
index 3b526648f01e6c..fcd682b49f6ef5 100644
--- a/Doc/whatsnew/3.12.rst
+++ b/Doc/whatsnew/3.12.rst
@@ -136,22 +136,70 @@ Improved Error Messages
 New Features
 ============
 
-* Add :ref:`perf_profiling` through the new
-  environment variable :envvar:`PYTHONPERFSUPPORT`,
-  the new command-line option :option:`-X perf <-X>`,
-  as well as the new :func:`sys.activate_stack_trampoline`,
-  :func:`sys.deactivate_stack_trampoline`,
-  and :func:`sys.is_stack_trampoline_active` APIs.
-  (Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes
-  with contributions from Gregory P. Smith [Google] and Mark Shannon
-  in :gh:`96123`.)
-* The extraction methods in :mod:`tarfile`, and :func:`shutil.unpack_archive`,
-  have a new a *filter* argument that allows limiting tar features than may be
-  surprising or dangerous, such as creating files outside the destination
-  directory.
-  See :ref:`tarfile-extraction-filter` for details.
-  In Python 3.14, the default will switch to ``'data'``.
-  (Contributed by Petr Viktorin in :pep:`706`.)
+.. _whatsnew312-pep701:
+
+PEP 701: Syntactic formalization of f-strings
+---------------------------------------------
+
+:pep:`701` lifts some restrictions on the usage of f-strings. Expression components
+inside f-strings can now be any valid Python expression including backslashes,
+unicode escaped sequences, multi-line expressions, comments and strings reusing the
+same quote as the containing f-string. Let's cover these in detail:
+
+* Quote reuse: in Python 3.11, reusing the same quotes as the contaning f-string
+  raises a :exc:`SyntaxError`, forcing the user to either use other available
+  quotes (like using double quotes or triple quites if the f-string uses single
+  quites). In Python 3.12, you can now do things like this:
+
+    >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism']
+    >>> f"This is the playlist: {", ".join(things)}"
+    'This is the playlist: Take me back to Eden, Alkaline, Ascensionism'
+
+  Note that before this change there was no explicit limit in how f-strings can
+  be nested, but the fact that string quotes cannot be reused inside the
+  expression component of f-strings made it impossible to nest f-strings
+  arbitrarily.
In fact, this is the most nested-fstring that can be written:
+
+    >>> f"""{f'''{f'{f"{1+1}"}'}'''}"""
+    '2'
+
+  As f-strings can now contain any valid Python expression inside expression
+  components, it is now possible to nest f-strings arbitrarily:
+
+    >>> f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}"
+    '2'
+
+* Multi-line expressions and comments: In Python 3.11, f-string expressions
+  must be defined in a single line, even if expressions outside f-strings could
+  span multiple lines (like literal lists being defined over multiple lines),
+  making them harder to read. In Python 3.12, you can now define expressions
+  spanning multiple lines and include comments on them:
+
+    >>> f"This is the playlist: {", ".join([
+    ...     'Take me back to Eden',  # My, my, those eyes like fire
+    ...     'Alkaline',              # Not acid nor alkaline
+    ...     'Ascensionism'           # Take to the broken skies at last
+    ... ])}"
+    'This is the playlist: Take me back to Eden, Alkaline, Ascensionism'
+
+* Backslashes and unicode characters: before Python 3.12, f-string expressions
+  couldn't contain any ``\`` character. This also affected unicode escaped
+  sequences (such as ``\N{snowman}``) as these contain the ``\N`` part that
+  previously could not be part of expression components of f-strings. Now, you
+  can define expressions like this:
+
+    >>> print(f"This is the playlist: {"\n".join(songs)}")
+    This is the playlist: Take me back to Eden
+    Alkaline
+    Ascensionism
+    >>> print(f"This is the playlist: {"\N{BLACK HEART SUIT}".join(songs)}")
+    This is the playlist: Take me back to Eden♥Alkaline♥Ascensionism
+
+See :pep:`701` for more details.
+
+(Contributed by Pablo Galindo, Batuhan Taskaya, Lysandros Nikolaou, Cristián
+Maureira-Fredes and Marta Gómez in :gh:`102856`. PEP written by Pablo Galindo,
+Batuhan Taskaya, Lysandros Nikolaou and Marta Gómez).
 
 .. _whatsnew312-pep709:
 
@@ -177,50 +225,6 @@ Inlining does result in a few visible behavior changes:
 
 Contributed by Carl Meyer and Vladimir Matveev in :pep:`709`.
 
-PEP 701: Syntactic formalization of f-strings
----------------------------------------------
-
-:pep:`701` lifts some restrictions in the usage of f-strings. The expressions
-inside f-strings can now be any Python expression: including backslashes,
-unicode characters, multi-line expressions, comments and strings reusing the
-same quote. Let's see some examples:
-
-* Quote reuse: in Python 3.11, reusing the same quotes as the f-string raised a
-  :exc:`SyntaxError`, forcing the user to either use single quotes or escape the quotes
-  inside the expression. In Python 3.12, you can now do things like this:
-
-  >>> things = ['a', 'b', 'c']
-  >>> f"These are the things: {", ".join(things)}"
-  'These are the things: a, b, c'
-
-* Multi-line expressions and comments: In Python 3.11, f-strings expressions
-  must be defined in a single line, making them harder to read. In Python 3.12
-  you can now define expressions in multiple lines and include comments on them:
-
-  >>> f"These are the things: {", ".join([
-  ...    'a',   # A
-  ...    'b',   # B
-  ...    'c'    # C
-  ...    ])}"
-  'These are the things: a, b, c'
-
-* Backslashes and unicode characters: f-string expressions couldn't contain
-  any \ character (except for escaping quotes). This also affected unicode
-  characters (such as `\N{snowman}`). Now, you can define expressions like this:
-
-  >>> print(f"These are the things: {"\n".join(things)}")
-  These are the things: a
-  b
-  c
-  >>> print(f"These are the things: {"\N{snowman}".join(things)}")
-  These are the things: a☃b☃c
-
-See :pep:`701` for more details.
- -(Contributed by Pablo Galindo, Batuhan Taskaya, Lysandros Nikolaou, Cristián -Maureira-Fredes and Marta Gómez in :gh:`102856`. PEP written by Pablo Galindo, -Batuhan Taskaya and Lysandros Nikolaou) - PEP 688: Making the buffer protocol accessible in Python -------------------------------------------------------- @@ -264,6 +268,24 @@ See :pep:`692` for more details. (PEP written by Franek Magiera) + +* Add :ref:`perf_profiling` through the new + environment variable :envvar:`PYTHONPERFSUPPORT`, + the new command-line option :option:`-X perf <-X>`, + as well as the new :func:`sys.activate_stack_trampoline`, + :func:`sys.deactivate_stack_trampoline`, + and :func:`sys.is_stack_trampoline_active` APIs. + (Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes + with contributions from Gregory P. Smith [Google] and Mark Shannon + in :gh:`96123`.) +* The extraction methods in :mod:`tarfile`, and :func:`shutil.unpack_archive`, + have a new a *filter* argument that allows limiting tar features than may be + surprising or dangerous, such as creating files outside the destination + directory. + See :ref:`tarfile-extraction-filter` for details. + In Python 3.14, the default will switch to ``'data'``. + (Contributed by Petr Viktorin in :pep:`706`.) + Other Language Changes ====================== @@ -342,12 +364,6 @@ array * The :class:`array.array` class now supports subscripting, making it a :term:`generic type`. (Contributed by Jelle Zijlstra in :gh:`98658`.) -tokenize --------- - -* The :mod:`tokenize` module includes the changes introduced in :pep:`701`. ( - Contributed by Marta Gómez Macías and Pablo Galindo in :gh:`102856`.) - asyncio ------- @@ -593,6 +609,12 @@ tkinter like ``create_*()`` methods. (Contributed by Serhiy Storchaka in :gh:`94473`.) +tokenize +-------- + +* The :mod:`tokenize` module includes the changes introduced in :pep:`701`. ( + Contributed by Marta Gómez Macías and Pablo Galindo in :gh:`102856`.) + types ----- @@ -737,9 +759,10 @@ Optimizations * Speed up :class:`asyncio.Task` creation by deferring expensive string formatting. (Contributed by Itamar O in :gh:`103793`.) -* The :func:`tokenize.tokenize` and :func:`tokenize.generate_tokens` calls are - now 64% faster due to the changes introduced in :pep:`701`. (Contributed by - Marta Gómez Macías and Pablo Galindo in :gh:`102856`.) +* The :func:`tokenize.tokenize` and :func:`tokenize.generate_tokens` functions are + up to 64% faster as a side effect of the changes required to cover :pep:`701` in + the :mod:`tokenize` module. (Contributed by Marta Gómez Macías and Pablo Galindo + in :gh:`102856`.) CPython bytecode changes @@ -1255,12 +1278,33 @@ Changes in the Python API that may be surprising or dangerous. See :ref:`tarfile-extraction-filter` for details. -* The behaviour of :func:`tokenize.tokenize` and - :func:`tokenize.generate_tokens` is now changed due to the changes introduced - in :pep:`701`. In addition to that, final ``DEDENT`` tokens are now within - the file bounds. This means that for a file containing 3 lines, the old - tokenizer returned a ``DEDENT`` token in line 4 whilst the new tokenizer - returns it in line 3. +* The output of the :func:`tokenize.tokenize` and :func:`tokenize.generate_tokens` + functions is now changed due to the changes introduced in :pep:`701`. 
This + means that ``STRING`` tokens are not emited anymore for f-strings and the + tokens described in :pep:`701` are now produced instead: ``FSTRING_START``, + ``FSRING_MIDDLE`` and ``FSTRING_END`` are now emited for f-string "string" + parts in addition to the the apropiate tokens for the tokenization in the + expression components. For example for the f-string ``f"start {1+1} end"`` + the old version of the tokenizer emitted:: + + 1,0-1,18: STRING 'f"start {1+1} end"' + + while the new version emits:: + + 1,0-1,2: FSTRING_START 'f"' + 1,2-1,8: FSTRING_MIDDLE 'start ' + 1,8-1,9: OP '{' + 1,9-1,10: NUMBER '1' + 1,10-1,11: OP '+' + 1,11-1,12: NUMBER '1' + 1,12-1,13: OP '}' + 1,13-1,17: FSTRING_MIDDLE ' end' + 1,17-1,18: FSTRING_END '"' + + Aditionally, final ``DEDENT`` tokens are now emited within the bounds of the + input. This means that for a file containing 3 lines, the old version of the + tokenizer returned a ``DEDENT`` token in line 4 whilst the new version returns + the token in line 3. Build Changes ============= From 2c3ab8189c7487ff59ffbfde4d217ce7ad47cc95 Mon Sep 17 00:00:00 2001 From: Pablo Galindo Salgado Date: Wed, 24 May 2023 00:33:35 +0100 Subject: [PATCH 3/7] Update Doc/whatsnew/3.12.rst --- Doc/whatsnew/3.12.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst index fcd682b49f6ef5..ad11e09cdc56aa 100644 --- a/Doc/whatsnew/3.12.rst +++ b/Doc/whatsnew/3.12.rst @@ -152,7 +152,7 @@ same quote as the containing f-string. Let's cover these in detail: quites). In Python 3.12, you can now do things like this: >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism'] - >>> f"This is the playlist: {", ".join(things)}" + >>> f"This is the playlist: {", ".join(songs)}" 'This is the playlist: Take me back to Eden, Alkaline, Ascensionism Note that before this change there was no explicit limit in how f-strings can From 93a9286c1092fb73565d113322cfcbf0648cee98 Mon Sep 17 00:00:00 2001 From: Pablo Galindo Salgado Date: Wed, 24 May 2023 09:52:33 +0100 Subject: [PATCH 4/7] Apply suggestions from code review Co-authored-by: Jelle Zijlstra --- Doc/whatsnew/3.12.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst index ad11e09cdc56aa..697f67686292c3 100644 --- a/Doc/whatsnew/3.12.rst +++ b/Doc/whatsnew/3.12.rst @@ -146,10 +146,10 @@ inside f-strings can now be any valid Python expression including backslashes, unicode escaped sequences, multi-line expressions, comments and strings reusing the same quote as the containing f-string. Let's cover these in detail: -* Quote reuse: in Python 3.11, reusing the same quotes as the contaning f-string +* Quote reuse: in Python 3.11, reusing the same quotes as the containing f-string raises a :exc:`SyntaxError`, forcing the user to either use other available - quotes (like using double quotes or triple quites if the f-string uses single - quites). In Python 3.12, you can now do things like this: + quotes (like using double quotes or triple quotes if the f-string uses single + quotes). In Python 3.12, you can now do things like this: >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism'] >>> f"This is the playlist: {", ".join(songs)}" @@ -158,7 +158,7 @@ same quote as the containing f-string. 
Let's cover these in detail:
   quotes (like using double quotes or triple quites if the f-string uses single
   quites). In Python 3.12, you can now do things like this:
 
     >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism']
-    >>> f"This is the playlist: {", ".join(things)}"
+    >>> f"This is the playlist: {", ".join(songs)}"
     'This is the playlist: Take me back to Eden, Alkaline, Ascensionism'
 
   Note that before this change there was no explicit limit in how f-strings can

From 93a9286c1092fb73565d113322cfcbf0648cee98 Mon Sep 17 00:00:00 2001
From: Pablo Galindo Salgado
Date: Wed, 24 May 2023 09:52:33 +0100
Subject: [PATCH 4/7] Apply suggestions from code review

Co-authored-by: Jelle Zijlstra
---
 Doc/whatsnew/3.12.rst | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst
index ad11e09cdc56aa..697f67686292c3 100644
--- a/Doc/whatsnew/3.12.rst
+++ b/Doc/whatsnew/3.12.rst
@@ -146,10 +146,10 @@ inside f-strings can now be any valid Python expression including backslashes,
 unicode escaped sequences, multi-line expressions, comments and strings reusing the
 same quote as the containing f-string. Let's cover these in detail:
 
-* Quote reuse: in Python 3.11, reusing the same quotes as the contaning f-string
+* Quote reuse: in Python 3.11, reusing the same quotes as the containing f-string
   raises a :exc:`SyntaxError`, forcing the user to either use other available
-  quotes (like using double quotes or triple quites if the f-string uses single
-  quites). In Python 3.12, you can now do things like this:
+  quotes (like using double quotes or triple quotes if the f-string uses single
+  quotes). In Python 3.12, you can now do things like this:
 
     >>> songs = ['Take me back to Eden', 'Alkaline', 'Ascensionism']
     >>> f"This is the playlist: {", ".join(songs)}"
     'This is the playlist: Take me back to Eden, Alkaline, Ascensionism'
 
   Note that before this change there was no explicit limit in how f-strings can
   be nested, but the fact that string quotes cannot be reused inside the
   expression component of f-strings made it impossible to nest f-strings
-  arbitrarily. In fact, this is the most nested-fstring that can be written:
+  arbitrarily. In fact, this is the most nested f-string that could be written:
 
     >>> f"""{f'''{f'{f"{1+1}"}'}'''}"""
     '2'
 
@@ -1280,10 +1280,10 @@ Changes in the Python API
 
 * The output of the :func:`tokenize.tokenize` and :func:`tokenize.generate_tokens`
   functions is now changed due to the changes introduced in :pep:`701`. This
-  means that ``STRING`` tokens are not emited anymore for f-strings and the
+  means that ``STRING`` tokens are not emitted any more for f-strings and the
   tokens described in :pep:`701` are now produced instead: ``FSTRING_START``,
-  ``FSRING_MIDDLE`` and ``FSTRING_END`` are now emited for f-string "string"
-  parts in addition to the the apropiate tokens for the tokenization in the
+  ``FSTRING_MIDDLE`` and ``FSTRING_END`` are now emitted for f-string "string"
+  parts in addition to the appropriate tokens for the tokenization in the
   expression components. For example for the f-string ``f"start {1+1} end"``
   the old version of the tokenizer emitted::
 
    1,0-1,18: STRING 'f"start {1+1} end"'
 
   while the new version emits::
 
    1,0-1,2: FSTRING_START 'f"'
    1,2-1,8: FSTRING_MIDDLE 'start '
    1,8-1,9: OP '{'
    1,9-1,10: NUMBER '1'
    1,10-1,11: OP '+'
    1,11-1,12: NUMBER '1'
    1,12-1,13: OP '}'
    1,13-1,17: FSTRING_MIDDLE ' end'
    1,17-1,18: FSTRING_END '"'
 
-  Aditionally, final ``DEDENT`` tokens are now emited within the bounds of the
+  Additionally, final ``DEDENT`` tokens are now emitted within the bounds of the
   input. This means that for a file containing 3 lines, the old version of the
   tokenizer returned a ``DEDENT`` token in line 4 whilst the new version returns
   the token in line 3.

From 08036e526081fa8fcd9ee537b61a75a82986b6a9 Mon Sep 17 00:00:00 2001
From: Pablo Galindo
Date: Wed, 24 May 2023 10:14:08 +0100
Subject: [PATCH 5/7] Make some minor additions to the tokenize changes

---
 Doc/whatsnew/3.12.rst | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst
index 697f67686292c3..533b5760e4b4bb 100644
--- a/Doc/whatsnew/3.12.rst
+++ b/Doc/whatsnew/3.12.rst
@@ -614,6 +614,8 @@ tokenize
 
 * The :mod:`tokenize` module includes the changes introduced in :pep:`701`. (
   Contributed by Marta Gómez Macías and Pablo Galindo in :gh:`102856`.)
+  See :ref:`whatsnew312-porting-to-python312` for more information on the
+  changes to the :mod:`tokenize` module.
 
 types
 -----
@@ -1207,6 +1209,8 @@ Removed
   Iceape, Firebird, and Firefox versions 35 and below (:gh:`102871`).
 
+.. _whatsnew312-porting-to-python312:
+
 Porting to Python 3.12
 ======================
 
@@ -1301,10 +1305,16 @@ Changes in the Python API
    1,13-1,17: FSTRING_MIDDLE ' end'
    1,17-1,18: FSTRING_END '"'
 
-  Additionally, final ``DEDENT`` tokens are now emitted within the bounds of the
-  input. This means that for a file containing 3 lines, the old version of the
-  tokenizer returned a ``DEDENT`` token in line 4 whilst the new version returns
-  the token in line 3.
+  Additionally, there may be some minor behavioral changes as a consequence of the
+  changes required to support :pep:`701`. Some of these changes include:
+
+  * Some final ``DEDENT`` tokens are now emitted within the bounds of the
+    input. This means that for a file containing 3 lines, the old version of the
+    tokenizer returned a ``DEDENT`` token in line 4 whilst the new version returns
+    the token in line 3.
+ + * The ``type`` attribute of the tokens emitted when tokenizing some invalid Python + characters such as ``!`` has changed from ``ERRORTOKEN`` to ``OP``. Build Changes ============= From 0ab25a6e85f76d86b703b73a18e14f3733ebbc2e Mon Sep 17 00:00:00 2001 From: Pablo Galindo Date: Wed, 24 May 2023 10:20:07 +0100 Subject: [PATCH 6/7] Fix indent --- Doc/whatsnew/3.12.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst index 533b5760e4b4bb..b999beb8d007de 100644 --- a/Doc/whatsnew/3.12.rst +++ b/Doc/whatsnew/3.12.rst @@ -268,6 +268,8 @@ See :pep:`692` for more details. (PEP written by Franek Magiera) +Other Language Changes +====================== * Add :ref:`perf_profiling` through the new environment variable :envvar:`PYTHONPERFSUPPORT`, @@ -278,6 +280,7 @@ See :pep:`692` for more details. (Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes with contributions from Gregory P. Smith [Google] and Mark Shannon in :gh:`96123`.) + * The extraction methods in :mod:`tarfile`, and :func:`shutil.unpack_archive`, have a new a *filter* argument that allows limiting tar features than may be surprising or dangerous, such as creating files outside the destination @@ -286,9 +289,6 @@ See :pep:`692` for more details. In Python 3.14, the default will switch to ``'data'``. (Contributed by Petr Viktorin in :pep:`706`.) -Other Language Changes -====================== - * :class:`types.MappingProxyType` instances are now hashable if the underlying mapping is hashable. (Contributed by Serhiy Storchaka in :gh:`87995`.) From e729d66e9240c6f69f1b0501c2a5cc5b49a180f5 Mon Sep 17 00:00:00 2001 From: Pablo Galindo Date: Wed, 24 May 2023 10:22:30 +0100 Subject: [PATCH 7/7] Add pep701 to the index --- Doc/whatsnew/3.12.rst | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst index b999beb8d007de..47b01da58ecb1c 100644 --- a/Doc/whatsnew/3.12.rst +++ b/Doc/whatsnew/3.12.rst @@ -66,6 +66,10 @@ Summary -- Release highlights .. PEP-sized items next. +New grammar features: + +* :pep:`701`: Syntactic formalization of f-strings + New typing features: * :pep:`688`: Making the buffer protocol accessible in Python
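
A minimal sketch of how the new token stream described in the tokenize hunks
above can be inspected from Python code. It assumes an interpreter that already
contains these :pep:`701` changes; on 3.11 and earlier the same input comes back
as a single ``STRING`` token, and the output format below differs slightly from
the ``python -m tokenize`` style quoted in the documentation::

    import io
    import tokenize
    from token import tok_name

    # Tokenize the example f-string used in the documentation above. With the
    # PEP 701 tokenizer this yields FSTRING_START / FSTRING_MIDDLE / FSTRING_END
    # tokens plus the usual OP and NUMBER tokens for the {1+1} expression part.
    source = 'f"start {1+1} end"\n'
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        print(f"{tok.start}-{tok.end}: {tok_name[tok.type]} {tok.string!r}")

The same ``readline``-based approach works with :func:`tokenize.tokenize`, which
expects a *readline* callable returning bytes instead of strings.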