From d4381aa9bfc80fe0f3d9530bc32aba8df47caa07 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 4 Jun 2025 18:01:25 +0200 Subject: [PATCH 1/8] WIP: String literals Co-authored-by: Blaise Pabon --- Doc/reference/expressions.rst | 50 +++++- Doc/reference/lexical_analysis.rst | 255 ++++++++++++++++++----------- 2 files changed, 207 insertions(+), 98 deletions(-) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 17f39aaf5f57cd..743d43b1c9c1b1 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -133,13 +133,18 @@ Literals Python supports string and bytes literals and various numeric literals: -.. productionlist:: python-grammar - literal: `stringliteral` | `bytesliteral` | `NUMBER` +.. grammar-snippet:: + :group: python-grammar + + literal: `strings` | `NUMBER` Evaluation of a literal yields an object of the given type (string, bytes, integer, floating-point number, complex number) with the given value. The value may be approximated in the case of floating-point and imaginary (complex) -literals. See section :ref:`literals` for details. +literals. +See section :ref:`literals` for details. +See section :ref:`string-concatenation` for details on ``strings``. + .. index:: triple: immutable; data; type @@ -152,6 +157,45 @@ occurrence) may obtain the same object or a different object with the same value. +.. _string-concatenation: + +String literal concatenation +............................ + +Multiple adjacent string or bytes literals (delimited by whitespace), possibly +using different quoting conventions, are allowed, and their meaning is the same +as their concatenation. Thus, ``"hello" 'world'`` is equivalent to +``"helloworld"``. + +Formally: + +.. grammar-snippet:: + :group: python-grammar + + strings: ( `STRING` | `fstring` | `tstring`)+ + +Note that this feature is defined at the syntactical level, so it only works +with literals. 
+To concatenate string expressions at run time, the '+' operator may be used:: + + greeting = "Hello" + space = " " + name = "Blaise" + print(greeting + space + name) # not: print(greeting space name) + +Also note that literal concatenation can freely mix raw strings, +triple-quoted strings, and formatted or template string literals. +However, bytes literals may not be combined with string literals of any kind. + +This feature can be used to reduce the number of backslashes +needed, to split long strings conveniently across long lines, or even to add +comments to parts of strings, for example:: + + re.compile("[A-Za-z_]" # letter or underscore + "[A-Za-z0-9_]*" # letter, digit or underscore + ) + + .. _parenthesized: Parenthesized forms diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 567c70111c20ec..58c8b15cfe5499 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -106,6 +106,16 @@ If an encoding is declared, the encoding name must be recognized by Python encoding is used for all lexical analysis, including string literals, comments and identifiers. +All lexical analysis, including string literals, comments +and identifiers, works on Unicode text decoded using the source encoding. +Any Unicode code point, except the NUL control character, can appear in +Python source. + +.. grammar-snippet:: + :group: python-grammar + + source_character: + .. _explicit-joining: @@ -478,66 +488,104 @@ Literals are notations for constant values of some built-in types. .. index:: string literal, bytes literal, ASCII single: ' (single quote); string literal single: " (double quote); string literal - single: u'; string literal - single: u"; string literal .. _strings: String and Bytes literals ------------------------- -String literals are described by the following lexical definitions: +String literals are text enclosed in single quotes (``'``) or double +quotes (``"``). For example: -.. 
productionlist:: python-grammar - stringliteral: [`stringprefix`](`shortstring` | `longstring`) - stringprefix: "r" | "u" | "R" | "U" | "f" | "F" | "t" | "T" - : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF" - : | "tr" | "Tr" | "tR" | "TR" | "rt" | "rT" | "Rt" | "RT" - shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"' - longstring: "'''" `longstringitem`* "'''" | '"""' `longstringitem`* '"""' - shortstringitem: `shortstringchar` | `stringescapeseq` - longstringitem: `longstringchar` | `stringescapeseq` - shortstringchar: - longstringchar: - stringescapeseq: "\" +.. code-block:: plain -.. productionlist:: python-grammar - bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) - bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" - shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' - longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' - shortbytesitem: `shortbyteschar` | `bytesescapeseq` - longbytesitem: `longbyteschar` | `bytesescapeseq` - shortbyteschar: - longbyteschar: - bytesescapeseq: "\" + "spam" + 'eggs' + +The quote used to start the literal also terminates it, so a string literal +can only contain the other quote (except with escape sequences, see below). +For example: + +.. code-block:: plain -One syntactic restriction not indicated by these productions is that whitespace -is not allowed between the :token:`~python-grammar:stringprefix` or -:token:`~python-grammar:bytesprefix` and the rest of the literal. The source -character set is defined by the encoding declaration; it is UTF-8 if no encoding -declaration is given in the source file; see section :ref:`encodings`. + 'Say "Hello", please.' + "Don't do that!" -.. index:: triple-quoted string, Unicode Consortium, raw string +Except for this limitation, the choice of quote character (``'`` or ``"``) +does not affect how the literal is parsed. + +.. 
index:: triple-quoted string single: """; string literal single: '''; string literal -In plain English: Both types of literals can be enclosed in matching single quotes -(``'``) or double quotes (``"``). They can also be enclosed in matching groups -of three single or double quotes (these are generally referred to as -*triple-quoted strings*). The backslash (``\``) character is used to give special -meaning to otherwise ordinary characters like ``n``, which means 'newline' when -escaped (``\n``). It can also be used to escape characters that otherwise have a -special meaning, such as newline, backslash itself, or the quote character. -See :ref:`escape sequences ` below for examples. +Triple-quoted strings +--------------------- + +Strings can also be enclosed in matching groups of three single or double +quotes. +These are generally referred to as :dfn:`triple-quoted strings`. + +In triple-quoted literals, unescaped newlines and quotes are allowed (and are +retained), except that three unescaped quotes in a row terminate the literal. +(Here, a *quote* is the character used to open the literal, that is, +either ``'`` or ``"``.) + +For example: + +.. code-block:: plain + + """This is a triple-quoted string with "quotes" inside.""" + + '''Another triple-quoted string. This one continues + on the next line.''' + +Escape sequences +---------------- + +Inside a string literal, the backslash (``\``) character introduces an +:dfn:`escape sequence`, which has special meaning depending on the character +after the backslash. +For example, ``\n`` denotes the 'newline' character, rather the two characters +``\`` and ``n``. +See :ref:`escape sequences ` below for a full list of such +sequences, and more details. + + +.. index:: + single: u'; string literal + single: u"; string literal + +String prefixes +--------------- + +String literals can have an optional :dfn:`prefix` that influences how the literal +is parsed, for example: + +.. 
code-block:: plain + + b"data" + f'{result=}' + +* ``r``: Raw string +* ``f``: "F-string" +* ``t``: "T-string" +* ``b``: Byte literal +* ``u``: No effect (allowed for backwards compatibility) + +Prefixes are case-insensitive (for example, ``B`` works the same as ``b``). +The ``r`` prefix can be combined with ``f``, ``t`` or ``b``, so ``fr``, +``rf``, ``tr``, ``rt``, ``br`` and ``rb`` are also valid prefixes. + .. index:: single: b'; bytes literal single: b"; bytes literal -Bytes literals are always prefixed with ``'b'`` or ``'B'``; they produce an -instance of the :class:`bytes` type instead of the :class:`str` type. They -may only contain ASCII characters; bytes with a numeric value of 128 or greater -must be expressed with escapes. +:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an +instance of the :class:`bytes` type instead of the :class:`str` type. +They may only contain ASCII characters; bytes with a numeric value of 128 +or greater must be expressed with escape sequences. +Similarly, a zero byte must be expressed using an escape sequence. + .. index:: single: r'; raw string literal @@ -546,9 +594,33 @@ must be expressed with escapes. Both string and bytes literals may optionally be prefixed with a letter ``'r'`` or ``'R'``; such constructs are called :dfn:`raw string literals` and :dfn:`raw bytes literals` respectively and treat backslashes as -literal characters. As a result, in raw string literals, ``'\U'`` and ``'\u'`` +literal characters. +As a result, in raw string literals, :ref:`escape sequences ` escapes are not treated specially. +Even in a raw literal, quotes can be escaped with a backslash, but the +backslash remains in the result; for example, ``r"\""`` is a valid string +literal consisting of two characters: a backslash and a double quote; ``r"\"`` +is not a valid string literal (even a raw string cannot end in an odd number of +backslashes). 
Specifically, *a raw literal cannot end in a single backslash* +(since the backslash would escape the following quote character). Note also +that a single backslash followed by a newline is interpreted as those two +characters as part of the literal, *not* as a line continuation. + + +.. index:: + single: f'; formatted string literal + single: f"; formatted string literal + +A string literal with ``'f'`` or ``'F'`` in its prefix is a +:dfn:`formatted string literal`; see :ref:`f-strings`. +Similarly, string literal with ``'t'`` or ``'T'`` in its prefix is a +:dfn:`template string literal`; see :ref:`t-strings`. + +The ``'f'`` or ``t`` may be combined with ``'r'`` to create a +:dfn:`raw formatted string` or :dfn:`raw template string`. +They may not be combined with ``'b'``, ``'u'``, or each other. + .. versionadded:: 3.3 The ``'rb'`` prefix of raw bytes literals has been added as a synonym of ``'br'``. @@ -557,18 +629,46 @@ escapes are not treated specially. to simplify the maintenance of dual Python 2.x and 3.x codebases. See :pep:`414` for more information. -.. index:: - single: f'; formatted string literal - single: f"; formatted string literal -A string literal with ``'f'`` or ``'F'`` in its prefix is a -:dfn:`formatted string literal`; see :ref:`f-strings`. The ``'f'`` may be -combined with ``'r'``, but not with ``'b'`` or ``'u'``, therefore raw -formatted strings are possible, but formatted bytes literals are not. +String literals, except "F-strings" and "T-strings", are described by the +following lexical definitions: + +.. 
grammar-snippet:: + :group: python-grammar + + STRING: stringliteral | bytesliteral | fstring | tstring + + stringliteral: [`stringprefix`](`stringcontent`) + stringprefix: <("r" | "u"), case-insensitive> + stringcontent: `quote` `stringitem`* + quote: "'" | '"' | "'''" | '"""' + stringitem: `stringchar` | `stringescapeseq` + stringchar: + stringescapeseq: "\" + +``stringchar`` can not include: + +- the backslash, ``\``; +- in triple-quoted strings (quoted by ``'''`` or ``"""``), the newline; +- the quote character. + + +.. grammar-snippet:: + :group: python-grammar + + bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) + bytesprefix: <("b" | "br" | "rb" ), case-insensitive> + shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' + longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' + shortbytesitem: `shortbyteschar` | `bytesescapeseq` + longbytesitem: `longbyteschar` | `bytesescapeseq` + shortbyteschar: + longbyteschar: + bytesescapeseq: "\" + +Note that as in all lexical definitions, whitespace is significant. +The prefix, if any, must be followed immediately by the quoted string content. -In triple-quoted literals, unescaped newlines and quotes are allowed (and are -retained), except that three unescaped quotes in a row terminate the literal. (A -"quote" is the character used to open the literal, i.e. either ``'`` or ``"``.) .. index:: physical line, escape sequence, Standard C, C single: \ (backslash); escape sequence @@ -587,7 +687,6 @@ retained), except that three unescaped quotes in a row terminate the literal. ( .. _escape-sequences: - Escape sequences ^^^^^^^^^^^^^^^^ @@ -655,14 +754,14 @@ Notes: (2) - As in Standard C, up to three octal digits are accepted. + As in Standard C, up to three octal digits (0 through 7) are accepted. .. versionchanged:: 3.11 - Octal escapes with value larger than ``0o377`` produce a + Octal escapes with value larger than ``0o377`` (255) produce a :exc:`DeprecationWarning`. .. 
versionchanged:: 3.12 - Octal escapes with value larger than ``0o377`` produce a + Octal escapes with value larger than ``0o377`` (255) produce a :exc:`SyntaxWarning`. In a future Python version they will be eventually a :exc:`SyntaxError`. @@ -689,11 +788,9 @@ Notes: .. index:: unrecognized escape sequence Unlike Standard C, all unrecognized escape sequences are left in the string -unchanged, i.e., *the backslash is left in the result*. (This behavior is -useful when debugging: if an escape sequence is mistyped, the resulting output -is more easily recognized as broken.) It is also important to note that the -escape sequences only recognized in string literals fall into the category of -unrecognized escapes for bytes literals. +unchanged, i.e., *the backslash is left in the result*. +Note that for bytes literals, the escape sequences only recognized in string +literals fall into the category of unrecognized escapes. .. versionchanged:: 3.6 Unrecognized escape sequences produce a :exc:`DeprecationWarning`. @@ -702,38 +799,6 @@ unrecognized escapes for bytes literals. Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future Python version they will be eventually a :exc:`SyntaxError`. -Even in a raw literal, quotes can be escaped with a backslash, but the -backslash remains in the result; for example, ``r"\""`` is a valid string -literal consisting of two characters: a backslash and a double quote; ``r"\"`` -is not a valid string literal (even a raw string cannot end in an odd number of -backslashes). Specifically, *a raw literal cannot end in a single backslash* -(since the backslash would escape the following quote character). Note also -that a single backslash followed by a newline is interpreted as those two -characters as part of the literal, *not* as a line continuation. - - -.. 
_string-concatenation: - -String literal concatenation ----------------------------- - -Multiple adjacent string or bytes literals (delimited by whitespace), possibly -using different quoting conventions, are allowed, and their meaning is the same -as their concatenation. Thus, ``"hello" 'world'`` is equivalent to -``"helloworld"``. This feature can be used to reduce the number of backslashes -needed, to split long strings conveniently across long lines, or even to add -comments to parts of strings, for example:: - - re.compile("[A-Za-z_]" # letter or underscore - "[A-Za-z0-9_]*" # letter, digit or underscore - ) - -Note that this feature is defined at the syntactical level, but implemented at -compile time. The '+' operator must be used to concatenate string expressions -at run time. Also note that literal concatenation can use different quoting -styles for each component (even mixing raw strings and triple quoted strings), -and formatted string literals may be concatenated with plain string literals. - .. index:: single: formatted string literal From 80ad85cc286f04a4ac19d03c5f99a9158d15231b Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 11 Jun 2025 16:22:08 +0200 Subject: [PATCH 2/8] Use correct Pygments lexer for plain text --- Doc/reference/lexical_analysis.rst | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 58c8b15cfe5499..6f3d90f89b98d3 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -496,7 +496,11 @@ String and Bytes literals String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: -.. code-block:: plain +.. This is Python code, but we turn off highlighting because as of this + writing, highlighted strings don't look good when there's no code + surrounding them. + +.. 
code-block:: text "spam" 'eggs' @@ -505,7 +509,7 @@ The quote used to start the literal also terminates it, so a string literal can only contain the other quote (except with escape sequences, see below). For example: -.. code-block:: plain +.. code-block:: text 'Say "Hello", please.' "Don't do that!" @@ -531,7 +535,7 @@ either ``'`` or ``"``.) For example: -.. code-block:: plain +.. code-block:: text """This is a triple-quoted string with "quotes" inside.""" @@ -560,7 +564,7 @@ String prefixes String literals can have an optional :dfn:`prefix` that influences how the literal is parsed, for example: -.. code-block:: plain +.. code-block:: python b"data" f'{result=}' From e44fa66cf2da63763a3ed37f7d59da28e95c785c Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 11 Jun 2025 17:59:01 +0200 Subject: [PATCH 3/8] WIP --- Doc/reference/grammar.rst | 5 +- Doc/reference/introduction.rst | 16 +++-- Doc/reference/lexical_analysis.rst | 110 +++++++++++++++++------------ 3 files changed, 76 insertions(+), 55 deletions(-) diff --git a/Doc/reference/grammar.rst b/Doc/reference/grammar.rst index 55c148801d8559..1037feb691f6bc 100644 --- a/Doc/reference/grammar.rst +++ b/Doc/reference/grammar.rst @@ -10,11 +10,8 @@ error recovery. 
The notation used here is the same as in the preceding docs, and is described in the :ref:`notation ` section, -except for a few extra complications: +except for an extra complication: -* ``&e``: a positive lookahead (that is, ``e`` is required to match but - not consumed) -* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match) * ``~`` ("cut"): commit to the current alternative and fail the rule even if this fails to parse diff --git a/Doc/reference/introduction.rst b/Doc/reference/introduction.rst index 444acac374a690..c62240b18cfe55 100644 --- a/Doc/reference/introduction.rst +++ b/Doc/reference/introduction.rst @@ -145,15 +145,23 @@ The definition to the right of the colon uses the following syntax elements: * ``e?``: A question mark has exactly the same meaning as square brackets: the preceding item is optional. * ``(e)``: Parentheses are used for grouping. + +The following notation is only used in +:ref:`lexical definitions `. + * ``"a"..."z"``: Two literal characters separated by three dots mean a choice of any single character in the given (inclusive) range of ASCII characters. - This notation is only used in - :ref:`lexical definitions `. * ``<...>``: A phrase between angular brackets gives an informal description of the matched symbol (for example, ````), or an abbreviation that is defined in nearby text (for example, ````). - This notation is only used in - :ref:`lexical definitions `. + +.. _lexical-lookaheads: + +Some definitions also use *lookaheads*, which indicate that an element +must (or must not) match at a given position, but without consuming any input: + +* ``&e``: a positive lookahead (that is, ``e`` is required to match) +* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match) The unary operators (``*``, ``+``, ``?``) bind as tightly as possible; the vertical bar (``|``) binds most loosely. 
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 6f3d90f89b98d3..67cc9bd8fc7bac 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -39,7 +39,8 @@ The end of a logical line is represented by the token :data:`~token.NEWLINE`. Statements cannot cross logical line boundaries except where :data:`!NEWLINE` is allowed by the syntax (e.g., between statements in compound statements). A logical line is constructed from one or more *physical lines* by following -the explicit or implicit *line joining* rules. +the :ref:`explicit ` or :ref:`implicit ` +*line joining* rules. .. _physical-lines: @@ -47,17 +48,28 @@ the explicit or implicit *line joining* rules. Physical lines -------------- -A physical line is a sequence of characters terminated by an end-of-line -sequence. In source files and strings, any of the standard platform line -termination sequences can be used - the Unix form using ASCII LF (linefeed), -the Windows form using the ASCII sequence CR LF (return followed by linefeed), -or the old Macintosh form using the ASCII CR (return) character. All of these -forms can be used equally, regardless of platform. The end of input also serves -as an implicit terminator for the final physical line. +A physical line is a sequence of characters terminated by one of the following +end-of-line sequences: -When embedding Python, source code strings should be passed to Python APIs using -the standard C conventions for newline characters (the ``\n`` character, -representing ASCII LF, is the line terminator). +* the Unix form using ASCII LF (linefeed), +* the Windows form using the ASCII sequence CR LF (return followed by linefeed), +* the old Macintosh form using the ASCII CR (return) character. + +Regardless of platform, each of these sequences is replaced by a single +ASCII LF (linefeed) character. +(This is done even inside :ref:`string literals `.) 
+Each line can use any of the sequences; they do not need to be consistent +within a file. + +The end of input also serves as an implicit terminator for the final +physical line. + +Formally: + +.. grammar-snippet:: + :group: python-grammar + + newline: | | .. _comments: @@ -484,6 +496,13 @@ Literals Literals are notations for constant values of some built-in types. +In terms of lexical analysis, Python has :ref:`string, bytes ` +and :ref:`numeric ` literals. + +Other “literals” are lexically denoted using :ref:`keywords ` +(``None``, ``True``, ``False``) and the special +:ref:`ellipsis token ` (``...``): + .. index:: string literal, bytes literal, ASCII single: ' (single quote); string literal @@ -491,7 +510,7 @@ Literals are notations for constant values of some built-in types. .. _strings: String and Bytes literals -------------------------- +========================= String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: @@ -635,41 +654,26 @@ They may not be combined with ``'b'``, ``'u'``, or each other. String literals, except "F-strings" and "T-strings", are described by the -following lexical definitions: +following lexical definitions. + +These definitions use :ref:`negative lookaheads ` (``!``) +to indicate that an ending quote ends the literal. .. 
grammar-snippet:: :group: python-grammar - STRING: stringliteral | bytesliteral | fstring | tstring - - stringliteral: [`stringprefix`](`stringcontent`) - stringprefix: <("r" | "u"), case-insensitive> - stringcontent: `quote` `stringitem`* - quote: "'" | '"' | "'''" | '"""' + STRING: [`stringprefix`] (`stringcontent`) + stringprefix: <("r" | "u" | "b" | "br" | "rb"), case-insensitive> + stringcontent: + | "'" ( !"'" `stringitem`)* "'" + | '"' ( !'"' `stringitem`)* '"' + | "'''" ( !"'''" `longstringitem`)* "'''" + | '"""' ( !'"""' `longstringitem`)* '"""' stringitem: `stringchar` | `stringescapeseq` - stringchar: + stringchar: + longstringitem: `stringitem` | newline stringescapeseq: "\" -``stringchar`` can not include: - -- the backslash, ``\``; -- in triple-quoted strings (quoted by ``'''`` or ``"""``), the newline; -- the quote character. - - -.. grammar-snippet:: - :group: python-grammar - - bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) - bytesprefix: <("b" | "br" | "rb" ), case-insensitive> - shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' - longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' - shortbytesitem: `shortbyteschar` | `bytesescapeseq` - longbytesitem: `longbyteschar` | `bytesescapeseq` - shortbyteschar: - longbyteschar: - bytesescapeseq: "\" - Note that as in all lexical definitions, whitespace is significant. The prefix, if any, must be followed immediately by the quoted string content. @@ -692,7 +696,7 @@ The prefix, if any, must be followed immediately by the quoted string content. .. _escape-sequences: Escape sequences -^^^^^^^^^^^^^^^^ +---------------- Unless an ``'r'`` or ``'R'`` prefix is present, escape sequences in string and bytes literals are interpreted according to rules similar to those used by @@ -985,7 +989,7 @@ and :meth:`str.format`, which uses a related format string mechanism. .. _numbers: Numeric literals ----------------- +================ .. 
index:: number, numeric literal, integer literal floating-point literal, hexadecimal literal @@ -1241,14 +1245,26 @@ The following tokens serve as delimiters in the grammar: ( ) [ ] { } , : ! . ; @ = + +The period can also occur in floating-point and imaginary literals. + +.. _lexical-ellipsis: + +A sequence of three periods has a special meaning as an +:py:data:`Ellipsis` literal: + +.. code-block:: none + + ... + +The following *augmented assignment operators* serve +lexically as delimiters, but also perform an operation: + +.. code-block:: none + -> += -= *= /= //= %= @= &= |= ^= >>= <<= **= -The period can also occur in floating-point and imaginary literals. A sequence -of three periods has a special meaning as an ellipsis literal. The second half -of the list, the augmented assignment operators, serve lexically as delimiters, -but also perform an operation. - The following printing ASCII characters have special meaning as part of other tokens or are otherwise significant to the lexical analyzer: From 86bf94b0f4cc9f9eaa63728610d7bb71fc4f3107 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 18 Jun 2025 18:05:31 +0200 Subject: [PATCH 4/8] More WIP --- Doc/reference/lexical_analysis.rst | 424 +++++++++++++++++------------ 1 file changed, 251 insertions(+), 173 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 67cc9bd8fc7bac..36abfa31c093c9 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -501,7 +501,7 @@ and :ref:`numeric ` literals. Other “literals” are lexically denoted using :ref:`keywords ` (``None``, ``True``, ``False``) and the special -:ref:`ellipsis token ` (``...``): +:ref:`ellipsis token ` (``...``). .. index:: string literal, bytes literal, ASCII @@ -519,7 +519,7 @@ quotes (``"``). For example: writing, highlighted strings don't look good when there's no code surrounding them. -.. code-block:: text +.. 
code-block:: python "spam" 'eggs' @@ -528,7 +528,7 @@ The quote used to start the literal also terminates it, so a string literal can only contain the other quote (except with escape sequences, see below). For example: -.. code-block:: text +.. code-block:: python 'Say "Hello", please.' "Don't do that!" @@ -536,6 +536,21 @@ For example: Except for this limitation, the choice of quote character (``'`` or ``"``) does not affect how the literal is parsed. +Inside a string literal, the backslash (``\``) character introduces an +:dfn:`escape sequence`, which has special meaning depending on the character +after the backslash. +For example, ``\"`` denotes the double quote character, and does *not* end +the string: + +.. code-block:: python + + >>> print("Say \"Hello\" to everyone!") + Say "Hello" to everyone! + +See :ref:`escape sequences ` below for a full list of such +sequences, and more details. + + .. index:: triple-quoted string single: """; string literal single: '''; string literal @@ -545,32 +560,20 @@ Triple-quoted strings Strings can also be enclosed in matching groups of three single or double quotes. -These are generally referred to as :dfn:`triple-quoted strings`. +These are generally referred to as :dfn:`triple-quoted strings`:: -In triple-quoted literals, unescaped newlines and quotes are allowed (and are -retained), except that three unescaped quotes in a row terminate the literal. -(Here, a *quote* is the character used to open the literal, that is, -either ``'`` or ``"``.) + """This is a triple-quoted string.""" -For example: +In triple-quoted literals, unescaped quotes are allowed (and are +retained), except that three unescaped quotes in a row terminate the literal, +if they are of the same kind (``'`` or ``"``) used at the start:: -.. code-block:: text + """This string has "quotes" inside.""" - """This is a triple-quoted string with "quotes" inside.""" +Unescaped newlines are also allowed and retained:: - '''Another triple-quoted string. 
This one continues - on the next line.''' - -Escape sequences ----------------- - -Inside a string literal, the backslash (``\``) character introduces an -:dfn:`escape sequence`, which has special meaning depending on the character -after the backslash. -For example, ``\n`` denotes the 'newline' character, rather the two characters -``\`` and ``n``. -See :ref:`escape sequences ` below for a full list of such -sequences, and more details. + '''This triple-quoted string + continues on the next line.''' .. index:: @@ -580,70 +583,28 @@ sequences, and more details. String prefixes --------------- -String literals can have an optional :dfn:`prefix` that influences how the literal -is parsed, for example: +String literals can have an optional :dfn:`prefix` that influences how the +content of the literal is parsed, for example: .. code-block:: python b"data" f'{result=}' -* ``r``: Raw string -* ``f``: "F-string" -* ``t``: "T-string" -* ``b``: Byte literal +The allowed prefixes are: + +* ``b``: :ref:`Bytes literal ` +* ``r``: :ref:`Raw string ` +* ``f``: :ref:`Formatted string literal ` ("f-string") +* ``t``: :ref:`Template string literal ` ("t-string") * ``u``: No effect (allowed for backwards compatibility) +See the linked sections for details on each type. + Prefixes are case-insensitive (for example, ``B`` works the same as ``b``). The ``r`` prefix can be combined with ``f``, ``t`` or ``b``, so ``fr``, ``rf``, ``tr``, ``rt``, ``br`` and ``rb`` are also valid prefixes. - -.. index:: - single: b'; bytes literal - single: b"; bytes literal - -:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an -instance of the :class:`bytes` type instead of the :class:`str` type. -They may only contain ASCII characters; bytes with a numeric value of 128 -or greater must be expressed with escape sequences. -Similarly, a zero byte must be expressed using an escape sequence. - - -.. 
index:: - single: r'; raw string literal - single: r"; raw string literal - -Both string and bytes literals may optionally be prefixed with a letter ``'r'`` -or ``'R'``; such constructs are called :dfn:`raw string literals` -and :dfn:`raw bytes literals` respectively and treat backslashes as -literal characters. -As a result, in raw string literals, :ref:`escape sequences ` -escapes are not treated specially. - -Even in a raw literal, quotes can be escaped with a backslash, but the -backslash remains in the result; for example, ``r"\""`` is a valid string -literal consisting of two characters: a backslash and a double quote; ``r"\"`` -is not a valid string literal (even a raw string cannot end in an odd number of -backslashes). Specifically, *a raw literal cannot end in a single backslash* -(since the backslash would escape the following quote character). Note also -that a single backslash followed by a newline is interpreted as those two -characters as part of the literal, *not* as a line continuation. - - -.. index:: - single: f'; formatted string literal - single: f"; formatted string literal - -A string literal with ``'f'`` or ``'F'`` in its prefix is a -:dfn:`formatted string literal`; see :ref:`f-strings`. -Similarly, string literal with ``'t'`` or ``'T'`` in its prefix is a -:dfn:`template string literal`; see :ref:`t-strings`. - -The ``'f'`` or ``t`` may be combined with ``'r'`` to create a -:dfn:`raw formatted string` or :dfn:`raw template string`. -They may not be combined with ``'b'``, ``'u'``, or each other. - .. versionadded:: 3.3 The ``'rb'`` prefix of raw bytes literals has been added as a synonym of ``'br'``. @@ -653,7 +614,11 @@ They may not be combined with ``'b'``, ``'u'``, or each other. See :pep:`414` for more information. 
-String literals, except "F-strings" and "T-strings", are described by the +Formal grammar +-------------- + +String literals, except :ref:`"F-strings" ` and +:ref:`"T-strings" `, are described by the following lexical definitions. These definitions use :ref:`negative lookaheads ` (``!``) @@ -675,23 +640,8 @@ to indicate that an ending quote ends the literal. stringescapeseq: "\" Note that as in all lexical definitions, whitespace is significant. -The prefix, if any, must be followed immediately by the quoted string content. - - -.. index:: physical line, escape sequence, Standard C, C - single: \ (backslash); escape sequence - single: \\; escape sequence - single: \a; escape sequence - single: \b; escape sequence - single: \f; escape sequence - single: \n; escape sequence - single: \r; escape sequence - single: \t; escape sequence - single: \v; escape sequence - single: \x; escape sequence - single: \N; escape sequence - single: \u; escape sequence - single: \U; escape sequence +In particular, the prefix (if any) must be immediately followed by the starting +quote. .. _escape-sequences: @@ -702,55 +652,50 @@ Unless an ``'r'`` or ``'R'`` prefix is present, escape sequences in string and bytes literals are interpreted according to rules similar to those used by Standard C. 
The recognized escape sequences are: -+-------------------------+---------------------------------+-------+ -| Escape Sequence | Meaning | Notes | -+=========================+=================================+=======+ -| ``\``\ | Backslash and newline ignored | \(1) | -+-------------------------+---------------------------------+-------+ -| ``\\`` | Backslash (``\``) | | -+-------------------------+---------------------------------+-------+ -| ``\'`` | Single quote (``'``) | | -+-------------------------+---------------------------------+-------+ -| ``\"`` | Double quote (``"``) | | -+-------------------------+---------------------------------+-------+ -| ``\a`` | ASCII Bell (BEL) | | -+-------------------------+---------------------------------+-------+ -| ``\b`` | ASCII Backspace (BS) | | -+-------------------------+---------------------------------+-------+ -| ``\f`` | ASCII Formfeed (FF) | | -+-------------------------+---------------------------------+-------+ -| ``\n`` | ASCII Linefeed (LF) | | -+-------------------------+---------------------------------+-------+ -| ``\r`` | ASCII Carriage Return (CR) | | -+-------------------------+---------------------------------+-------+ -| ``\t`` | ASCII Horizontal Tab (TAB) | | -+-------------------------+---------------------------------+-------+ -| ``\v`` | ASCII Vertical Tab (VT) | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\\\{ooo}` | Character with octal value | (2,4) | -| | *ooo* | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\x{hh}` | Character with hex value *hh* | (3,4) | -+-------------------------+---------------------------------+-------+ - -Escape sequences only recognized in string literals are: - -+-------------------------+---------------------------------+-------+ -| Escape Sequence | Meaning | Notes | -+=========================+=================================+=======+ -| :samp:`\\N\\{{name}\\}` | Character named *name* 
in the | \(5) | -| | Unicode database | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\u{xxxx}` | Character with 16-bit hex value | \(6) | -| | *xxxx* | | -+-------------------------+---------------------------------+-------+ -| :samp:`\\U{xxxxxxxx}` | Character with 32-bit hex value | \(7) | -| | *xxxxxxxx* | | -+-------------------------+---------------------------------+-------+ - -Notes: - -(1) +.. list-table:: + :widths: auto + :header-rows: 1 + + * * Escape Sequence + * Meaning + * * ``\``\ + * :ref:`string-escape-ignore` + * * ``\\`` + * :ref:`Backslash ` + * * ``\'`` + * :ref:`Single quote ` + * * ``\"`` + * :ref:`Double quote ` + * * ``\a`` + * ASCII Bell (BEL) + * * ``\b`` + * ASCII Backspace (BS) + * * ``\f`` + * ASCII Formfeed (FF) + * * ``\n`` + * ASCII Linefeed (LF) + * * ``\r`` + * ASCII Carriage Return (CR) + * * ``\t`` + * ASCII Horizontal Tab (TAB) + * * ``\v`` + * ASCII Vertical Tab (VT) + * * :samp:`\\\\{ooo}` + * :ref:`string-escape-oct` + * * :samp:`\\x{hh}` + * :ref:`string-escape-hex` + * * :samp:`\\N\\{{name}\\}` + * :ref:`string-escape-named` + * * :samp:`\\u{xxxx}` + * :ref:`Hexadecimal Unicode character ` + * * :samp:`\\U{xxxxxxxx}` + * :ref:`Hexadecimal Unicode character ` + +.. _string-escape-ignore: + +Ignored end of line +^^^^^^^^^^^^^^^^^^^ + A backslash can be added at the end of a line to ignore the newline:: >>> 'This string will not include \ @@ -760,9 +705,39 @@ Notes: The same result can be achieved using :ref:`triple-quoted strings `, or parentheses and :ref:`string literal concatenation `. +.. _string-escape-escaped-char: + +Escaped characters +^^^^^^^^^^^^^^^^^^ -(2) - As in Standard C, up to three octal digits (0 through 7) are accepted. + To include a backslash in a non-:ref:`raw ` Python string + literal, it must be doubled. 
The ``\\`` escape sequence denotes a single + backslash character:: + + >>> print('C:\\Program Files') + C:\Program Files + + Similarly, the ``\'`` and ``\"`` sequences denote the single and double + quote character, respectively:: + + >>> print('\' and \"') + ' and " + +.. _string-escape-oct: + +Octal character +^^^^^^^^^^^^^^^ + + The sequence :samp:`\\\\{ooo}` denotes a *character* with the octal (base 8) + value *ooo*:: + + >>> '\120' + 'P' + + Up to three octal digits (0 through 7) are accepted. + + In a bytes literal, *character* means a *byte* with the given value. + In a string literal, it means a Unicode character with the given value. .. versionchanged:: 3.11 Octal escapes with value larger than ``0o377`` (255) produce a @@ -770,42 +745,147 @@ Notes: .. versionchanged:: 3.12 Octal escapes with value larger than ``0o377`` (255) produce a - :exc:`SyntaxWarning`. In a future Python version they will be eventually - a :exc:`SyntaxError`. + :exc:`SyntaxWarning`. + In a future Python version they will raise a :exc:`SyntaxError`. + +.. _string-escape-hex: + +Hexadecimal character +^^^^^^^^^^^^^^^^^^^^^ + + The sequence :samp:`\\x{hh}` denotes a *character* with the hex (base 16) + value *hh*:: + + >>> '\x50' + 'P' + + Unlike in Standard C, exactly two hex digits are required. + + In a bytes literal, *character* means a *byte* with the given value. + In a string literal, it means a Unicode character with the given value. + +.. _string-escape-named: + +Named Unicode character +^^^^^^^^^^^^^^^^^^^^^^^ + + The sequence :samp:`\\N\\{{name}\\}` denotes a Unicode character + with the given *name*:: + + >>> '\N{LATIN CAPITAL LETTER P}' + 'P' + >>> '\N{SNAKE}' + '🐍' + + This sequence cannot appear in :ref:`bytes literals `. + + .. versionchanged:: 3.3 + Support for `name aliases `__ + has been added. -(3) - Unlike in Standard C, exactly two hex digits are required. +.. 
_string-escape-long-hex: -(4) - In a bytes literal, hexadecimal and octal escapes denote the byte with the - given value. In a string literal, these escapes denote a Unicode character - with the given value. +Hexadecimal Unicode characters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -(5) - .. versionchanged:: 3.3 - Support for name aliases [#]_ has been added. + These sequences :samp:`\\u{xxxx}` and :samp:`\\U{xxxxxxxx}` denote the + Unicode character with the given hex (base 16) value. + Exactly four digits are required for ``\u``; exactly eight digits are + required for ``\U``. + The latter can encode any Unicode character. -(6) - Exactly four hex digits are required. + .. code-block:: python -(7) - Any Unicode character can be encoded this way. Exactly eight hex digits - are required. + >>> '\u1234' + 'ሴ' + >>> '\U0001f40d' + '🐍' + + These sequences cannot appear in :ref:`bytes literals `. .. index:: unrecognized escape sequence -Unlike Standard C, all unrecognized escape sequences are left in the string -unchanged, i.e., *the backslash is left in the result*. +Unrecognized escape sequences +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Unlike in Standard C, all unrecognized escape sequences are left in the string +unchanged, that is, *the backslash is left in the result*:: + + >>> print('\q') + \q + >>> list('\q') + ['\\', 'q'] + Note that for bytes literals, the escape sequences only recognized in string -literals fall into the category of unrecognized escapes. +literals (``\N...``, ``\u...``, ``\U...``) fall into the category of +unrecognized escapes. .. versionchanged:: 3.6 Unrecognized escape sequences produce a :exc:`DeprecationWarning`. .. versionchanged:: 3.12 - Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future - Python version they will be eventually a :exc:`SyntaxError`. + Unrecognized escape sequences produce a :exc:`SyntaxWarning`. + In a future Python version they will raise a :exc:`SyntaxError`. + + +.. 
index:: + single: b'; bytes literal + single: b"; bytes literal + + +.. _bytes-literal: + +Bytes literals +-------------- + +:dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an +instance of the :class:`bytes` type instead of the :class:`str` type. +They may only contain ASCII characters; bytes with a numeric value of 128 +or greater must be expressed with escape sequences. +Similarly, a zero byte must be expressed using an escape sequence. + + +.. index:: + single: r'; raw string literal + single: r"; raw string literal + +.. _raw-strings: + +Raw string literals +------------------- + +Both string and bytes literals may optionally be prefixed with a letter ``'r'`` +or ``'R'``; such constructs are called :dfn:`raw string literals` +and :dfn:`raw bytes literals` respectively and treat backslashes as +literal characters. +As a result, in raw string literals, :ref:`escape sequences ` +escapes are not treated specially. + +Even in a raw literal, quotes can be escaped with a backslash, but the +backslash remains in the result; for example, ``r"\""`` is a valid string +literal consisting of two characters: a backslash and a double quote; ``r"\"`` +is not a valid string literal (even a raw string cannot end in an odd number of +backslashes). Specifically, *a raw literal cannot end in a single backslash* +(since the backslash would escape the following quote character). Note also +that a single backslash followed by a newline is interpreted as those two +characters as part of the literal, *not* as a line continuation. + + +.. 
index:: physical line, escape sequence, Standard C, C + single: \ (backslash); escape sequence + single: \\; escape sequence + single: \a; escape sequence + single: \b; escape sequence + single: \f; escape sequence + single: \n; escape sequence + single: \r; escape sequence + single: \t; escape sequence + single: \v; escape sequence + single: \x; escape sequence + single: \N; escape sequence + single: \u; escape sequence + single: \U; escape sequence .. index:: @@ -815,6 +895,8 @@ literals fall into the category of unrecognized escapes. single: string; interpolated literal single: f-string single: fstring + single: f'; formatted string literal + single: f"; formatted string literal single: {} (curly brackets); in formatted string literal single: ! (exclamation); in formatted string literal single: : (colon); in formatted string literal @@ -1022,7 +1104,7 @@ actually an expression composed of the unary operator '``-``' and the literal .. _integers: Integer literals -^^^^^^^^^^^^^^^^ +---------------- Integer literals denote whole numbers. For example:: @@ -1095,7 +1177,7 @@ Formally, integer literals are described by the following lexical definitions: .. _floating: Floating-point literals -^^^^^^^^^^^^^^^^^^^^^^^ +----------------------- Floating-point (float) literals, such as ``3.14`` or ``1.5``, denote :ref:`approximations of real numbers `. @@ -1157,7 +1239,7 @@ lexical definitions: .. _imaginary: Imaginary literals -^^^^^^^^^^^^^^^^^^ +------------------ Python has :ref:`complex number ` objects, but no complex literals. @@ -1279,7 +1361,3 @@ occurrence outside string literals and comments is an unconditional error: $ ? ` - -.. rubric:: Footnotes - -.. 
[#] https://www.unicode.org/Public/16.0.0/ucd/NameAliases.txt From faf05a192ed7ec80ab26e803544ce9585b59d583 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:26:37 +0200 Subject: [PATCH 5/8] Byte strings, raw strings; f-string stub --- Doc/reference/lexical_analysis.rst | 65 +++++++++++++++++++++--------- 1 file changed, 46 insertions(+), 19 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 36abfa31c093c9..2c6ae9a16d0d08 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -643,6 +643,21 @@ Note that as in all lexical definitions, whitespace is significant. In particular, the prefix (if any) must be immediately followed by the starting quote. +.. index:: physical line, escape sequence, Standard C, C + single: \ (backslash); escape sequence + single: \\; escape sequence + single: \a; escape sequence + single: \b; escape sequence + single: \f; escape sequence + single: \n; escape sequence + single: \r; escape sequence + single: \t; escape sequence + single: \v; escape sequence + single: \x; escape sequence + single: \N; escape sequence + single: \u; escape sequence + single: \U; escape sequence + .. _escape-sequences: Escape sequences @@ -842,8 +857,18 @@ Bytes literals :dfn:`Bytes literals` are always prefixed with ``'b'`` or ``'B'``; they produce an instance of the :class:`bytes` type instead of the :class:`str` type. They may only contain ASCII characters; bytes with a numeric value of 128 -or greater must be expressed with escape sequences. -Similarly, a zero byte must be expressed using an escape sequence. +or greater must be expressed with escape sequences (typically +:ref:`string-escape-hex` or :ref:`string-escape-oct`): + +.. 
code-block:: python + + >>> b'\x89PNG\r\n\x1a\n' + b'\x89PNG\r\n\x1a\n' + >>> list(b'\x89PNG\r\n\x1a\n') + [137, 80, 78, 71, 13, 10, 26, 10] + +Similarly, a zero byte must be expressed using an escape sequence (typically +``\0`` or ``\x00``). .. index:: @@ -860,7 +885,12 @@ or ``'R'``; such constructs are called :dfn:`raw string literals` and :dfn:`raw bytes literals` respectively and treat backslashes as literal characters. As a result, in raw string literals, :ref:`escape sequences ` -escapes are not treated specially. +are not treated specially: + +.. code-block:: python + + >>> r'\d{4}-\d{2}-\d{2}' + '\\d{4}-\\d{2}-\\d{2}' Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, ``r"\""`` is a valid string @@ -872,22 +902,6 @@ that a single backslash followed by a newline is interpreted as those two characters as part of the literal, *not* as a line continuation. -.. index:: physical line, escape sequence, Standard C, C - single: \ (backslash); escape sequence - single: \\; escape sequence - single: \a; escape sequence - single: \b; escape sequence - single: \f; escape sequence - single: \n; escape sequence - single: \r; escape sequence - single: \t; escape sequence - single: \v; escape sequence - single: \x; escape sequence - single: \N; escape sequence - single: \u; escape sequence - single: \U; escape sequence - - .. index:: single: formatted string literal single: interpolated string literal @@ -1067,6 +1081,19 @@ include expressions. See also :pep:`498` for the proposal that added formatted string literals, and :meth:`str.format`, which uses a related format string mechanism. +.. _t-strings: +.. _template-string-literals: + +t-strings +--------- + +A :dfn:`template string literal` or :dfn:`t-string` is a string literal that +is prefixed with ``'t'`` or ``'T'``. +These strings have internal structure similar to :ref:`f-strings`, +but are evaluated as Template objects instead of strings. + +.. 
versionadded:: 3.14 + .. _numbers: From 687fe5830318ca89a5541703bae3e62b3c8a7b5e Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:38:09 +0200 Subject: [PATCH 6/8] Remove outdated comment --- Doc/reference/lexical_analysis.rst | 4 ---- 1 file changed, 4 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 2c6ae9a16d0d08..e3d0bab8942ced 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -515,10 +515,6 @@ String and Bytes literals String literals are text enclosed in single quotes (``'``) or double quotes (``"``). For example: -.. This is Python code, but we turn off highlighting because as of this - writing, highlighted strings don't look good when there's no code - surrounding them. - .. code-block:: python "spam" From 9f9d29ccab8a5c25aa9433a90bd03d2a5521c36b Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 25 Jun 2025 16:50:18 +0200 Subject: [PATCH 7/8] Fix ReST errors --- Doc/reference/expressions.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 743d43b1c9c1b1..c1f046388c3d1b 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -160,7 +160,7 @@ value. .. _string-concatenation: String literal concatenation -............................ +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Multiple adjacent string or bytes literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same @@ -172,7 +172,7 @@ Formally: .. grammar-snippet:: :group: python-grammar - strings: ( `STRING` | `fstring` | `tstring`)+ + strings: ( `STRING` | fstring | tstring)+ Note that this feature is defined at the syntactical level, so it only works with literals. 
From 11e37317c24523f187630e137537caab218af2d4 Mon Sep 17 00:00:00 2001
From: Petr Viktorin
Date: Wed, 25 Jun 2025 16:51:31 +0200
Subject: [PATCH 8/8] Update Doc/reference/expressions.rst

Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
---
 Doc/reference/expressions.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst
index c1f046388c3d1b..0803fadd0eeb37 100644
--- a/Doc/reference/expressions.rst
+++ b/Doc/reference/expressions.rst
@@ -143,7 +143,7 @@ integer, floating-point number, complex number) with the given value. The value
 may be approximated in the case of floating-point and imaginary (complex)
 literals.
 See section :ref:`literals` for details.
-Seee section :ref:`string-concatenation` for details on ``strings``.
+See section :ref:`string-concatenation` for details on ``strings``.
 
 
 .. index::
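
Reviewer's sketch (not part of the patch series): the snippet below collects a few of the literals that the documentation text in these patches uses as examples, so the stated results can be checked in one place. The function name ``check_literal_examples`` and the dictionary keys are hypothetical, chosen only for illustration. The unrecognized-escape example (``'\q'``) is left out on purpose, since it triggers a warning on recent Python versions.

```python
def check_literal_examples():
    """Evaluate literals quoted in the patched docs and return their values."""
    return {
        # Octal and hexadecimal escapes both denote the character 'P' (value 80).
        "octal": '\120',
        "hex": '\x50',
        # Named and 32-bit hexadecimal Unicode escapes for the same character.
        "snake_named": '\N{SNAKE}',
        "snake_hex": '\U0001f40d',
        # Bytes literal: values of 128 or more are written with escapes.
        "bytes_values": list(b'\x89PNG\r\n\x1a\n'),
        # Raw string: the backslash stays in the result.
        "raw": r'\d{4}',
        # Raw string with an escaped quote: a backslash and a double quote.
        "raw_quote": r"\"",
        # Adjacent literals are concatenated at the syntactic level.
        "joined": "hello" 'world',
    }


if __name__ == "__main__":
    for key, value in check_literal_examples().items():
        print(f"{key}: {value!r}")
```

Returning the values in a dictionary, rather than printing them inline, keeps each literal next to the comment naming the rule it illustrates.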