@@ -39,25 +39,37 @@ The end of a logical line is represented by the token :data:`~token.NEWLINE`.
3939Statements cannot cross logical line boundaries except where :data:`!NEWLINE`
4040is allowed by the syntax (e.g., between statements in compound statements).
4141A logical line is constructed from one or more *physical lines* by following
42- the explicit or implicit *line joining* rules.
42+ the :ref:`explicit <explicit-joining>` or :ref:`implicit <implicit-joining>`
43+ *line joining* rules.
4344
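As an illustration of how physical lines combine into logical lines, the
following sketch counts ``NEWLINE`` tokens with the standard :mod:`tokenize`
module. The sample source and the helper name ``count_logical_lines`` are
made up for this example; it is not part of the formal definition.

```python
import io
import tokenize

# Four physical lines: one implicit join (inside parentheses)
# and one explicit join (trailing backslash).
SOURCE = (
    "total = (1 +\n"
    "         2)\n"
    "other = 3 + \\\n"
    "    4\n"
)


def count_logical_lines(source: str) -> int:
    """Count NEWLINE tokens, each of which ends one logical line."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type == tokenize.NEWLINE)


physical = len(SOURCE.splitlines())
logical = count_logical_lines(SOURCE)
print(physical, logical)
```

The line break inside the parentheses and the one after the backslash do not
end a logical line, so the four physical lines form two logical lines.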
4445
4546.. _physical-lines :
4647
4748Physical lines
4849--------------
4950
50- A physical line is a sequence of characters terminated by an end-of-line
51- sequence. In source files and strings, any of the standard platform line
52- termination sequences can be used - the Unix form using ASCII LF (linefeed),
53- the Windows form using the ASCII sequence CR LF (return followed by linefeed),
54- or the old Macintosh form using the ASCII CR (return) character. All of these
55- forms can be used equally, regardless of platform. The end of input also serves
56- as an implicit terminator for the final physical line.
51+ A physical line is a sequence of characters terminated by one of the following
52+ end-of-line sequences:
5753
58- When embedding Python, source code strings should be passed to Python APIs using
59- the standard C conventions for newline characters (the ``\n `` character,
60- representing ASCII LF, is the line terminator).
54+ * the Unix form using ASCII LF (linefeed),
55+ * the Windows form using the ASCII sequence CR LF (return followed by linefeed),
56+ * the old Macintosh form using the ASCII CR (return) character.
57+
58+ Regardless of platform, each of these sequences is replaced by a single
59+ ASCII LF (linefeed) character.
60+ (This is done even inside :ref:`string literals <strings>`.)
61+ Each line can use any of the sequences; they do not need to be consistent
62+ within a file.
63+
64+ The end of input also serves as an implicit terminator for the final
65+ physical line.
66+
67+ Formally:
68+
69+ .. grammar-snippet ::
70+ :group: python-grammar
71+
72+ newline: <ASCII LF> | <ASCII CR> <ASCII LF> | <ASCII CR>
6173
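A minimal sketch of the end-of-line replacement, assuming the source is handed
to the built-in :func:`eval` as a string (the dictionary and helper names are
invented for this example). Each source string contains the same triple-quoted
literal with a different end-of-line sequence inside it.

```python
# Each source string holds the same triple-quoted literal, but with a
# different end-of-line sequence between "a" and "b".
SOURCES = {
    "LF": "'''a\nb'''",
    "CR LF": "'''a\r\nb'''",
    "CR": "'''a\rb'''",
}


def evaluate_all(sources):
    """Evaluate each (fixed, trusted) source string as a Python expression."""
    return {name: eval(src) for name, src in sources.items()}


VALUES = evaluate_all(SOURCES)
for name, value in VALUES.items():
    print(f"{name}: {value!r}")
```

All three literals should evaluate to the same three-character string, since
the end-of-line sequence is replaced by a single LF before the string literal
is read.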
6274
6375.. _comments :
@@ -484,14 +496,21 @@ Literals
484496
485497Literals are notations for constant values of some built-in types.
486498
499+ In terms of lexical analysis, Python has :ref:`string, bytes <strings>`
500+ and :ref:`numeric <numbers>` literals.
501+
502+ Other “literals” are lexically denoted using :ref:`keywords <keywords>`
503+ (``None``, ``True``, ``False``) and the special
504+ :ref:`ellipsis token <lexical-ellipsis>` (``...``).
505+
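The distinction can be observed with the standard :mod:`tokenize` module. This
is only a sketch: the helper ``token_kinds`` and the sample input are
hypothetical, and :mod:`tokenize` reports generic token types (``NAME`` for
keywords, ``OP`` for the ellipsis token).

```python
import io
import tokenize


def token_kinds(source: str) -> list[tuple[str, str]]:
    """Return (token type name, token text) pairs, skipping layout tokens."""
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type not in skip
    ]


KINDS = token_kinds("'s' b'b' 42 None True ...\n")
for kind, text in KINDS:
    print(f"{kind:<7} {text}")
```

String, bytes and numeric literals get their own token types, while ``None``
and ``True`` are tokenized like names and ``...`` like an operator.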
487506
488507.. index :: string literal, bytes literal, ASCII
489508 single: ' (single quote); string literal
490509 single: " (double quote); string literal
491510.. _strings :
492511
493512String and Bytes literals
494- -------------------------
513+ =========================
495514
496515String literals are text enclosed in single quotes (``' ``) or double
497516quotes (``" ``). For example:
@@ -635,41 +654,26 @@ They may not be combined with ``'b'``, ``'u'``, or each other.
635654
636655
637656String literals, except "F-strings" and "T-strings", are described by the
638- following lexical definitions:
657+ following lexical definitions.
658+
659+ These definitions use :ref:`negative lookaheads <lexical-lookaheads>` (``!``)
660+ to indicate that an ending quote ends the literal.
639661
640662.. grammar-snippet ::
641663 :group: python-grammar
642664
643- STRING: stringliteral | bytesliteral | fstring | tstring
644-
645- stringliteral: [`stringprefix `](`stringcontent `)
646- stringprefix: <("r" | "u"), case-insensitive>
647- stringcontent: `quote ` `stringitem`* <matching `quote `>
648- quote: "'" | '"' | "'''" | '"""'
665+ STRING: [`stringprefix`] (`stringcontent`)
666+ stringprefix: <("r" | "u" | "b" | "br" | "rb"), case-insensitive>
667+ stringcontent:
668+ | "'" ( !"'" `stringitem`)* "'"
669+ | '"' ( !'"' `stringitem`)* '"'
670+ | "'''" ( !"'''" `longstringitem`)* "'''"
671+ | '"""' ( !'"""' `longstringitem`)* '"""'
649672 stringitem: `stringchar` | `stringescapeseq`
650- stringchar: <any `source_character`, except as listed below>
673+ stringchar: <any `source_character`, except backslash and newline>
674+ longstringitem: `stringitem` | newline
651675 stringescapeseq: "\" <any `source_character`>
652676
653- ``stringchar `` can not include:
654-
655- - the backslash, ``\ ``;
656- - in triple-quoted strings (quoted by ``''' `` or ``""" ``), the newline;
657- - the quote character.
658-
659-
660- .. grammar-snippet ::
661- :group: python-grammar
662-
663- bytesliteral: `bytesprefix`(`shortbytes ` | `longbytes `)
664- bytesprefix: <("b" | "br" | "rb" ), case-insensitive>
665- shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"'
666- longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""'
667- shortbytesitem: `shortbyteschar ` | `bytesescapeseq `
668- longbytesitem: `longbyteschar ` | `bytesescapeseq `
669- shortbyteschar: <any ASCII `source_character ` except "\" or newline or the quote>
670- longbyteschar: <any ASCII `source_character ` except "\" >
671- bytesescapeseq: "\" <any ASCII `source_character `>
672-
673677Note that as in all lexical definitions, whitespace is significant.
674678The prefix, if any, must be followed immediately by the quoted string content.
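To illustrate the role of the negative lookahead, here is a rough translation
of the single-quoted alternative of ``stringcontent`` into a Python regular
expression. This is a sketch under the assumption that :mod:`re` lookaheads
behave closely enough to the grammar notation for short inputs; it ignores
prefixes and the other quote styles, and the names ``STRINGITEM``,
``SHORT_SINGLE`` and ``leading_literal`` are invented for the example.

```python
import re

# stringitem: stringchar | stringescapeseq
# stringchar is any character except backslash and newline;
# stringescapeseq is a backslash followed by any character.
STRINGITEM = r"(?:[^\\\n]|\\.)"

# "'" ( !"'" stringitem )* "'"
SHORT_SINGLE = re.compile(rf"'(?:(?!'){STRINGITEM})*'", re.DOTALL)


def leading_literal(text: str):
    """Return the single-quoted literal at the start of text, or None."""
    match = SHORT_SINGLE.match(text)
    return match.group(0) if match else None


print(leading_literal(r"'it\'s' + rest"))  # escaped quote stays inside
print(leading_literal("'ab'cd'"))          # first bare quote ends the literal
print(leading_literal("'a\nb'"))           # newline is not a stringchar
```

Without the ``(?!')`` lookahead, ``stringchar`` would also accept the quote
character, and the second input could match past the first closing quote.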
675679
@@ -692,7 +696,7 @@ The prefix, if any, must be followed immediately by the quoted string content.
692696.. _escape-sequences :
693697
694698Escape sequences
695- ^^^^^^^^^^^^^^^^
699+ ----------------
696700
697701Unless an ``'r' `` or ``'R' `` prefix is present, escape sequences in string and
698702bytes literals are interpreted according to rules similar to those used by
@@ -985,7 +989,7 @@ and :meth:`str.format`, which uses a related format string mechanism.
985989.. _numbers :
986990
987991Numeric literals
988- ----------------
992+ ================
989993
990994.. index :: number, numeric literal, integer literal
991995 floating-point literal, hexadecimal literal
@@ -1241,14 +1245,26 @@ The following tokens serve as delimiters in the grammar:
12411245
12421246 ( ) [ ] { }
12431247 , : ! . ; @ =
1248+
1249+ The period can also occur in floating-point and imaginary literals.
1250+
1251+ .. _lexical-ellipsis:
1252+
1253+ A sequence of three periods has a special meaning as an
1254+ :py:data:`Ellipsis` literal:
1255+
1256+ .. code-block:: none
1257+
1258+ ...
1259+
1260+ The following *augmented assignment operators* serve
1261+ lexically as delimiters, but also perform an operation:
1262+
1263+ .. code-block:: none
1264+
12441265 -> += -= *= /= //= %=
12451266 @= &= |= ^= >>= <<= **=
12461267
1247- The period can also occur in floating-point and imaginary literals. A sequence
1248- of three periods has a special meaning as an ellipsis literal. The second half
1249- of the list, the augmented assignment operators, serve lexically as delimiters,
1250- but also perform an operation.
1251-
12521268 The following printing ASCII characters have special meaning as part of other
12531269tokens or are otherwise significant to the lexical analyzer:
12541270