diff --git a/peps/pep-0798.rst b/peps/pep-0798.rst index 3162d56fc94..7fea147628a 100644 --- a/peps/pep-0798.rst +++ b/peps/pep-0798.rst @@ -119,6 +119,11 @@ existing syntax ``[x for it in its for x in it]`` is one that students often get wrong, the natural impulse for many students being to reverse the order of the ``for`` clauses. +Additionally, the comment section of a `Reddit post +`__ +following the publication of this PEP shows substantial support for the +proposal and further suggests that the syntax proposed here is legible, +intuitive, and useful. Specification ============= @@ -126,7 +131,7 @@ Specification Syntax ------ -The necessary grammatical changes are allowing the expression in list/set +The grammar should be changed to allow the expression in list/set comprehensions and generator expressions to be preceded by a ``*``, and allowing an alternative form of dictionary comprehension in which a double-starred expression can be used in place of a ``key: value`` pair. @@ -204,29 +209,42 @@ respectively:: for x in dicts: new_dict.update(expr) +.. _pep798-genexpsemantics: Semantics: Generator Expressions -------------------------------- -A generator expression ``(*expr for x in it)`` forms a generator producing -values from the concatenation of the iterables given by the expressions. -Specifically, the behavior is defined to be equivalent to the following -generator:: +Generator expressions using the unpacking syntax should form new generators +producing values from the concatenation of the iterables given by the +expressions. 
Specifically, the behavior is defined to be equivalent to the +following:: + # equivalent to g = (*expr for x in it) def generator(): for x in it: yield from expr + g = generator() + Since ``yield from`` is not allowed inside of async generators (see the section of :pep:`525` on Asynchronous ``yield from``), the equivalent for ``(*expr async for x in ait())`` is more like the following (though of course this new form should not define or reference the looping variable ``i``):: + # equivalent to g = (*expr async for x in ait()) async def generator(): async for x in ait(): for i in expr: yield i + g = generator() + +The specifics of these semantics should be revisited in the future, +particularly if async generators receive support for ``yield from`` (in which +case the async variant could be changed to make use of ``yield from`` +instead of an explicit loop). See :ref:`pep798-alternativegenexpsemantics` for +more discussion. + Interaction with Assignment Expressions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -330,7 +348,8 @@ cases: * The phrasing of some other existing error messages should similarly be adjusted to account for the presence of the new syntax, and/or to clarify ambiguous or confusing cases relating to unpacking more generally - (particularly those mentioned in :ref:`pep798-moregeneral`), for example:: + (particularly the cases mentioned in :ref:`pep798-moregeneral`), for + example:: >>> [*x if x else y] File "", line 1 @@ -362,9 +381,9 @@ cases: Reference Implementation ======================== -A `reference implementation `_ -is available, which implements this functionality, including draft documentation and -additional test cases. +The `reference implementation `_ +implements this functionality, including draft documentation and additional +test cases. 
Backwards Compatibility ======================= @@ -377,6 +396,15 @@ in comprehensions would raise a ``SyntaxError``, or that relied on the particular phrasing of any of the old error messages being replaced, which we expect to be rare. +One related concern is that a hypothetical future decision to change the +semantics of async generator expressions to make use of ``yield from`` during +unpacking (delegating to generators that are being unpacked) would not be +backwards-compatible because it would affect the behavior of the resulting +generators when used with ``.asend()``, ``.athrow()``, and ``.aclose()``. That +said, despite being backwards-incompatible, such a change would be unlikely to +have a large impact because it would only affect the behavior of structures +that, under this proposal, are not particularly useful. See +:ref:`pep798-alternativegenexpsemantics` for more discussion. .. _pep798-examples: @@ -642,7 +670,7 @@ resulting generator, but several alternatives were suggested in our discussion other aspects of this proposal are accepted. The reason to prefer this proposal over these alternatives is the preservation -of existent conventions for punctuation around generator expressions. +of existing conventions for punctuation around generator expressions. Currently, the general rule is that generator expressions must be wrapped in parentheses except when provided as the sole argument to a function, and this proposal suggests maintaining that rule even as we allow more kinds of @@ -700,6 +728,87 @@ PEP. As such, these forms should continue to raise a ``SyntaxError``, but with a new error message as described above, though it should not be ruled out as a consideration for future proposals. +.. 
_pep798-alternativegenexpsemantics: + +Alternative Generator Expression Semantics +------------------------------------------ + +Another point of discussion centered on the semantics of unpacking in +generator expressions, particularly the relationship between the semantics of +synchronous and asynchronous generator expressions given that async generators +do not support ``yield from`` (see the section of :pep:`525` on Asynchronous +``yield from``). + +The core question was whether sync and async generator expressions +should use ``yield from`` (or an equivalent) when unpacking, as opposed to an +explicit loop. The main difference between these options is whether the +resulting generator delegates to the objects being unpacked, which would affect +the behavior of these generator expressions when used with +``.send()/.asend()``, ``.throw()/.athrow()``, and ``.close()/.aclose()`` in the +case where the objects being unpacked are themselves generators. The +differences between these options are summarized in +:ref:`pep798-appendix-yieldfrom`. + +Several reasonable options were considered, none of which was a clear winner in +a `poll in the Discourse thread +`__. +Beyond the proposal outlined above, the following were also considered: + +1. Using explicit loops for both synchronous and asynchronous generator + expressions. + + This strategy would have resulted in a symmetry between synchronous and + asynchronous generator expressions but would have prevented a + potentially useful tool by disallowing delegation in the case of synchronous + generator expressions. One specific concern with the approach proposed in + this PEP is that it introduces an asymmetry between synchronous and + asynchronous generators, but this concern is mitigated by the fact that such + asymmetries already exist between synchronous and asynchronous generators + more generally. + +2. 
Using ``yield from`` for unpacking in synchronous generator expressions and + mimicking the behavior of ``yield from`` for unpacking in async generator + expressions. + + This strategy would also make unpacking in synchronous and asynchronous + generators behave symmetrically, but it would also be more complex, enough + so that the cost may not be worth the benefit. As such, this PEP proposes + that asynchronous generator expressions using the unpacking operator should + not use semantics similar to ``yield from`` until ``yield from`` is + supported in asynchronous generators more generally. + +3. Using ``yield from`` for unpacking in synchronous generator expressions, and + disallowing unpacking in asynchronous generator expressions until they + support ``yield from``. + + This strategy could possibly reduce friction if asynchronous generator + expressions do gain support for ``yield from`` in the future by making sure + that any decision made at that point would be fully backwards-compatible. + But the utility of unpacking in that context seems to outweigh the potential + downside of a minimally invasive backwards-incompatible change in the future + if async generator expressions do receive support for ``yield from``. + +4. Disallowing unpacking in all generator expressions. + + This would retain symmetry between the two cases, but with the downside of + losing a very expressive form. + + +Each of these options (including the one presented in this PEP) has its +benefits and drawbacks, with no option being clearly superior on all fronts. +The semantics proposed in :ref:`pep798-genexpsemantics` represent a reasonable +compromise where unpacking in both synchronous and asynchronous generator +expressions mirrors common ways of writing equivalent generators currently. +Moreover, these subtle differences are unlikely to be impactful for common use +cases (for example, there is no difference for what is likely the most common +use case of combining simple collections). 
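As a rough illustration of the last point, the sketch below hand-writes the two equivalents described in the specification (``yield from`` for the synchronous form, an explicit loop for the asynchronous form) and applies both to plain lists. The names ``sync_flatten``, ``async_source``, ``async_flatten``, and ``collect`` are hypothetical helpers introduced only for this example; they are not part of the proposal.

```python
import asyncio

def sync_flatten(its):
    # hand-written stand-in for (*it for it in its)
    for it in its:
        yield from it

async def async_source(its):
    # hypothetical async iterable that produces each collection in turn
    for it in its:
        yield it

async def async_flatten(aits):
    # hand-written stand-in for (*it async for it in aits)
    async for it in aits:
        for item in it:
            yield item

async def collect(agen):
    # gather every value from an async generator into a list
    return [item async for item in agen]

data = [[1, 2], [], [3, 4]]
sync_result = list(sync_flatten(data))
async_result = asyncio.run(collect(async_flatten(async_source(data))))
print(sync_result == async_result)  # prints True
```

Because the unpacked objects here are plain lists, there is nothing to delegate to, so the choice between ``yield from`` and an explicit loop is not observable.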
+ +As suggested above, this decision should be revisited in the event that +asynchronous generators receive support for ``yield from`` in the future, in +which case adjusting the semantics of unpacking in async generator expressions +to use ``yield from`` should be considered. + + Concerns and Disadvantages ========================== @@ -722,7 +831,7 @@ were raised as well. This section aims to summarize those concerns. Complex uses of unpacking in comprehensions could obscure logic that would be clearer in an explicit loop. While this is already a concern with comprehensions more generally, the addition of ``*`` and ``**`` may make - particularly-complex uses even more difficult to read and understand at a + particularly complex uses even more difficult to read and understand at a glance. For example, while these situations are likely rare, comprehensions that use unpacking in multiple ways can make it difficult to know what's being unpacked and when: ``f(*(*x for *x, _ in list_of_lists))``. @@ -737,8 +846,9 @@ were raised as well. This section aims to summarize those concerns. for maintainers of code formatters, linters, type checkers, etc., to make sure that the new syntax is supported. -Other Languages -=============== + +Appendix: Other Languages +========================= Quite a few other languages support this kind of flattening with syntax similar to what is already available in Python, but support for using unpacking syntax @@ -768,7 +878,7 @@ Many languages that support comprehensions support double loops: (for [xs [[1 2 3] [] [4 5]] x (concat xs xs)] x) Several other languages (even those without comprehensions) support these -operations via a built-in function/method to support flattening of nested +operations via a built-in function or method to support flattening of nested structures: .. code:: python @@ -778,7 +888,7 @@ structures: .. code:: javascript - // Javascript + // JavaScript [[1,2,3], [], [4,5]].flatMap(xs => [...xs, ...xs]) .. 
code:: haskell @@ -801,12 +911,157 @@ in Julia currently leads to a syntax error: As one counterexample, support for a similar syntax was recently added to `Civet `_. For example, the following is a valid comprehension in -Civet, making use of Javascript's ``...`` syntax for unpacking: +Civet, making use of JavaScript's ``...`` syntax for unpacking: .. code:: javascript for xs of [[1,2,3], [], [4,5]] then ...(xs++xs) +.. _pep798-appendix-yieldfrom: + +Appendix: Semantics of Generator Delegation +=========================================== + +One of the common questions about the semantics outlined above had to do with +the difference between using ``yield from`` when unpacking inside of a +generator expression, versus using an explicit loop. Because this is a +fairly advanced feature of generators, this appendix attempts to summarize some +of the key differences between generators that use ``yield from`` and those +that use explicit loops. + +Basic Behavior +-------------- + +For simple iteration over values, which we expect to be by far the most common +use of unpacking in generator expressions, both approaches produce identical +results:: + + def yield_from(iterables): + for iterable in iterables: + yield from iterable + + def explicit_loop(iterables): + for iterable in iterables: + for item in iterable: + yield item + + # Both produce the same sequence of values + x = list(yield_from([[1, 2], [3, 4]])) + y = list(explicit_loop([[1, 2], [3, 4]])) + print(x == y) # prints True + +Advanced Generator Protocol Differences +--------------------------------------- + +The differences become apparent when using the advanced generator protocol +methods ``.send()``, ``.throw()``, and ``.close()``, and when the sub-iterables +are themselves generators rather than simple sequences. In these cases, the +``yield from`` version forwards the associated signal to the sub-generator, ++but the version with the explicit loop does not. 
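The same distinction applies to the explicit-loop form that this PEP describes for asynchronous generator expressions. The following is a minimal sketch, assuming a hand-written async generator as a stand-in for ``(*g async for g in async_source())``; the names ``async_source``, ``explicit_loop_async``, and ``main`` are hypothetical and exist only for illustration.

```python
import asyncio

def sub_generator():
    x = yield "first"
    yield f"received: {x}"

async def async_source():
    # hypothetical async iterable that produces one generator to unpack
    yield sub_generator()

async def explicit_loop_async():
    # hand-written stand-in for (*g async for g in async_source())
    async for g in async_source():
        for item in g:
            yield item

async def main():
    agen = explicit_loop_async()
    first = await agen.__anext__()
    # The sent value is discarded at the outer ``yield item``; the for loop
    # then advances the sub-generator with next(), so it receives None.
    second = await agen.asend("hello")
    await agen.aclose()
    return first, second

result = asyncio.run(main())
print(result)  # prints ('first', 'received: None')
```

With an explicit loop, the value passed to ``.asend()`` never reaches the sub-generator, which mirrors the synchronous explicit-loop behavior.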
+ +Delegation with ``.send()`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^ +.. code:: python + + def sub_generator(): + x = yield "first" + yield f"received: {x}" + yield "last" + + def yield_from(): + yield from sub_generator() + + def explicit_loop(): + for item in sub_generator(): + yield item + + # With yield from, values are passed through to sub-generator + gen1 = yield_from() + print(next(gen1)) # prints "first" + print(gen1.send("hello")) # prints "received: hello" + print(next(gen1)) # prints "last" + + # With explicit loop, .send() affects the outer generator; values don't reach the sub-generator + gen2 = explicit_loop() + print(next(gen2)) # prints "first" + print(gen2.send("hello")) # prints "received: None" (sub-generator receives None instead of "hello") + print(next(gen2)) # prints "last" + +Exception Handling with ``.throw()`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + def sub_generator_with_exception_handling(): + try: + yield "first" + yield "second" + except ValueError as e: + yield f"caught: {e}" + + def yield_from(): + yield from sub_generator_with_exception_handling() + + def explicit_loop(): + for item in sub_generator_with_exception_handling(): + yield item + + # With yield from, exceptions are passed to sub-generator + gen1 = yield_from() + print(next(gen1)) # prints "first" + print(gen1.throw(ValueError("test"))) # prints "caught: test" + + # With explicit loop, exceptions affect the outer generator only + gen2 = explicit_loop() + print(next(gen2)) # prints "first" + print(gen2.throw(ValueError("test"))) # ValueError is raised; sub-generator doesn't see it + +Generator Cleanup with ``.close()`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code:: python + + # hold references to sub-generators so GC doesn't close the explicit loop version + references = [] + + def sub_generator_with_cleanup(): + try: + yield "first" + yield "second" + finally: + print("sub-generator received GeneratorExit") + + def yield_from(): + try: + g = sub_generator_with_cleanup() + references.append(g) + yield from g + finally: + print("outer generator received GeneratorExit") + + def explicit_loop(): + try: + g = sub_generator_with_cleanup() + references.append(g) + for item in g: + yield item + finally: + print("outer generator received GeneratorExit") + + # With yield from, GeneratorExit is passed through to sub-generator + gen1 = yield_from() + print(next(gen1)) # prints "first" + gen1.close() # closes sub-generator and then outer generator + + # With explicit loop, GeneratorExit goes to outer generator only + gen2 = explicit_loop() + print(next(gen2)) # prints "first" + gen2.close() # only closes outer generator + + print('program finished; GC will close the explicit loop subgenerator') + # second inner generator closes when GC closes it at the end + + References ==========