From e893293b4e6a7db3373c2d46a7d2a5318d1b914e Mon Sep 17 00:00:00 2001 From: Pablo Galindo Salgado Date: Thu, 9 Oct 2025 01:08:05 +0100 Subject: [PATCH] PEP 810: Updates from discussion feedback --- peps/pep-0810.rst | 183 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 138 insertions(+), 45 deletions(-) diff --git a/peps/pep-0810.rst b/peps/pep-0810.rst index 1d4df76210a..718dc9b455c 100644 --- a/peps/pep-0810.rst +++ b/peps/pep-0810.rst @@ -155,12 +155,15 @@ the brainstorming we have already completed in preparation for this proposal as reference. The choice to introduce a new ``lazy`` keyword reflects the need for explicit -syntax. Import behavior is too fundamental to be left implicit or hidden -behind global flags or environment variables. By marking laziness directly at -the import site, the intent is immediately visible to both readers and tools. -This avoids surprises, reduces the cognitive burden of reasoning about -imports, and keeps lazy import semantics in line with Python's tradition of -explicitness. +syntax. Lazy imports have different semantics from normal imports: errors and +side effects occur at first use rather than at the import statement. This +semantic difference makes it critical that laziness is visible at the import +site itself, not hidden in global configuration or distant module-level +declarations. The ``lazy`` keyword provides local reasoning about import +behavior, avoiding the need to search elsewhere in the code to understand +whether an import is deferred. The rest of the import semantics remain +unchanged: the same import machinery, module finding, and loading mechanisms +are used. Another important decision is to represent lazy imports with proxy objects in the module's namespace, rather than by modifying dictionary lookup. Earlier @@ -280,13 +283,19 @@ Examples of syntax errors: Semantics --------- -When the ``lazy`` keyword is used, the import becomes *potentially lazy*. 
-Unless lazy imports are disabled or suppressed (see below), the module is not +When the ``lazy`` keyword is used, the import becomes *potentially lazy* +(see `Lazy imports filter`_ for advanced override mechanisms). The module is not loaded immediately at the import statement; instead, a lazy proxy object is created and bound to the name. The actual module is loaded on first use of that name. -Example: +When using ``lazy from ... import``, **each imported name** is bound to a lazy +proxy object. The first access to **any** of these names triggers loading of +the entire module and reifies **only that specific name** to its actual value. +Other names remain as lazy proxies until they are accessed. The interpreter's +adaptive specialization will optimize away the lazy checks after a few accesses. + +Example with ``lazy import``: .. code-block:: python @@ -301,6 +310,24 @@ Example: print('json' in sys.modules) # True - now loaded +Example with ``lazy from ... import``: + +.. code-block:: python + + import sys + + lazy from json import dumps, loads + + print('json' in sys.modules) # False - module not loaded yet + + # First use of 'dumps' triggers loading json and reifies ONLY 'dumps' + result = dumps({"hello": "world"}) + + print('json' in sys.modules) # True - module now loaded + + # Accessing 'loads' now reifies it (json already loaded, no re-import) + data = loads(result) + A module may contain a :data:`!__lazy_modules__` attribute, which is a sequence of fully qualified module names (strings) to make *potentially lazy* (as if the ``lazy`` keyword was used). This attribute is checked on each @@ -327,13 +354,12 @@ import is ever imported lazily, and the behavior is equivalent to a regular import statement: the import is *eager* (as if the lazy keyword was not used). Finally, the application may use a custom filter function on all *potentially -lazy* imports to determine if they should be lazy or not. 
-If a filter function is set, it will be called with the name of the module -doing the import, the name of the module being imported, and (if applicable) -the fromlist. -An import remains lazy only if the filter function returns ``True``. - -If no lazy import filter is set, all *potentially lazy* imports are lazy. +lazy* imports to determine if they should be lazy or not (this is an advanced +feature, see `Lazy imports filter`_). If a filter function is set, it will be +called with the name of the module doing the import, the name of the module +being imported, and (if applicable) the fromlist. An import remains lazy only +if the filter function returns ``True``. If no lazy import filter is set, all +*potentially lazy* imports are lazy. Lazy objects ------------ @@ -586,19 +612,24 @@ After several calls, ``LOAD_GLOBAL`` specializes to ``LOAD_GLOBAL_MODULE``: Lazy imports filter ------------------- -This PEP adds the following new functions to manage the lazy imports filter: +*Note: This is an advanced feature. Library developers should NOT call these +functions. These are intended for specialized/advanced users who need +fine-grained control over lazy import behavior when using the global flags.* + +This PEP adds the following new functions to the ``sys`` module to manage the +lazy imports filter: -* ``importlib.set_lazy_imports_filter(func)`` - Sets the filter function. If +* ``sys.set_lazy_imports_filter(func)`` - Sets the filter function. If ``func=None`` then the import filter is removed. The ``func`` parameter must have the signature: ``func(importer: str, name: str, fromlist: tuple[str, ...] | None) -> bool`` -* ``importlib.get_lazy_imports_filter()`` - Returns the currently installed +* ``sys.get_lazy_imports_filter()`` - Returns the currently installed filter function, or ``None`` if no filter is set. -* ``importlib.set_lazy_imports(enabled=None, /)`` - Programmatic API for - controlling lazy imports at runtime. 
The ``enabled`` parameter can be - ``None`` (respect ``lazy`` keyword only), ``True`` (force all imports to be - potentially lazy), or ``False`` (force all imports to be eager). +* ``sys.set_lazy_imports(mode, /)`` - Programmatic API for + controlling lazy imports at runtime. The ``mode`` parameter can be + ``"normal"`` (respect ``lazy`` keyword only), ``"all"`` (force all imports to be + potentially lazy), or ``"none"`` (force all imports to be eager). The filter function is called for every potentially lazy import, and must return ``True`` if the import should be lazy. This allows for fine-grained @@ -646,7 +677,7 @@ Example: return True # Allow lazy import # Install the filter - importlib.set_lazy_imports_filter(exclude_side_effect_modules) + sys.set_lazy_imports_filter(exclude_side_effect_modules) # These imports are checked by the filter lazy import data_processor # Filter returns True -> stays lazy @@ -662,11 +693,15 @@ Example: Global lazy imports control ---------------------------- +*Note: This is an advanced feature. Library developers should NOT use the global +activation mechanism. This is intended for application developers and framework +authors who need to control lazy imports across their entire application.* + The global lazy imports flag can be controlled through: * The ``-X lazy_imports=<mode>`` command-line option * The ``PYTHON_LAZY_IMPORTS=<mode>`` environment variable -* The ``importlib.set_lazy_imports(mode)`` function (primarily for testing) +* The ``sys.set_lazy_imports(mode)`` function (primarily for testing) Where ``<mode>`` can be: @@ -687,12 +722,10 @@ lazy* import is ever imported lazily, the import filter is never called, and the behavior is equivalent to a regular ``import`` statement: the import is *eager* (as if the lazy keyword was not used). 
-Python code can run the :func:`!importlib.set_lazy_imports` function to override +Python code can run the :func:`!sys.set_lazy_imports` function to override the state of the global lazy imports flag inherited from the environment or CLI. This is especially useful if an application needs to ensure that all imports -are evaluated eagerly, via ``importlib.set_lazy_imports('none')``. -Alternatively, :func:`!importlib.set_lazy_imports` can be used with boolean -values for programmatic control. +are evaluated eagerly, via ``sys.set_lazy_imports("none")``. Backwards Compatibility @@ -790,7 +823,7 @@ The `pyperformance suite`_ confirms the implementation is performance-neutral. Filter function performance ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The filter function (set via ``importlib.set_lazy_imports_filter()``) is called for +The filter function (set via ``sys.set_lazy_imports_filter()``) is called for every *potentially lazy* import to determine whether it should actually be lazy. When no filter is set, this is simply a NULL check (testing whether a filter function has been registered), which is a highly predictable branch that @@ -914,8 +947,8 @@ Security Implications There are no known security vulnerabilities introduced by lazy imports. Security-sensitive tools that need to ensure all imports are evaluated eagerly -can use :func:`!importlib.set_lazy_imports` with ``enabled=False`` to force -eager evaluation, or use :func:`!importlib.set_lazy_imports_filter` for fine-grained +can use :func:`!sys.set_lazy_imports` with ``"none"`` to force +eager evaluation, or use :func:`!sys.set_lazy_imports_filter` for fine-grained control. How to Teach This @@ -1020,6 +1053,31 @@ global approach. The key differences are: - **Simpler implementation**: Uses proxy objects instead of modifying core dictionary behavior. +What changes at reification time? What stays the same? 
+------------------------------------------------------ + +**What changes (the timing):** + +* **When** the module is imported - deferred to first use instead of at the + import statement +* **When** import errors occur - at first use rather than at import time +* **When** module-level side effects execute - at first use rather than at + import time + +**What stays the same (everything else):** + +* The import machinery used - same ``__import__``, same hooks, same loaders +* The module object created - identical to an eagerly imported module +* Import state consulted - ``sys.path``, ``sys.meta_path``, etc. at + **reification time** (not at import statement time) +* Module attributes and behavior - completely indistinguishable after + reification +* Thread safety - same import lock discipline as normal imports + +In other words: lazy imports only change **when** something happens, not +**what** happens. After reification, a lazy-imported module is +indistinguishable from an eagerly imported one. + What happens when lazy imports encounter errors? ------------------------------------------------ @@ -1269,7 +1327,7 @@ exclude specific modules that are known to have problematic side effects: return False # Import eagerly return True # Allow lazy import - importlib.set_lazy_imports_filter(my_filter) + sys.set_lazy_imports_filter(my_filter) The filter function receives the importer module name, the module being imported, and the fromlist (if using ``from ... import``). Returning ``False`` @@ -1638,18 +1696,36 @@ Making ``lazy`` imports find the module without loading it The Python ``import`` machinery separates out finding a module and loading it, and the lazy import implementation could technically defer only the -loading part. However: - -- Finding the module does not guarantee the import will succeed, nor even - that it will not raise ImportError. 
-- Finding modules in packages requires that those packages are loaded, so - it would only help with lazy loading one level of a package hierarchy. -- Since "finding" attributes in modules *requires* loading them, this would - create a hard to explain difference between - ``from package import module`` and ``from module import function``. -- A significant part of the performance win is skipping the finding part - (which may involve filesystem searches and consulting multiple importers - and meta-importers). +loading part. However, this approach was rejected for several critical reasons. + +A significant part of the performance win comes from skipping the finding phase. +The issue is particularly acute on NFS-backed filesystems and distributed +storage, where each ``stat()`` call incurs network latency. In these kinds of +environments, ``stat()`` calls can take tens to hundreds of milliseconds +depending on network conditions. With dozens of imports each doing multiple +filesystem checks traversing ``sys.path``, the time spent finding modules +before executing any Python code can become substantial. In some measurements, +spec finding accounts for the majority of total import time. Skipping only the +loading phase would leave most of the performance problem unsolved. + +More critically, separating finding from loading creates the worst of both +worlds for error handling. Some exceptions from the import machinery (e.g., +``ImportError`` from a missing module, path resolution failures, +``ModuleNotFoundError``) would be raised at the ``lazy import`` statement, while +others (e.g., ``SyntaxError``, ``ImportError`` from circular imports, attribute +errors from ``from module import name``) would be raised later at first use. +This split is both confusing and unpredictable: developers would need to +understand the internal import machinery to know which errors happen when. 
The +current design is simpler: with full lazy imports, all import-related errors +occur at first use, making the behavior consistent and predictable. + +Additionally, there are technical limitations: finding the module does not +guarantee the import will succeed, nor even that it will not raise ImportError. +Finding modules in packages requires that those packages are loaded, so it +would only help with lazy loading one level of a package hierarchy. Since +"finding" attributes in modules *requires* loading them, this would create a +hard to explain difference between ``from package import module`` and +``from module import function``. Placing the ``lazy`` keyword in the middle of from imports ---------------------------------------------------------- @@ -1706,6 +1782,23 @@ to add more specific re-enabling mechanisms later, when we have a clearer picture of real-world use and patterns, than it is to remove a hastily added mechanism that isn't quite right. +Using underscore-prefixed names for advanced features +------------------------------------------------------ + +The global activation and filter functions (``sys.set_lazy_imports``, +``sys.set_lazy_imports_filter``, ``sys.get_lazy_imports_filter``) could be +marked as "private" or "advanced" by using underscore prefixes (e.g., +``sys._set_lazy_imports_filter``). This was rejected because branding as +advanced features through documentation is sufficient. These functions have +legitimate use cases for advanced users, particularly operators of large +deployments. Providing an official mechanism prevents divergence from upstream +CPython. The global mode is intentionally documented as an advanced feature for +operators running huge fleets, not for day-to-day users or libraries. 
Python +has precedent for advanced features that remain public APIs without underscore +prefixes - for example, ``gc.disable()``, ``gc.get_objects()``, and +``gc.set_threshold()`` are advanced features that can cause issues if misused, +yet they are not underscore-prefixed. + Using a decorator syntax for lazy imports ------------------------------------------