From 1170253793cd0f7b347f822b9631cb6fa86c7d0d Mon Sep 17 00:00:00 2001
From: Brett Cannon
Date: Tue, 10 Jun 2025 14:33:56 -0700
Subject: [PATCH 1/2] PEP 751: Updates based on feedback

Mostly simplifying and loosening the spec.
---
 peps/pep-0794.rst | 131 +++++++++++++++++++++++++---------------------
 1 file changed, 71 insertions(+), 60 deletions(-)

diff --git a/peps/pep-0794.rst b/peps/pep-0794.rst
index f7321be0d86..4c4336bdbce 100644
--- a/peps/pep-0794.rst
+++ b/peps/pep-0794.rst
@@ -15,10 +15,10 @@ Abstract
 
 This PEP proposes extending the core metadata specification for Python
 packaging to include a new, repeatable field named ``Import-Name`` to record
-the import names that a project owns once installed. A new key named
+the import names that a project owns/provides once installed. A new key named
 ``import-names`` will be added to the ``[project]`` table in
-``pyproject.toml``. This also leads to the introduction of core metadata
-version 2.5.
+``pyproject.toml`` for providing the values for the new core metadata field.
+This also leads to the introduction of core metadata version 2.5.
 
 
 Motivation
@@ -32,8 +32,8 @@ the right project to install when they know the import name or knowing what
 import names a project will provide once installed.
 
 As an example, a code editor may detect a user has an unsatisfied import in a
-selected virtual environment. But with no way to reliably gather the import
-names that various projects provide, the code editor cannot accurately
+selected virtual environment. But with no way to reliably know what import
+names various projects provide, the code editor cannot accurately
 provide a user with a list of potential projects to install to satisfy that
 import requirement (e.g. it is not obvious that ``import PIL`` very likely
 implies the user wants the `Pillow project
@@ -56,21 +56,32 @@ This PEP proposes extending the packaging :ref:`packaging:core-metadata` so
 that project owners can specify the highest-level import names that a project
 provides and owns if installed.
 
-By keeping the information to the import names a project would own (i.e. not
-implicit namespace packages but modules, regular packages, submodules, and
-subpackages in an explicit namespace package), it makes it clear which
-project maps directly to what import name once the project is installed.
-
-By keeping it to the highest-level name that's owned, it keeps the data small
-and allows for inferring implicit namespace packages that a project
-contributes to. This will hopefully encourage use when appropriate by not
-being a burden to provide appropriate information.
+By keeping the information to the import names a project would own if
+installed, it makes it clear which project maps directly to what import name
+once the project is installed.
 
 Putting this metadata in the core metadata means the data is (potentially)
 served independently of any sdist or wheel by an index server. That negates
 needing to come up with another way to expose the metadata to tools to avoid
 having to download an entire e.g. wheel.
 
+Having this metadata be the same across all release artifacts would allow for
+projects to only have to check a single file's core metadata to get all
+possible import names instead of checking all the released files. This also
+means one does not need to worry if a file is missing when reading the core
+metadata or one can work solely from an sdist if the metadata is provided. As
+well, it simplifies having a ``project.import-names`` key in
+``pyproject.toml`` by having it be consistent for the entire project version
+and not unique per released file for the same version.
+
+This PEP is not overly strict on what to (not) list in the proposed metadata on
+purpose. Having build back-ends verify that a project is accurately following
+a specification that is somehow strict about what can be listed would be near
+impossible to get right due to how flexible Python's import system is. As such,
+this PEP only requires that valid import names be used and that projects don't
+lie (and it is acknowledged the latter requirements cannot be validated
+programmatically).
+
 Various other attempts have been made to solve this, but they all have to
 make various trade-offs. For instance, one could download every wheel for
 every project release and look at what files are provided via the
@@ -79,9 +90,9 @@ bandwidth for something that is static information (although tricks can be
 used to lessen the data requests such as using HTTP range requests to only
 read the table of contents of the zip file). This sort of calculation is also
 currently repeated by everyone independently instead of having the metadata
-hosted by a central index server like PyPI. It also doesn't work for sdists as
-the structure of the wheel isn't known yet, and so inferring the structure of
-the code installed isn't known yet. As well, these solutions are not
+hosted by a central index server like PyPI. It also doesn't work for sdists
+as the structure of the wheel isn't known yet, and so inferring the structure
+of the code installed isn't known yet. As well, these solutions are not
 necessarily accurate as it is based on inference instead of being explicitly
 provided by the project owners.
 
@@ -93,68 +104,54 @@ Because this PEP introduces a new field to the core metadata, it bumps the
 latest core metadata version to 2.5.
 
 The ``Import-Name`` field is a "multiple uses" field. Each entry of
-``Import-Name`` represents an importable name that the project provides. The
-names provided MUST be importable via *some* artifact the project provides
-for that version, i.e. the metadata MUST be consistent across all sdists and
-wheels for a project release to avoid having to read every file to find
-variances. It also avoids having to declare this field as dynamic in an
-sdist due to the import names varying across wheels. This does imply that the
-information isn't specific to the distribution artifact it is found in, but
-for the release version the distribution artifact belongs to.
-
-The names provided MUST be one of the following:
-
-- Highest-level, regular packages
-- Top-level modules
-- The submodules and regular subpackages within implicit namespace packages
-
-provided by the project. This makes the vast majority of projects only
-needing a single ``Import-Name`` entry which represents the top-level,
-regular package the project provides. But it also allows for implicit
-namespace packages to be able to differentiate among themselves (e.g., it
-avoids having all projects contributing to the ``azure`` namespace via an
-implicit namespace package all having ``azure`` as their entry for
-``Import-Name``, but instead a more accurate entry like
-``azure.mgmt.search``)
+``Import-Name`` MUST be a valid import name. The names specified in
+``Import-Name`` MUST be importable when the project is installed on *some*
+platform for the same version of the project (i.e. the metadata MUST be
+consistent across all sdists and wheels for a project release). This does
+imply that the information isn't specific to the distribution artifact it is
+found in, but for the release version the distribution artifact belongs to.
+
+Projects are not expected to list every single import name that is provided.
+Instead, projects SHOULD list the highest-level/shortest import name that the
+project would "own" when installed. For example, if you install a project
+that has a single package named ``myproj`` which itself has multiple
+submodules, the expectation is only ``myproj`` would be listed in
+``Import-Name`` and not every submodule. If a project is part of a namespace
+package named ``ns`` and it provides a subpackage called ``ns.myproj`` (i.e.
+``ns.myproj.__init__`` exists), then ``ns.myproj`` should be listed in
+``Import-Name``, but NOT ``ns`` alone as that is not "owned" by the project
+upon installation (i.e. other projects can be installed which also contribute to
+``ns``).
 
 If a project chooses not to provide any ``Import-Name`` entries, tools MAY
-assume the import name matches the project name.
-
-Project owners MUST specify accurate information when provided and SHOULD be
-exhaustive in what they provide. Project owners SHOULD NOT filter out names
-that they consider private. This is because even "private" names can be
-imported by anyone and can "take up space" in the namespace of the
-environment. Tools consuming the metadata SHOULD consider the information
-provided in ``Import-Name`` as accurate, but not exhaustive.
+assume the import name matches the project name (including de-normalization of
+the project name, e.g. ``my-proj`` as ``my_proj``).
 
 The :ref:`declaring-project-metadata` will gain an ``import-names`` key. It
 will be an array of strings that stores what will be written out to
 ``Import-Name``. Build back-ends MAY support dynamically calculating the
-value on the user's behalf if desired, if the user declares the key to be
-dynamic.
+value on the user's behalf if desired, if the user declares the key in
+``project.dynamic``.
 
 
 Examples
 --------
 
 `In httpx 0.28.1
-`__
-there would be only a single entry for the ``httpx`` package as it's a
-regular package and there are no other regular packages or modules at the top
-of the project.
+`__ ,
+an entry for the ``httpx`` package.
 
 `In pytest 8.3.5
 `__
-there would be 3 entries:
+there would be 3 expected entries:
 
-1. ``_pytest`` (a top-level, regular package)
-2. ``py`` (a top-level module)
-3. ``pytest`` (a top-level, regular package)
+1. ``_pytest``
+2. ``py``
+3. ``pytest``
 
 In `azure-mgmt-search 9.1.0
 `__,
-there would be a single entry for ``azure.mgmt.search`` as ``azure`` and
-``azure.mgmt`` are implicit namespace packages.
+there should be a single entry for ``azure.mgmt.search``.
 
 
 Backwards Compatibility
@@ -236,6 +233,19 @@ In the end a `poll
 held and the approach this PEP takes won out.
 
 
+Be more prescriptive in what projects specify
+---------------------------------------------
+
+An earlier version of this PEP was much more strict in what could be put into
+``Import-Name``. This included turning some "SHOULD" guidelines into "MUST"
+requirements and being specific about how to calculate what a project "owned".
+In the end it was decided that was too restrictive and risked being implemented
+incorrectly or the spec being unexpectedly too strict.
+
+Since the metadata was never expected to be exhaustive as it can't be verified
+to be, the looser spec that is currently in this PEP was chosen instead.
+
+
 Open Issues
 ===========
 
@@ -258,3 +268,4 @@ Copyright
 
 This document is placed in the public domain or under the
 CC0-1.0-Universal license, whichever is more permissive.
+CC0-1.0-Universal license, whichever is more permissive.

From 463fa2105936b416fb637ba3e0703ab0d699db63 Mon Sep 17 00:00:00 2001
From: Brett Cannon
Date: Tue, 10 Jun 2025 14:36:13 -0700
Subject: [PATCH 2/2] Update CODEOWNERS

---
 .github/CODEOWNERS | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index daadb3c6102..e000d3934b1 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -666,13 +666,12 @@ peps/pep-0785.rst @gpshead
 # ...
 peps/pep-0787.rst @ncoghlan
 peps/pep-0788.rst @ZeroIntensity @vstinner
-# ...
 peps/pep-0789.rst @njsmith
 peps/pep-0790.rst @hugovk
 peps/pep-0791.rst @vstinner
 peps/pep-0792.rst @dstufft
-# ...
 peps/pep-0793.rst @encukou
+peps/pep-0794.rst @brettcannon
 # ...
 peps/pep-0801.rst @warsaw
 # ...