3 changes: 1 addition & 2 deletions .github/CODEOWNERS
@@ -666,13 +666,12 @@ peps/pep-0785.rst @gpshead
# ...
peps/pep-0787.rst @ncoghlan
peps/pep-0788.rst @ZeroIntensity @vstinner
# ...
peps/pep-0789.rst @njsmith
peps/pep-0790.rst @hugovk
peps/pep-0791.rst @vstinner
peps/pep-0792.rst @dstufft
# ...
peps/pep-0793.rst @encukou
peps/pep-0794.rst @brettcannon
# ...
peps/pep-0801.rst @warsaw
# ...
131 changes: 71 additions & 60 deletions peps/pep-0794.rst
@@ -15,10 +15,10 @@ Abstract

This PEP proposes extending the core metadata specification for Python
packaging to include a new, repeatable field named ``Import-Name`` to record
the import names that a project owns once installed. A new key named
the import names that a project owns/provides once installed. A new key named
``import-names`` will be added to the ``[project]`` table in
``pyproject.toml``. This also leads to the introduction of core metadata
version 2.5.
``pyproject.toml`` for providing the values for the new core metadata field.
This also leads to the introduction of core metadata version 2.5.


Motivation
@@ -32,8 +32,8 @@ the right project to install when they know the import name or knowing what
import names a project will provide once installed.

As an example, a code editor may detect a user has an unsatisfied import in a
selected virtual environment. But with no way to reliably gather the import
names that various projects provide, the code editor cannot accurately
selected virtual environment. But with no way to reliably know what import
names various projects provide, the code editor cannot accurately
provide a user with a list of potential projects to install to satisfy that
import requirement (e.g. it is not obvious that ``import PIL`` very likely
implies the user wants the `Pillow project
@@ -56,21 +56,32 @@ This PEP proposes extending the packaging :ref:`packaging:core-metadata` so
that project owners can specify the highest-level import names that a project
provides and owns if installed.

By keeping the information to the import names a project would own (i.e. not
implicit namespace packages but modules, regular packages, submodules, and
subpackages in an explicit namespace package), it makes it clear which
project maps directly to what import name once the project is installed.

By keeping it to the highest-level name that's owned, it keeps the data small
and allows for inferring implicit namespace packages that a project
contributes to. This will hopefully encourage use when appropriate by not
being a burden to provide appropriate information.
Limiting the information to the import names a project would own if
installed makes it clear which project maps directly to which import name.

Putting this metadata in the core metadata means the data is (potentially)
served independently of any sdist or wheel by an index server. That removes
the need for another way to expose the metadata to tools that want to avoid
downloading an entire artifact (e.g. a wheel).

Having this metadata be the same across all release artifacts would allow
tools to check only a single file's core metadata to get all possible import
names instead of checking every released file. It also means a tool does not
need to worry about a file being missing when reading the core metadata, and
can work solely from an sdist if the metadata is provided. As well, it
simplifies having a ``project.import-names`` key in ``pyproject.toml`` by
making it consistent for the entire project version rather than unique per
released file for the same version.

This PEP is deliberately not overly strict about what to list (or not list)
in the proposed metadata. Having build back-ends verify that a project
accurately follows a strict specification of what can be listed would be near
impossible to get right due to how flexible Python's import system is. As such,
this PEP only requires that valid import names be used and that projects don't
lie (and it is acknowledged that the latter requirement cannot be validated
programmatically).

Various other attempts have been made to solve this, but they all have to
make various trade-offs. For instance, one could download every wheel for
every project release and look at what files are provided via the
@@ -79,9 +90,9 @@ bandwidth for something that is static information (although tricks can be
used to lessen the data requests such as using HTTP range requests to only
read the table of contents of the zip file). This sort of calculation is also
currently repeated by everyone independently instead of having the metadata
hosted by a central index server like PyPI. It also doesn't work for sdists as
the structure of the wheel isn't known yet, and so inferring the structure of
the code installed isn't known yet. As well, these solutions are not
hosted by a central index server like PyPI. It also doesn't work for sdists
as the structure of the wheel isn't known yet, and so the structure of the
installed code can't be inferred. As well, these solutions are not
necessarily accurate as they are based on inference instead of being explicitly
provided by the project owners.

@@ -93,68 +104,54 @@ Because this PEP introduces a new field to the core metadata, it bumps the
latest core metadata version to 2.5.

The ``Import-Name`` field is a "multiple uses" field. Each entry of
``Import-Name`` represents an importable name that the project provides. The
names provided MUST be importable via *some* artifact the project provides
for that version, i.e. the metadata MUST be consistent across all sdists and
wheels for a project release to avoid having to read every file to find
variances. It also avoids having to declare this field as dynamic in an
sdist due to the import names varying across wheels. This does imply that the
information isn't specific to the distribution artifact it is found in, but
for the release version the distribution artifact belongs to.

The names provided MUST be one of the following:

- Highest-level, regular packages
- Top-level modules
- The submodules and regular subpackages within implicit namespace packages

provided by the project. This makes the vast majority of projects only
needing a single ``Import-Name`` entry which represents the top-level,
regular package the project provides. But it also allows for implicit
namespace packages to be able to differentiate among themselves (e.g., it
avoids having all projects contributing to the ``azure`` namespace via an
implicit namespace package all having ``azure`` as their entry for
``Import-Name``, but instead a more accurate entry like
``azure.mgmt.search``)
``Import-Name`` MUST be a valid import name. The names specified in
``Import-Name`` MUST be importable when the project is installed on *some*
platform for the same version of the project (i.e. the metadata MUST be
consistent across all sdists and wheels for a project release). This does
imply that the information isn't specific to the distribution artifact it is
found in, but for the release version the distribution artifact belongs to.
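
As a minimal sketch of what "a valid import name" could mean in practice, the
check below treats a name as valid when every dot-separated segment is a
Python identifier and not a reserved keyword. The helper name
``is_valid_import_name`` is hypothetical, and this interpretation is an
assumption rather than a definition given by the PEP.

```python
import keyword


def is_valid_import_name(name: str) -> bool:
    """Return True if every dotted segment could appear in an import statement.

    Assumption: "valid" means each segment is an identifier and not a
    reserved keyword. The PEP text itself does not spell this out.
    """
    parts = name.split(".")
    return all(part.isidentifier() and not keyword.iskeyword(part) for part in parts)
```

Under this interpretation, ``azure.mgmt.search`` passes, while ``my-proj``
(contains a hyphen) and ``a..b`` (empty segment) do not.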

Projects are not expected to list every single import name that is provided.
Instead, projects SHOULD list the highest-level/shortest import name that the
project would "own" when installed. For example, if you install a project
that has a single package named ``myproj`` which itself has multiple
submodules, the expectation is only ``myproj`` would be listed in
``Import-Name`` and not every submodule. If a project is part of a namespace
package named ``ns`` and it provides a subpackage called ``ns.myproj`` (i.e.
``ns.myproj.__init__`` exists), then ``ns.myproj`` should be listed in
``Import-Name``, but NOT ``ns`` alone as that is not "owned" by the project
upon installation (i.e. other projects can be installed which also contribute to
``ns``).
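
The "highest-level name that is owned" idea can be illustrated with a rough
heuristic over a list of installed file paths. This is only a sketch under
stated assumptions: the function name ``owned_import_names`` is hypothetical,
only ``.py`` files are considered (extension modules are ignored), and a
directory counts as a regular package only if it contains ``__init__.py``.
The PEP does not require any particular calculation.

```python
from pathlib import PurePosixPath


def owned_import_names(files: list[str]) -> list[str]:
    """Guess the highest-level import names a project owns.

    Heuristic only: relies on the presence of __init__.py to tell regular
    packages apart from implicit namespace packages.
    """
    paths = [PurePosixPath(f) for f in files]
    # Directories holding an __init__.py are treated as regular packages.
    regular = {p.parent for p in paths if p.name == "__init__.py"}
    owned = set()
    for p in paths:
        if p.suffix != ".py":
            continue  # skip metadata and data files
        parts = p.parts
        # Walk down from the top and stop at the first regular package.
        for depth in range(1, len(parts)):
            if PurePosixPath(*parts[:depth]) in regular:
                owned.add(".".join(parts[:depth]))
                break
        else:
            # No enclosing regular package: a top-level module, or a module
            # sitting directly inside implicit namespace packages.
            if p.name != "__init__.py":
                owned.add(".".join(p.with_suffix("").parts))
    return sorted(owned)
```

For a project shipping ``ns/myproj/__init__.py`` and ``ns/myproj/core.py``,
this sketch reports ``ns.myproj`` and not ``ns``, matching the expectation
described above.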

If a project chooses not to provide any ``Import-Name`` entries, tools MAY
assume the import name matches the project name.

Project owners MUST specify accurate information when provided and SHOULD be
exhaustive in what they provide. Project owners SHOULD NOT filter out names
that they consider private. This is because even "private" names can be
imported by anyone and can "take up space" in the namespace of the
environment. Tools consuming the metadata SHOULD consider the information
provided in ``Import-Name`` as accurate, but not exhaustive.
assume the import name matches the project name (including de-normalization of
the project name, e.g. ``my-proj`` as ``my_proj``).
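
A tool consuming this metadata might behave roughly as sketched below: read
every ``Import-Name`` entry from the core metadata and, if there are none,
fall back to the project name. The function name ``import_names`` is
hypothetical, and the de-normalization shown (replacing ``-`` with ``_``) is
an assumption based only on the ``my-proj`` example; the standard library's
``email`` parser is used because core metadata uses email-style headers.

```python
from email.parser import HeaderParser


def import_names(metadata_text: str) -> list[str]:
    """Return the Import-Name entries, or a guess based on the project name."""
    msg = HeaderParser().parsestr(metadata_text)
    names = msg.get_all("Import-Name")  # None when the field is absent
    if names:
        return names
    # Fallback that tools MAY apply: assume the import name matches the
    # project name. Replacing "-" with "_" is an illustrative assumption.
    return [msg["Name"].replace("-", "_")]
```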

The :ref:`declaring-project-metadata` will gain an ``import-names`` key. It
will be an array of strings that stores what will be written out to
``Import-Name``. Build back-ends MAY support dynamically calculating the
value on the user's behalf if desired, if the user declares the key to be
dynamic.
value on the user's behalf if the user lists the key in
``project.dynamic``.


Examples
--------

`In httpx 0.28.1
<https://pypi-browser.org/package/httpx/httpx-0.28.1-py3-none-any.whl>`__
there would be only a single entry for the ``httpx`` package as it's a
regular package and there are no other regular packages or modules at the top
of the project.
<https://pypi-browser.org/package/httpx/httpx-0.28.1-py3-none-any.whl>`__
there would be a single expected entry for the ``httpx`` package.

`In pytest 8.3.5
<https://pypi-browser.org/package/pytest/pytest-8.3.5-py3-none-any.whl>`__
there would be 3 entries:
there would be 3 expected entries:

1. ``_pytest`` (a top-level, regular package)
2. ``py`` (a top-level module)
3. ``pytest`` (a top-level, regular package)
1. ``_pytest``
2. ``py``
3. ``pytest``

In `azure-mgmt-search 9.1.0
<https://pypi-browser.org/package/azure-mgmt-search/azure_mgmt_search-9.1.0-py3-none-any.whl>`__,
there would be a single entry for ``azure.mgmt.search`` as ``azure`` and
``azure.mgmt`` are implicit namespace packages.
there should be a single entry for ``azure.mgmt.search``.


Backwards Compatibility
@@ -236,6 +233,19 @@ In the end a `poll
held and the approach this PEP takes won out.


Be more prescriptive in what projects specify
---------------------------------------------

An earlier version of this PEP was much more strict in what could be put into
``Import-Name``. This included turning some "SHOULD" guidelines into "MUST"
requirements and being specific about how to calculate what a project "owned".
In the end it was decided that this was too restrictive and risked being
implemented incorrectly or the spec being unexpectedly too strict.

Since the metadata was never expected to be exhaustive as it can't be verified
to be, the looser spec that is currently in this PEP was chosen instead.


Open Issues
===========

@@ -258,3 +268,4 @@ Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.