162 changes: 109 additions & 53 deletions peps/pep-0794.rst
@@ -14,11 +14,12 @@ Abstract
========

This PEP proposes extending the core metadata specification for Python
packaging to include a new, repeatable field named ``Import-Name`` to record
the import names that a project owns/provides once installed. A new key named
``import-names`` will be added to the ``[project]`` table in
``pyproject.toml`` for providing the values for the new core metadata field.
This also leads to the introduction of core metadata version 2.5.
packaging to include two new, repeatable fields named ``Import-Name`` and
``Import-Namespace`` to record the import names that a project provides once
installed. New keys named ``import-names`` and ``import-namespaces`` will be
added to the ``[project]`` table in ``pyproject.toml`` for providing the values
for the new core metadata fields. This also leads to the introduction of core
metadata version 2.5.


Motivation
@@ -43,6 +44,13 @@ name(s) and would have their memory jogged when seeing a list of import names
a package provides. Finally, tools would be able to notify users what import
names will become available once they install a project.

There is also no easy way to know whether installing two projects will conflict
with one another based on the import names they provide. For instance, if two
different projects have a ``_utils`` module, installing both projects will lead
to a clash as one project's ``_utils`` module would take precedence over the
other project's version. This issue has been
`seen in the wild <https://github.com/astral-sh/uv/pull/13437>`__.

It may also help with spam detection. If a project specifies the same import
names as a very popular project it can act as a signal to take a closer look
at the validity of the less popular project. A project found to be lying
@@ -54,29 +62,31 @@ Rationale

This PEP proposes extending the packaging :ref:`packaging:core-metadata` so
that project owners can specify the highest-level import names that a project
provides and owns if installed on some platform.
provides if installed on some platform.

Putting this metadata in the core metadata means the data is (potentially)
served independently of any sdist or wheel by an index server. That negates
needing to come up with another way to expose the metadata to tools to avoid
served by an index server, independent of any sdist or wheel. That negates
needing to come up with a way to expose the metadata to tools to avoid
having to download an entire e.g. wheel.

Having this metadata be the same across all release artifacts would allow for
projects to only have to check a single file's core metadata to get all
possible import names instead of checking all the released files. This also
means one does not need to worry if a file is missing when reading the core
metadata or one can work solely from an sdist if the metadata is provided. As
well, it simplifies having a ``project.import-names`` key in
``pyproject.toml`` by having it be consistent for the entire project version
and not unique per released file for the same version.
well, it simplifies having ``project.import-names`` and ``project.import-namespaces``
keys in ``pyproject.toml`` by having it be consistent for the entire project
version and not unique per released file for the same version.

This PEP is not overly strict on what to (not) list in the proposed metadata on
purpose. Having build back-ends verify that a project is accurately following
a specification that is somehow strict about what can be listed would be near
impossible to get right due to how flexible Python's import system is. As such,
this PEP only requires that valid import names be used and that projects don't
lie (and it is acknowledged the latter requirement cannot be validated
programmatically).
programmatically). Projects do, though, need to account for all levels of the
names they list (e.g. you can't list ``a.b.c`` and not account for ``a`` and
``a.b``).

Various other attempts have been made to solve this, but they all have to
make various trade-offs. For instance, one could download every wheel for
@@ -101,55 +111,92 @@ Specification
Because this PEP introduces new fields to the core metadata, it bumps the
latest core metadata version to 2.5.

The ``Import-Name`` field is a "multiple uses" field. Each entry of
``Import-Name`` MUST be a valid import name. The names specified in
``Import-Name`` MUST be importable when the project is installed on *some*
platform for the same version of the project (e.g. the metadata MUST be
consistent across all sdists and wheels for a project release). This does
imply that the information isn't specific to the distribution artifact it is
found in, but for the release version the distribution artifact belongs to.

Projects are not required to list every single import name that is provided.
Instead, projects SHOULD list the highest-level/shortest import name that the
project would "own" when installed (this includes "private" names). For
example, if you install a project that has a single package named ``myproj``
which itself has multiple submodules, the expectation is only ``myproj``
would be listed in ``Import-Name`` and not every submodule. If a project is
part of a namespace package named ``ns`` and it provides a subpackage called
``ns.myproj`` (i.e. ``ns.myproj.__init__`` exists), then ``ns.myproj`` should
be listed in ``Import-Name``, but NOT ``ns`` alone as that is not "owned" by
the project upon installation (i.e. other projects can be installed which
also contribute to ``ns``).

If a project chooses not to provide any ``Import-Name`` entries, tools MAY
assume the import name matches the project name (including de-normalization of
the project name, e.g. ``my-proj`` as ``my_proj``).
The ``Import-Name`` and ``Import-Namespace`` fields are "multiple uses" fields.
Each entry of both fields MUST be a valid import name. The names specified MUST
be importable when the project is installed on *some* platform for the same
version of the project (e.g. the metadata MUST be consistent across all sdists
and wheels for a project release). This does imply that the information isn't
specific to the distribution artifact it is found in, but for the release
version the distribution artifact belongs to.
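
The "valid import name" requirement can be approximated with a small check.
This is a minimal sketch, not part of the PEP: it assumes a valid import name
is a dot-separated sequence of Python identifiers, none of which is a keyword,
and ``is_valid_import_name`` is a hypothetical helper name.

```python
import keyword


def is_valid_import_name(name: str) -> bool:
    """Return True if every dot-separated part is a non-keyword identifier.

    Assumption: this is one reasonable reading of "valid import name";
    the text above does not spell out the exact rule.
    """
    parts = name.split(".")
    return all(
        part.isidentifier() and not keyword.iskeyword(part) for part in parts
    )
```

Under this reading, ``azure.mgmt.search`` and ``_pytest`` are accepted, while
``my-proj`` (hyphen) and ``spam.class`` (keyword) are rejected.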

``Import-Name`` lists import names which a project, when installed, would
*exclusively* provide (i.e. if two projects were installed with the same import
names listed in ``Import-Name``, then one of the projects would shadow the
name for the other). ``Import-Namespace`` lists import names that, when
installed, would be provided by the project, but not exclusively (i.e.
projects all listing the same import name in ``Import-Namespace`` being
installed together would not shadow those shared names).

The :ref:`declaring-project-metadata` will gain an ``import-names`` key. It
will be an array of strings that stores what will be written out to
``Import-Name``. Build back-ends MAY support dynamically calculating the
value on the user's behalf when the user declares the key in
``project.dynamic``.
``project.dynamic``. The same applies to ``import-namespaces`` for
``Import-Namespace``.
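
As an illustration of how the two arrays could map onto repeated core metadata
fields, the following sketch uses the standard library ``email`` package, since
core metadata uses an email-header style format. ``render_import_metadata`` is
a hypothetical helper, and the output is only a fragment (a real metadata file
also needs fields such as ``Name`` and ``Version``).

```python
from email.message import Message
from email.parser import HeaderParser


def render_import_metadata(import_names, import_namespaces):
    """Render the arrays as repeated Import-Name / Import-Namespace headers."""
    msg = Message()
    msg["Metadata-Version"] = "2.5"
    # Assigning to the same header name repeatedly appends a new header
    # each time, which matches the "multiple uses" nature of the fields.
    for name in import_names:
        msg["Import-Name"] = name
    for namespace in import_namespaces:
        msg["Import-Namespace"] = namespace
    return msg.as_string()


text = render_import_metadata(["azure.mgmt.search"], ["azure", "azure.mgmt"])
parsed = HeaderParser().parsestr(text)
```

A consumer can then read every occurrence of a field with
``parsed.get_all("Import-Name")``.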

Projects SHOULD list all the shortest import names that are exclusively provided
by a project which would cover all import name scenarios. If any of the shortest
names are dotted names, all intervening names from that name to the top-level
name should also be listed appropriately in ``Import-Namespace``.
For instance, a project which is a single package named ``spam`` with multiple
submodules would only list ``project.import-names = ["spam"]``. A project that
provides ``spam.bacon.eggs`` which is exclusively from the project while the
intervening names are namespaces would have
``project.import-names = ["spam.bacon.eggs"]`` and
``project.import-namespaces = ["spam", "spam.bacon"]``. Listing all names acts as a
check that the intent of the import names is as expected.
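
The "all intervening names" rule can be sketched as follows. Both function
names are hypothetical, and the check assumes that every parent of a listed
dotted name must appear in either ``import-names`` or ``import-namespaces``.

```python
def intervening_names(import_name):
    """Return every parent of a dotted name, from the top level down."""
    parts = import_name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def missing_parents(import_names, import_namespaces):
    """Return parent names that are not accounted for in either list."""
    declared = set(import_names) | set(import_namespaces)
    missing = set()
    for name in declared:
        for parent in intervening_names(name):
            if parent not in declared:
                missing.add(parent)
    return sorted(missing)
```

For ``spam.bacon.eggs``, ``intervening_names`` yields ``spam`` and
``spam.bacon``, which are the two entries expected in ``import-namespaces``
in the example above.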

Tools SHOULD raise an error when two projects that are to be installed list
names that overlap in each other's ``Import-Name`` entries. This is to avoid
projects unexpectedly shadowing another project's code. The same applies to when
a project has an entry in ``Import-Name`` that overlaps with another project's
``Import-Namespace`` entries.
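
A tool could implement this check roughly as below. This is a sketch under a
stated assumption: "overlap" is treated as an exact match between names, which
is the narrowest reading of the text. ``find_conflicts`` is a hypothetical
helper, not an API of any existing installer.

```python
def find_conflicts(a_names, a_namespaces, b_names, b_namespaces):
    """Return names that two projects cannot both provide.

    Assumption: two entries overlap only when they are identical strings.
    """
    a_names, b_names = set(a_names), set(b_names)
    # Both projects claim exclusive ownership of the same name.
    conflicts = a_names & b_names
    # One project exclusively owns a name the other treats as a namespace.
    conflicts |= a_names & set(b_namespaces)
    conflicts |= b_names & set(a_namespaces)
    return sorted(conflicts)
```

Two projects that both list ``_utils`` in ``Import-Name`` conflict, while two
projects that both list ``azure`` only in ``Import-Namespace`` do not.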

Tools SHOULD raise an error when an entry in ``Import-Name`` is higher than an
entry in ``Import-Namespace`` in the same project, e.g.
``project.import-names = ["spam"]`` and
``project.import-namespaces = ["spam.bacon"]``. This is because if a project
exclusively owns a higher import name then that would mean it is impossible for
another project to install with the same import name found in ``Import-Name``
in order to contribute to the namespace listed in ``Import-Namespace``.
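
The same-project consistency check might look like the following sketch.
``names_above_namespaces`` is a hypothetical name, and "higher" is assumed to
mean that the ``Import-Name`` entry is a strict dotted prefix of the
``Import-Namespace`` entry.

```python
def names_above_namespaces(import_names, import_namespaces):
    """Return (name, namespace) pairs where an exclusive name sits above
    a namespace declared by the same project."""
    problems = []
    for name in import_names:
        prefix = name + "."
        for namespace in import_namespaces:
            # "spam" is above "spam.bacon", but not above "spammy.bacon".
            if namespace.startswith(prefix):
                problems.append((name, namespace))
    return problems
```

With ``import-names = ["spam"]`` and ``import-namespaces = ["spam.bacon"]``
this reports the pair ``("spam", "spam.bacon")``, which a tool could turn into
an error.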

Projects MAY leave ``Import-Name`` and ``Import-Namespace`` empty. In that
instance, tools SHOULD assume that the normalized project name when converted to
an import name would be an entry in ``Import-Name``
(i.e. ``-`` replaced with ``_`` in the normalized project name).
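
The fallback described above can be sketched in a few lines.
``default_import_name`` is a hypothetical helper; the normalization step
follows the usual project name normalization rule (runs of ``-``, ``_`` and
``.`` collapsed to a single ``-``, then lowercased).

```python
import re


def default_import_name(project_name):
    """Guess the Import-Name entry for a project that declares none."""
    # Normalize the project name first.
    normalized = re.sub(r"[-_.]+", "-", project_name).lower()
    # Hyphens are not valid in import names, so swap them for underscores.
    return normalized.replace("-", "_")
```

Note that this is only a default: for ``scikit-learn`` it produces
``scikit_learn``, not the actual import name ``sklearn``, which is why such
projects benefit from declaring ``import-names`` explicitly.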


Examples
--------

`In scikit-learn 1.7.0
<https://pypi-browser.org/package/scikit-learn/scikit_learn-1.7.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl>`__ ,
an entry for the ``sklearn`` package would be used.
For `scikit-learn 1.7.0
<https://pypi-browser.org/package/scikit-learn/scikit_learn-1.7.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl>`__:

.. code-block:: TOML

    [project]
    import-names = ["sklearn"]

`In pytest 8.3.5
For `pytest 8.3.5
<https://pypi-browser.org/package/pytest/pytest-8.3.5-py3-none-any.whl>`__
there would be 3 expected entries:

1. ``_pytest``
2. ``py``
3. ``pytest``
.. code-block:: TOML

In `azure-mgmt-search 9.1.0
    [project]
    import-names = ["_pytest", "py", "pytest"]


For `azure-mgmt-search 9.1.0
<https://pypi-browser.org/package/azure-mgmt-search/azure_mgmt_search-9.1.0-py3-none-any.whl>`__,
there should be a single entry for ``azure.mgmt.search``.
there should be two namespace entries and one name entry for
``azure.mgmt.search``:

.. code-block:: TOML

    [project]
    import-names = ["azure.mgmt.search"]
    import-namespaces = ["azure", "azure.mgmt"]


Backwards Compatibility
@@ -170,13 +217,13 @@ malicious in some way.
How to Teach This
=================

Project owners should be taught that they can now record what namespaces
their project provides. They should be told that if their project has a
non-obvious namespace from the file structure of the project that they should
specify the appropriate information. They should have it explained to them
that they should use the shortest name possible that appropriately explains
what the project provides (i.e. what the specification requires to be
recorded).
Project owners should be taught that they can now record what names their
projects provide for importing. If their project name matches the module or
package name their project provides, they don't have to do anything. If there is
a difference, though, they should record all the import names their project
provides, using the shortest names possible. If any of the names are implicit
namespaces, those go into ``project.import-namespaces`` in ``pyproject.toml``,
otherwise the name goes into ``project.import-names``.

Users of projects don't necessarily need to know about this new metadata.
While they may be exposed to it via tooling, the details of where that data
@@ -196,6 +243,16 @@ https://github.com/brettcannon/packaging/tree/pep-794 is a branch to update
Rejected Ideas
==============

Infer the value for ``Import-Namespace``
----------------------------------------

A previous version of this PEP inferred what would have been the values for
``Import-Namespace`` based on dotted names in ``Import-Name``. It was decided
that it would be better to be explicit, not only to avoid mistakes from
accidentally listing something that would be interpreted as an implicit
namespace, but also to make the data more self-documenting.


Re-purpose the ``Provides`` field
----------------------------------

@@ -266,4 +323,3 @@ Copyright

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.