diff --git a/peps/pep-0784.rst b/peps/pep-0784.rst
index 06326aae994..cd0b9589066 100644
--- a/peps/pep-0784.rst
+++ b/peps/pep-0784.rst
@@ -30,10 +30,11 @@ Motivation
 ==========
 
 CPython has modules for several different compression formats, such as
-:mod:`zlib (DEFLATE) `, :mod:`bzip2 `, and :mod:`lzma `,
-each widely used. Including popular compression algorithms matches Python's
-"batteries included" philosophy of incorporating widely useful standards and
-utilities. :mod:`!lzma` is the most recent such module, added in Python 3.3.
+:mod:`zlib (DEFLATE) `, :mod:`gzip `, :mod:`bzip2 `, and
+:mod:`lzma `, each widely used. Including popular compression algorithms
+matches Python's "batteries included" philosophy of incorporating widely useful
+standards and utilities. :mod:`!lzma` is the most recent such module, added in
+Python 3.3.
 
 Since then, Zstandard has become the modern *de facto* preferred compression
 library for both high performance compression and decompression attaining high
@@ -216,9 +217,10 @@ used to build libraries CPython depends on for Windows.
 Other compression modules
 -------------------------
 
-New import names ``compression.lzma``, ``compression.bz2``, and
-``compression.zlib`` will be introduced in Python 3.14 re-exporting the
-contents of the existing ``lzma``, ``bz2``, and ``zlib`` modules respectively.
+New import names ``compression.lzma``, ``compression.bz2``,
+``compression.gzip`` and ``compression.zlib`` will be introduced in Python 3.14
+re-exporting the contents of the existing ``lzma``, ``bz2``, ``gzip`` and
+``zlib`` modules respectively.
 
 The ``_compression`` module, given that it is marked private, will be
 immediately renamed to ``compression._common.streams``. The new name was
@@ -289,17 +291,80 @@ decision is reached regarding the open issues.
 Rejected Ideas
 ==============
 
-Name the module ``libzstd`` and do not make a new ``compression`` namespace
+Name the module ``zstdlib`` and do not make a new ``compression`` namespace
 ---------------------------------------------------------------------------
 
 One option instead of making a new ``compression`` namespace would be to find
-a different name, such as ``libzstd``, as the import name. However, the issue
-of existing import names is likely to persist for future compression formats
-added to the standard library. LZ4, a common high speed compression format,
-has `a package on PyPI `_, ``lz4``, with the
-import name ``lz4``. Instead of solving this issue for each compression format,
-it is better to solve it once and for all by using the already-claimed
-``compression`` namespace.
+a different name, such as ``zstdlib``, as the import name. Several other names,
+such as ``zst``, ``libzstd``, and ``zstdcomp`` were proposed as well. In
+discussion, the names were found to either be too easy to typo, or unintuitive.
+Furthermore, the issue of existing import names is likely to persist for future
+compression formats added to the standard library. LZ4, a common high speed
+compression format, has `a package on PyPI `_,
+``lz4``, with the import name ``lz4``. Instead of solving this issue for each
+compression format, it is better to solve it once and for all by using the
+already-claimed ``compression`` namespace.
+
+Introduce an experimental ``_zstd`` package in Python 3.14
+----------------------------------------------------------
+
+Since this PEP was published close to the beta cutoff for new features for
+Python 3.14, one proposal was to name the package a private module ``_zstd``
+so that packaging tools could use it sooner, but not deciding on a name. This
+would allow more time for discussion of the final module name during the 3.15
+development window. However, introducing a private module was not popular. The
+expectations and contract for external usage of a private module in the
+standard library are unclear.
+
+Introduce a standard library namespace instead of ``compression``
+-----------------------------------------------------------------
+
+One alternative to a ``compression`` namespace would be to introduce a
+``std`` namespace for the entire standard library. However, this was seen as
+too significant a change for 3.14, with no agreed upon semantics, migration
+path, or name for the package. Furthermore, a future PEP introducing a ``std``
+namespace could always define that the ``compression`` sub-modules be flattened
+into the ``std`` namespace.
+
+Include ``zipfile`` and ``tarfile`` in ``compression``
+------------------------------------------------------
+
+Compression is often used with archiving tools, so putting both :mod:`zipfile`
+and :mod:`tarfile` under the ``compression`` namespace is appealing. However,
+compression can be used beyond just archiving tools. For example, network
+requests can be gzip compressed. Furthermore, formats like tar do not include
+compression themselves, instead relying on external compression. Therefore,
+this PEP does not propose moving :mod:`!zipfile` or :mod:`!tarfile` under
+``compression``.
+
+Do not include ``gzip`` under ``compression``
+---------------------------------------------
+
+The :rfc:`GZip format RFC <1952>` defines a format which can include multiple
+blocks and metadata about its contents. In this way GZip is rather similar to
+archive formats like ZIP and tar. Despite that, in usage GZip is often treated
+as a compression format rather than an archive format. Looking at how different
+languages classify GZip, the prevailing trend is to classify it as a
+compression format and not an archiving format.
+
+========== ======================== ==============================================================================
+Language   Compression or Archive   Documentation Link
+========== ======================== ==============================================================================
+Golang     Compression              https://pkg.go.dev/compress/gzip
+Ruby       Compression              https://docs.ruby-lang.org/en/master/Zlib/GzipFile.html
+Rust       Compression              https://github.com/rust-lang/flate2-rs
+Haskell    Compression              https://hackage.haskell.org/package/zlib
+C#         Compression              https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.gzipstream
+Java       Archive                  https://docs.oracle.com/javase/8/docs/api/java/util/zip/package-summary.html
+NodeJS     Compression              https://nodejs.org/api/zlib.html
+Web APIs   Compression              https://developer.mozilla.org/en-US/docs/Web/API/Compression_Streams_API
+PHP        Compression              https://www.php.net/manual/en/function.gzcompress.php
+Perl       Compression              https://perldoc.perl.org/IO::Compress::Gzip
+========== ======================== ==============================================================================
+
+In addition, the :mod:`!gzip` module in Python mostly focuses on single block
+content and has an API similar to other compression modules, making it a good
+fit for the ``compression`` namespace.
 
 Copyright