From f4ba4c5b2fa9128db020c5d9f93b23bad0bc10e3 Mon Sep 17 00:00:00 2001
From: Emma Harper Smith
Date: Mon, 14 Apr 2025 18:20:44 -0700
Subject: [PATCH 1/2] Update PEP to reflect inclusion of gzip and discussed rejected ideas

---
 peps/pep-0784.rst | 93 +++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 78 insertions(+), 15 deletions(-)

diff --git a/peps/pep-0784.rst b/peps/pep-0784.rst
index 06326aae994..c37899ffbd6 100644
--- a/peps/pep-0784.rst
+++ b/peps/pep-0784.rst
@@ -30,10 +30,11 @@ Motivation
 ==========
 
 CPython has modules for several different compression formats, such as
-:mod:`zlib (DEFLATE) `, :mod:`bzip2 `, and :mod:`lzma `,
-each widely used. Including popular compression algorithms matches Python's
-"batteries included" philosophy of incorporating widely useful standards and
-utilities. :mod:`!lzma` is the most recent such module, added in Python 3.3.
+:mod:`zlib (DEFLATE) `, :mod:`gzip `, :mod:`bzip2 `, and
+:mod:`lzma `, each widely used. Including popular compression algorithms
+matches Python's "batteries included" philosophy of incorporating widely useful
+standards and utilities. :mod:`!lzma` is the most recent such module, added in
+Python 3.3.
 
 Since then, Zstandard has become the modern *de facto* preferred compression
 library for both high performance compression and decompression attaining high
@@ -216,9 +217,10 @@ used to build libraries CPython depends on for Windows.
 Other compression modules
 -------------------------
 
-New import names ``compression.lzma``, ``compression.bz2``, and
-``compression.zlib`` will be introduced in Python 3.14 re-exporting the
-contents of the existing ``lzma``, ``bz2``, and ``zlib`` modules respectively.
+New import names ``compression.lzma``, ``compression.bz2``,
+``compression.gzip`` and ``compression.zlib`` will be introduced in Python 3.14
+re-exporting the contents of the existing ``lzma``, ``bz2``, ``gzip`` and
+``zlib`` modules respectively.
 
 The ``_compression`` module, given that it is marked private, will be
 immediately renamed to ``compression._common.streams``. The new name was
@@ -289,17 +291,78 @@ decision is reached regarding the open issues.
 Rejected Ideas
 ==============
 
-Name the module ``libzstd`` and do not make a new ``compression`` namespace
+Name the module ``zstdlib`` and do not make a new ``compression`` namespace
 ---------------------------------------------------------------------------
 
 One option instead of making a new ``compression`` namespace would be to find
-a different name, such as ``libzstd``, as the import name. However, the issue
-of existing import names is likely to persist for future compression formats
-added to the standard library. LZ4, a common high speed compression format,
-has `a package on PyPI `_, ``lz4``, with the
-import name ``lz4``. Instead of solving this issue for each compression format,
-it is better to solve it once and for all by using the already-claimed
-``compression`` namespace.
+a different name, such as ``zstdlib``, as the import name. Several other names,
+such as ``zst``, ``libzstd``, and ``zstdcomp`` were proposed as well. In
+discussion, the names were found to either be too easy to typo, or unintuitive.
+Furthermore, the issue of existing import names is likely to persist for future
+compression formats added to the standard library. LZ4, a common high speed
+compression format, has `a package on PyPI `_,
+``lz4``, with the import name ``lz4``. Instead of solving this issue for each
+compression format, it is better to solve it once and for all by using the
+already-claimed ``compression`` namespace.
+
+Introduce an experimental ``_zstd`` package in Python 3.14
+----------------------------------------------------------
+
+Since this PEP was published close to the beta cutoff for new features for
+Python 3.14, one proposal was to name the package a private module ``_zstd``
+so that packaging tools could use it sooner, but not deciding on a name. This
+would allow more time for discussion of the final module name during the 3.15
+development window. However, introducing a private module was not popular. The
+expectations and contract for external usage of a private module in the
+standard library are unclear.
+
+Name the module ``std.zstd`` or some other standard library namespace
+---------------------------------------------------------------------
+
+One alternative to a ``compression`` namespace would be to introduce a
+``std`` namespace for the entire standard library. However, this was seen as
+too significant a change for 3.14, with no agreed upon semantics, migration
+path, or name for the package. Furthermore, a future PEP introducing a ``std``
+namespace could always define that the ``compression`` sub-modules be flattened
+into the ``std`` namespace.
+
+Include ``zipfile`` and ``tarfile`` in ``compression``
+------------------------------------------------------
+
+Compression is often used with archiving tools, so putting both ``zipfile`` and
+``tarfile`` under the ``compression`` namespace is appealing. However,
+compression can be used beyond just archiving tools. For example, network
+requests can be gzip compressed. Furthermore, formats like tar do not include
+compression themselves, instead relying on external compression.
+
+Do not include ``gzip`` under ``compression``
+---------------------------------------------
+
+The :rfc:`GZip format RFC <1952>` defines a format which can include multiple
+blocks and metadata about its contents. In this way GZip is rather similar to
+archive formats like ZIP and tar. Despite that, in usage GZip is often treated
+as a compression format rather than an archive format. Looking at how different
+languages classify GZip, the prevailing trend is to classify it as a
+compression format and not an archiving format.
+
+========== ======================== ==============================================================================
+Language   Compression or Archive   Documentation Link
+========== ======================== ==============================================================================
+Golang     Compression              https://pkg.go.dev/compress/gzip
+Ruby       Compression              https://docs.ruby-lang.org/en//master/Zlib/GzipFile.html
+Rust       Compression              https://github.com/rust-lang/flate2-rs
+Haskell    Compression              https://hackage.haskell.org/package/zlib
+C#         Compression              https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.gzipstream
+Java       Archive                  https://docs.oracle.com/javase/8/docs/api/java/util/zip/package-summary.html
+NodeJS     Compression              https://nodejs.org/api/zlib.html
+Web APIs   Compression              https://developer.mozilla.org/en-US/docs/Web/API/Compression_Streams_API
+PHP        Compression              https://www.php.net/manual/en/function.gzcompress.php
+Perl       Compression              https://perldoc.perl.org/IO::Compress::Gzip
+========== ======================== ==============================================================================
+
+In addition, the :mod:`!gzip` module in Python mostly focuses on single block
+content and has an API similar to other compression modules, making it a good
+fit for the ``compression`` namespace.
 
 Copyright

From c4f9d7d80e9db916eb5828a1ae843f1714d01591 Mon Sep 17 00:00:00 2001
From: Emma Harper Smith
Date: Mon, 14 Apr 2025 18:48:29 -0700
Subject: [PATCH 2/2] Minor formatting and wording improvements

---
 peps/pep-0784.rst | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/peps/pep-0784.rst b/peps/pep-0784.rst
index c37899ffbd6..cd0b9589066 100644
--- a/peps/pep-0784.rst
+++ b/peps/pep-0784.rst
@@ -316,8 +316,8 @@ development window. However, introducing a private module was not popular. The
 expectations and contract for external usage of a private module in the
 standard library are unclear.
 
-Name the module ``std.zstd`` or some other standard library namespace
----------------------------------------------------------------------
+Introduce a standard library namespace instead of ``compression``
+-----------------------------------------------------------------
 
 One alternative to a ``compression`` namespace would be to introduce a
 ``std`` namespace for the entire standard library. However, this was seen as
@@ -329,11 +329,13 @@ into the ``std`` namespace.
 Include ``zipfile`` and ``tarfile`` in ``compression``
 ------------------------------------------------------
 
-Compression is often used with archiving tools, so putting both ``zipfile`` and
-``tarfile`` under the ``compression`` namespace is appealing. However,
+Compression is often used with archiving tools, so putting both :mod:`zipfile`
+and :mod:`tarfile` under the ``compression`` namespace is appealing. However,
 compression can be used beyond just archiving tools. For example, network
 requests can be gzip compressed. Furthermore, formats like tar do not include
-compression themselves, instead relying on external compression.
+compression themselves, instead relying on external compression. Therefore,
+this PEP does not propose moving :mod:`!zipfile` or :mod:`!tarfile` under
+``compression``.
 
 Do not include ``gzip`` under ``compression``
 ---------------------------------------------
@@ -349,7 +351,7 @@ compression format and not an archiving format.
 Language   Compression or Archive   Documentation Link
 ========== ======================== ==============================================================================
 Golang     Compression              https://pkg.go.dev/compress/gzip
-Ruby       Compression              https://docs.ruby-lang.org/en//master/Zlib/GzipFile.html
+Ruby       Compression              https://docs.ruby-lang.org/en/master/Zlib/GzipFile.html
 Rust       Compression              https://github.com/rust-lang/flate2-rs
 Haskell    Compression              https://hackage.haskell.org/package/zlib
 C#         Compression              https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.gzipstream