@@ -30,10 +30,11 @@ Motivation
 ==========
 
 CPython has modules for several different compression formats, such as
-:mod:`zlib (DEFLATE) <zlib>`, :mod:`bzip2 <bz2>`, and :mod:`lzma <lzma>`,
-each widely used. Including popular compression algorithms matches Python's
-"batteries included" philosophy of incorporating widely useful standards and
-utilities. :mod:`!lzma` is the most recent such module, added in Python 3.3.
+:mod:`zlib (DEFLATE) <zlib>`, :mod:`gzip <gzip>`, :mod:`bzip2 <bz2>`, and
+:mod:`lzma <lzma>`, each widely used. Including popular compression algorithms
+matches Python's "batteries included" philosophy of incorporating widely useful
+standards and utilities. :mod:`!lzma` is the most recent such module, added in
+Python 3.3.
 
 Since then, Zstandard has become the modern *de facto* preferred compression
 library for both high performance compression and decompression attaining high
@@ -216,9 +217,10 @@ used to build libraries CPython depends on for Windows.
 Other compression modules
 -------------------------
 
-New import names ``compression.lzma``, ``compression.bz2``, and
-``compression.zlib`` will be introduced in Python 3.14 re-exporting the
-contents of the existing ``lzma``, ``bz2``, and ``zlib`` modules respectively.
+New import names ``compression.lzma``, ``compression.bz2``,
+``compression.gzip``, and ``compression.zlib`` will be introduced in Python 3.14,
+re-exporting the contents of the existing ``lzma``, ``bz2``, ``gzip``, and
+``zlib`` modules, respectively.
 
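A minimal sketch of how calling code might adopt the re-exporting import names
while still running on interpreters older than 3.14. The helper name
``load_compression_module`` is hypothetical and not part of the PEP; the sketch
assumes only that ``compression.<name>`` exposes the same contents as the
existing top-level module, as the paragraph above describes.

```python
import importlib


def load_compression_module(name):
    """Return ``compression.<name>`` if importable, else the top-level module.

    Hypothetical helper for illustration only; not part of the PEP.
    """
    try:
        # Python 3.14+: the re-exporting import name described above.
        return importlib.import_module(f"compression.{name}")
    except ImportError:
        # Older interpreters: fall back to the existing module.
        return importlib.import_module(name)


gzip_mod = load_compression_module("gzip")
payload = b"batteries included" * 4
round_tripped = gzip_mod.decompress(gzip_mod.compress(payload))
```

Because the new names only re-export the existing modules, either import path
should behave identically for the caller.
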
 The ``_compression`` module, given that it is marked private, will be
 immediately renamed to ``compression._common.streams``. The new name was
@@ -289,17 +291,80 @@ decision is reached regarding the open issues.
 Rejected Ideas
 ==============
 
-Name the module ``libzstd`` and do not make a new ``compression`` namespace
+Name the module ``zstdlib`` and do not make a new ``compression`` namespace
 ---------------------------------------------------------------------------
 
 One option instead of making a new ``compression`` namespace would be to find
-a different name, such as ``libzstd``, as the import name. However, the issue
-of existing import names is likely to persist for future compression formats
-added to the standard library. LZ4, a common high speed compression format,
-has `a package on PyPI <https://pypi.org/project/lz4/>`_, ``lz4``, with the
-import name ``lz4``. Instead of solving this issue for each compression format,
-it is better to solve it once and for all by using the already-claimed
-``compression`` namespace.
+a different name, such as ``zstdlib``, as the import name. Several other names,
+such as ``zst``, ``libzstd``, and ``zstdcomp``, were proposed as well. In
+discussion, the names were found to be either too easy to mistype or unintuitive.
+Furthermore, the issue of existing import names is likely to persist for future
+compression formats added to the standard library. LZ4, a common high-speed
+compression format, has `a package on PyPI <https://pypi.org/project/lz4/>`_,
+``lz4``, with the import name ``lz4``. Instead of solving this issue for each
+compression format, it is better to solve it once and for all by using the
+already-claimed ``compression`` namespace.
+
+Introduce an experimental ``_zstd`` package in Python 3.14
+----------------------------------------------------------
+
+Since this PEP was published close to the beta cutoff for new features for
+Python 3.14, one proposal was to ship the package as a private module,
+``_zstd``, so that packaging tools could use it sooner without a final name
+being decided. This would allow more time for discussion of the final module
+name during the 3.15 development window. However, introducing a private module
+was not popular, because the expectations and contract for external usage of a
+private module in the standard library are unclear.
+
+Introduce a standard library namespace instead of ``compression``
+-----------------------------------------------------------------
+
+One alternative to a ``compression`` namespace would be to introduce a
+``std`` namespace for the entire standard library. However, this was seen as
+too significant a change for 3.14, with no agreed-upon semantics, migration
+path, or name for the package. Furthermore, a future PEP introducing a ``std``
+namespace could always define that the ``compression`` sub-modules be flattened
+into the ``std`` namespace.
+
+Include ``zipfile`` and ``tarfile`` in ``compression``
+------------------------------------------------------
+
+Compression is often used with archiving tools, so putting both :mod:`zipfile`
+and :mod:`tarfile` under the ``compression`` namespace is appealing. However,
+compression can be used beyond just archiving tools. For example, network
+requests can be gzip-compressed. Furthermore, formats like tar do not include
+compression themselves, instead relying on external compression. Therefore,
+this PEP does not propose moving :mod:`!zipfile` or :mod:`!tarfile` under
+``compression``.
+
+Do not include ``gzip`` under ``compression``
+---------------------------------------------
+
+The :rfc:`GZip format RFC <1952>` defines a format which can include multiple
+blocks and metadata about its contents. In this way GZip is rather similar to
+archive formats like ZIP and tar. Despite that, in practice GZip is often treated
+as a compression format rather than an archive format. Looking at how different
+languages classify GZip, the prevailing trend is to classify it as a
+compression format and not an archiving format.
+
+========== ======================== ==============================================================================
+Language   Compression or Archive   Documentation Link
+========== ======================== ==============================================================================
+Golang     Compression              https://pkg.go.dev/compress/gzip
+Ruby       Compression              https://docs.ruby-lang.org/en/master/Zlib/GzipFile.html
+Rust       Compression              https://github.com/rust-lang/flate2-rs
+Haskell    Compression              https://hackage.haskell.org/package/zlib
+C#         Compression              https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.gzipstream
+Java       Archive                  https://docs.oracle.com/javase/8/docs/api/java/util/zip/package-summary.html
+NodeJS     Compression              https://nodejs.org/api/zlib.html
+Web APIs   Compression              https://developer.mozilla.org/en-US/docs/Web/API/Compression_Streams_API
+PHP        Compression              https://www.php.net/manual/en/function.gzcompress.php
+Perl       Compression              https://perldoc.perl.org/IO::Compress::Gzip
+========== ======================== ==============================================================================
+
+In addition, the :mod:`!gzip` module in Python mostly focuses on single block
+content and has an API similar to other compression modules, making it a good
+fit for the ``compression`` namespace.
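
The API similarity mentioned above can be illustrated with a short sketch. It
uses only the existing top-level modules and their one-shot ``compress`` and
``decompress`` functions; the variable names are illustrative and not taken
from the PEP.

```python
import bz2
import gzip
import lzma
import zlib

payload = b"compression namespace " * 16

# Each module exposes the same one-shot compress()/decompress() pair,
# so the same loop body works for all four formats.
results = {}
for module in (zlib, gzip, bz2, lzma):
    compressed = module.compress(payload)
    results[module.__name__] = module.decompress(compressed) == payload

# A gzip stream may contain several concatenated members;
# gzip.decompress() returns their contents joined together.
two_members = gzip.compress(b"first ") + gzip.compress(b"second")
joined = gzip.decompress(two_members)
```

The loop treats :mod:`!gzip` exactly like the other compression modules, which
is the usage pattern the paragraph above refers to.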
 
 
 Copyright