
Commit a56e6be

gh-62944: Add performance note for out-of-order extraction from compressed archives
Extracting members in a different order than they appear in a compressed tarfile requires re-decompressing from the beginning of the stream for each backward seek. Add a note to tarfile.open() documenting this and recommending in-order extraction or use of TarFile.extractall() for best performance.
1 parent 29a920e commit a56e6be

1 file changed

Lines changed: 11 additions & 0 deletions


Doc/library/tarfile.rst

@@ -123,6 +123,17 @@ Some facts and figures:
    :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this. If a
    compression method is not supported, :exc:`CompressionError` is raised.
 
+   .. note::
+
+      Compressed archives opened with modes like ``'r:gz'``, ``'r:bz2'``,
+      ``'r:xz'``, or ``'r:zst'`` support random access, but seeking backwards
+      in the underlying compressed stream requires re-decompressing from the
+      beginning. Extracting members in a different order than they appear in
+      the archive can therefore be significantly slower: the cost of each
+      backward seek is proportional to the data decompressed again, not just
+      to the target member's size. For best performance, extract members in
+      archive order or use :meth:`TarFile.extractall`.
+
    If *fileobj* is specified, it is used as an alternative to a :term:`file object`
    opened in binary mode for *name*. It is supposed to be at position 0.
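
The recommendation in the note can be illustrated with a minimal sketch. It is not part of the commit; the helper names `build_archive` and `read_in_archive_order` and the in-memory demo archive are assumptions made for illustration. Iterating over the `TarFile` yields members in archive order, so the wanted members are read as they are encountered, whatever order the caller listed them in.

```python
import io
import tarfile


def build_archive(names):
    """Build a small gzip-compressed tar archive in memory (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            data = f"contents of {name}\n".encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_in_archive_order(archive_bytes, wanted):
    """Read the wanted members in stored order, not in the requested order."""
    wanted = set(wanted)
    results = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        # Iterating over the TarFile visits members in archive order,
        # so the compressed stream is only read forwards.
        for member in tar:
            if member.name in wanted:
                results[member.name] = tar.extractfile(member).read()
    return results


archive = build_archive(["a.txt", "b.txt", "c.txt"])
# The caller asks for c.txt before a.txt; a.txt is still read first.
contents = read_in_archive_order(archive, ["c.txt", "a.txt"])
print(list(contents))  # ['a.txt', 'c.txt']
```

Calling `tar.extractfile("c.txt")` and then `tar.extractfile("a.txt")` on the same open archive would return the same data, but the second call has to seek backwards in the gzip stream, which is the slow path the note describes.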
