Commit a471a32

gh-143214: Add the wrapcol parameter in binascii.b2a_base64() and base64.b64encode() (GH-143216)
1 parent e370c8d commit a471a32

17 files changed: 273 additions and 104 deletions.

Doc/library/base64.rst

Lines changed: 11 additions & 4 deletions
@@ -51,7 +51,7 @@ The :rfc:`4648` encodings are suitable for encoding binary data so that it can b
 safely sent by email, used as parts of URLs, or included as part of an HTTP
 POST request.

-.. function:: b64encode(s, altchars=None)
+.. function:: b64encode(s, altchars=None, *, wrapcol=0)

    Encode the :term:`bytes-like object` *s* using Base64 and return the encoded
    :class:`bytes`.
@@ -61,9 +61,16 @@ POST request.
    This allows an application to e.g. generate URL or filesystem safe Base64
    strings. The default is ``None``, for which the standard Base64 alphabet is used.

+   If *wrapcol* is non-zero, insert a newline (``b'\n'``) character
+   after at most every *wrapcol* characters.
+   If *wrapcol* is zero (default), do not insert any newlines.
+
    May assert or raise a :exc:`ValueError` if the length of *altchars* is not 2. Raises a
    :exc:`TypeError` if *altchars* is not a :term:`bytes-like object`.

+   .. versionchanged:: next
+      Added the *wrapcol* parameter.
+

 .. function:: b64decode(s, altchars=None, validate=False)

@@ -214,9 +221,9 @@ Refer to the documentation of the individual functions for more information.
    instead of 4 consecutive spaces (ASCII 0x20) as supported by 'btoa'. This
    feature is not supported by the "standard" Ascii85 encoding.

-   *wrapcol* controls whether the output should have newline (``b'\n'``)
-   characters added to it. If this is non-zero, each output line will be
-   at most this many characters long, excluding the trailing newline.
+   If *wrapcol* is non-zero, insert a newline (``b'\n'``) character
+   after at most every *wrapcol* characters.
+   If *wrapcol* is zero (default), do not insert any newlines.

    *pad* controls whether the input is padded to a multiple of 4
    before encoding. Note that the ``btoa`` implementation always pads.
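
To make the documented *wrapcol* behaviour concrete on interpreters that predate this commit, here is a minimal pure-Python sketch. `b64encode_wrapped` is a hypothetical helper, not part of the patch, and rounding the line width down to a multiple of 4 is an assumption (it mirrors the `max_line_length // 4 * 3` chunking in the helper this commit removes from `email/contentmanager.py`).

```python
import base64


def b64encode_wrapped(data: bytes, wrapcol: int = 0) -> bytes:
    """Hypothetical stand-in approximating b64encode(data, wrapcol=...)."""
    encoded = base64.b64encode(data)
    if not wrapcol:
        # wrapcol=0 (the default) means: do not insert any newlines.
        return encoded
    # Assumption: each line holds a whole number of 4-character groups.
    width = max(4, wrapcol // 4 * 4)
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    # Newlines go between lines only; b64encode passes newline=False.
    return b"\n".join(lines)


sample = bytes(range(30))  # 30 input bytes encode to 40 characters
wrapped = b64encode_wrapped(sample, wrapcol=16)
print(wrapped.decode("ascii"))
```

The wrapped output still decodes with `base64.b64decode`, which discards newlines when `validate` is false.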

Doc/library/binascii.rst

Lines changed: 14 additions & 5 deletions
@@ -58,7 +58,7 @@ The :mod:`binascii` module defines the following functions:

    Valid base64:

-   * Conforms to :rfc:`3548`.
+   * Conforms to :rfc:`4648`.
    * Contains only characters from the base64 alphabet.
    * Contains no excess data after padding (including excess padding, newlines, etc.).
    * Does not start with a padding.
@@ -67,15 +67,24 @@ The :mod:`binascii` module defines the following functions:
       Added the *strict_mode* parameter.


-.. function:: b2a_base64(data, *, newline=True)
+.. function:: b2a_base64(data, *, wrapcol=0, newline=True)

-   Convert binary data to a line of ASCII characters in base64 coding. The return
-   value is the converted line, including a newline char if *newline* is
-   true. The output of this function conforms to :rfc:`3548`.
+   Convert binary data to a line(s) of ASCII characters in base64 coding,
+   as specified in :rfc:`4648`.
+
+   If *wrapcol* is non-zero, insert a newline (``b'\n'``) character
+   after at most every *wrapcol* characters.
+   If *wrapcol* is zero (default), do not insert any newlines.
+
+   If *newline* is true (default), a newline character will be added
+   at the end of the output.

    .. versionchanged:: 3.6
       Added the *newline* parameter.

+   .. versionchanged:: next
+      Added the *wrapcol* parameter.
+

 .. function:: a2b_qp(data, header=False)

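
The interaction between *newline* and the new *wrapcol* keyword can be tried out with a short sketch. The `newline` calls work on any Python since 3.6. The `wrapcol` call is guarded because only interpreters that include this commit accept it, and the sample inputs are arbitrary.

```python
import binascii

data = b"hello"

# newline=True is the default: one trailing b'\n' is appended.
with_newline = binascii.b2a_base64(data)
without_newline = binascii.b2a_base64(data, newline=False)

try:
    # Accepted only where this commit has landed.
    wrapped = binascii.b2a_base64(b"hello world!", wrapcol=8, newline=False)
except TypeError:
    # Older interpreter: wrapcol is an unexpected keyword argument.
    wrapped = None

print(with_newline, without_newline, wrapped)
```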

Doc/whatsnew/3.15.rst

Lines changed: 22 additions & 5 deletions
@@ -435,12 +435,22 @@ argparse
   inline code when color output is enabled.
   (Contributed by Savannah Ostrowski in :gh:`142390`.)

-base64 & binascii
------------------
+base64
+------
+
+* Added the *pad* parameter in :func:`~base64.z85encode`.
+  (Contributed by Hauke Dämpfling in :gh:`143103`.)
+
+* Added the *wrapcol* parameter in :func:`~base64.b64encode`.
+  (Contributed by Serhiy Storchaka in :gh:`143214`.)
+
+
+binascii
+--------
+
+* Added the *wrapcol* parameter in :func:`~binascii.b2a_base64`.
+  (Contributed by Serhiy Storchaka in :gh:`143214`.)

-* CPython's underlying base64 implementation now encodes 2x faster and decodes 3x
-  faster thanks to simple CPU pipelining optimizations.
-  (Contributed by Gregory P. Smith & Serhiy Storchaka in :gh:`143262`.)

 calendar
 --------
@@ -878,6 +888,13 @@ Optimizations
   (Contributed by Chris Eibl, Ken Jin, and Brandt Bucher in :gh:`143068`.
   Special thanks to the MSVC team including Hulon Jenkins.)

+base64 & binascii
+-----------------
+
+* CPython's underlying base64 implementation now encodes 2x faster and decodes 3x
+  faster thanks to simple CPU pipelining optimizations.
+  (Contributed by Gregory P. Smith and Serhiy Storchaka in :gh:`143262`.)
+
 csv
 ---


Include/internal/pycore_global_objects_fini_generated.h

Lines changed: 1 addition & 0 deletions
Generated file; its diff is not rendered by default.

Include/internal/pycore_global_strings.h

Lines changed: 1 addition & 0 deletions
@@ -865,6 +865,7 @@ struct _Py_global_strings {
         STRUCT_FOR_ID(which)
         STRUCT_FOR_ID(who)
         STRUCT_FOR_ID(withdata)
+        STRUCT_FOR_ID(wrapcol)
         STRUCT_FOR_ID(writable)
         STRUCT_FOR_ID(write)
         STRUCT_FOR_ID(write_through)

Include/internal/pycore_runtime_init_generated.h

Lines changed: 1 addition & 0 deletions
Generated file; its diff is not rendered by default.

Include/internal/pycore_unicodeobject_generated.h

Lines changed: 4 additions & 0 deletions
Generated file; its diff is not rendered by default.

Lib/base64.py

Lines changed: 11 additions & 10 deletions
@@ -45,14 +45,17 @@ def _bytes_from_decode_data(s):

 # Base64 encoding/decoding uses binascii

-def b64encode(s, altchars=None):
+def b64encode(s, altchars=None, *, wrapcol=0):
     """Encode the bytes-like object s using Base64 and return a bytes object.

     Optional altchars should be a byte string of length 2 which specifies an
     alternative alphabet for the '+' and '/' characters. This allows an
     application to e.g. generate url or filesystem safe Base64 strings.
+
+    If wrapcol is non-zero, insert a newline (b'\\n') character after at most
+    every wrapcol characters.
     """
-    encoded = binascii.b2a_base64(s, newline=False)
+    encoded = binascii.b2a_base64(s, wrapcol=wrapcol, newline=False)
     if altchars is not None:
         assert len(altchars) == 2, repr(altchars)
         return encoded.translate(bytes.maketrans(b'+/', altchars))
@@ -327,9 +330,8 @@ def a85encode(b, *, foldspaces=False, wrapcol=0, pad=False, adobe=False):
     instead of 4 consecutive spaces (ASCII 0x20) as supported by 'btoa'. This
     feature is not supported by the "standard" Adobe encoding.

-    wrapcol controls whether the output should have newline (b'\\n') characters
-    added to it. If this is non-zero, each output line will be at most this
-    many characters long, excluding the trailing newline.
+    If wrapcol is non-zero, insert a newline (b'\\n') character after at most
+    every wrapcol characters.

     pad controls whether the input is padded to a multiple of 4 before
     encoding. Note that the btoa implementation always pads.
@@ -566,11 +568,10 @@ def encodebytes(s):
     """Encode a bytestring into a bytes object containing multiple lines
     of base-64 data."""
     _input_type_check(s)
-    pieces = []
-    for i in range(0, len(s), MAXBINSIZE):
-        chunk = s[i : i + MAXBINSIZE]
-        pieces.append(binascii.b2a_base64(chunk))
-    return b"".join(pieces)
+    result = binascii.b2a_base64(s, wrapcol=MAXLINESIZE)
+    if result == b'\n':
+        return b''
+    return result


 def decodebytes(s):
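
The `result == b'\n'` special case in the new `encodebytes` is easier to follow next to the loop it replaces. The sketch below reproduces the removed loop as `legacy_encodebytes` (a name chosen here for illustration) and shows why empty input needs handling: the loop emitted nothing for it, while a direct `b2a_base64` call returns a lone newline.

```python
import base64
import binascii

MAXLINESIZE = 76                     # same values as in Lib/base64.py
MAXBINSIZE = (MAXLINESIZE // 4) * 3  # 57 input bytes per 76-character line


def legacy_encodebytes(s: bytes) -> bytes:
    """The removed loop: encode 57-byte chunks, each ending in a newline."""
    pieces = []
    for i in range(0, len(s), MAXBINSIZE):
        chunk = s[i:i + MAXBINSIZE]
        pieces.append(binascii.b2a_base64(chunk))
    return b"".join(pieces)


empty_direct = binascii.b2a_base64(b"")  # newline=True still appends b'\n'
empty_legacy = legacy_encodebytes(b"")   # no chunks, so no output at all
print(empty_direct, empty_legacy)
```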

Lib/email/base64mime.py

Lines changed: 9 additions & 10 deletions
@@ -83,16 +83,15 @@ def body_encode(s, maxlinelen=76, eol=NL):
     if not s:
         return ""

-    encvec = []
-    max_unencoded = maxlinelen * 3 // 4
-    for i in range(0, len(s), max_unencoded):
-        # BAW: should encode() inherit b2a_base64()'s dubious behavior in
-        # adding a newline to the encoded string?
-        enc = b2a_base64(s[i:i + max_unencoded]).decode("ascii")
-        if enc.endswith(NL) and eol != NL:
-            enc = enc[:-1] + eol
-        encvec.append(enc)
-    return EMPTYSTRING.join(encvec)
+    if not eol:
+        return b2a_base64(s, newline=False).decode("ascii")
+
+    # BAW: should encode() inherit b2a_base64()'s dubious behavior in
+    # adding a newline to the encoded string?
+    enc = b2a_base64(s, wrapcol=maxlinelen).decode("ascii")
+    if eol != NL:
+        enc = enc.replace(NL, eol)
+    return enc


 def decode(string):

Lib/email/contentmanager.py

Lines changed: 3 additions & 15 deletions
@@ -129,19 +129,6 @@ def _finalize_set(msg, disposition, filename, cid, params):
         msg.set_param(key, value)


-# XXX: This is a cleaned-up version of base64mime.body_encode (including a bug
-# fix in the calculation of unencoded_bytes_per_line). It would be nice to
-# drop both this and quoprimime.body_encode in favor of enhanced binascii
-# routines that accepted a max_line_length parameter.
-def _encode_base64(data, max_line_length):
-    encoded_lines = []
-    unencoded_bytes_per_line = max_line_length // 4 * 3
-    for i in range(0, len(data), unencoded_bytes_per_line):
-        thisline = data[i:i+unencoded_bytes_per_line]
-        encoded_lines.append(binascii.b2a_base64(thisline).decode('ascii'))
-    return ''.join(encoded_lines)
-
-
 def _encode_text(string, charset, cte, policy):
     # If max_line_length is 0 or None, there is no limit.
     maxlen = policy.max_line_length or sys.maxsize
@@ -176,7 +163,7 @@ def normal_body(lines): return b'\n'.join(lines) + b'\n'
         data = quoprimime.body_encode(normal_body(lines).decode('latin-1'),
                                       maxlen)
     elif cte == 'base64':
-        data = _encode_base64(embedded_body(lines), maxlen)
+        data = binascii.b2a_base64(embedded_body(lines), wrapcol=maxlen).decode('ascii')
     else:
         raise ValueError("Unknown content transfer encoding {}".format(cte))
     return cte, data
@@ -234,7 +221,8 @@ def set_bytes_content(msg, data, maintype, subtype, cte='base64',
                       params=None, headers=None):
     _prepare_set(msg, maintype, subtype, headers)
     if cte == 'base64':
-        data = _encode_base64(data, max_line_length=msg.policy.max_line_length)
+        data = binascii.b2a_base64(data, wrapcol=msg.policy.max_line_length)
+        data = data.decode('ascii')
     elif cte == 'quoted-printable':
         # XXX: quoprimime.body_encode won't encode newline characters in data,
         # so we can't use it. This means max_line_length is ignored. Another
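
The removed `_encode_base64` comment mentions a bug fix in the bytes-per-line calculation. A small worked comparison of the two removed formulas shows the difference. The function names are illustrative, and the limits 76 and 78 are picked only because they make the arithmetic visible.

```python
import base64


def old_mime_chunk(maxlinelen: int) -> int:
    # Formula removed from email.base64mime.body_encode.
    return maxlinelen * 3 // 4


def fixed_chunk(max_line_length: int) -> int:
    # Formula from the removed contentmanager._encode_base64 helper.
    return max_line_length // 4 * 3


results = {}
for limit in (76, 78):
    for name, chunk in (("old", old_mime_chunk(limit)),
                        ("fixed", fixed_chunk(limit))):
        encoded_len = len(base64.b64encode(b"x" * chunk))
        results[(limit, name)] = (chunk, encoded_len)
        print(limit, name, chunk, encoded_len)
```

At a limit of 78 the older formula yields 58 input bytes, which is not a multiple of 3, so each chunk encodes to 80 characters including padding and overshoots the limit. The other formula stays at 57 bytes and 76 characters.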
