Commit a966d94

gh-144157: Optimize bytes.translate() by deferring change detection (GH-144158)
Move the equality check out of the hot loop to allow better compiler optimization. Instead of checking each byte during translation, perform a single memcmp at the end to determine if the input can be returned unchanged.

This allows compilers to unroll and pipeline the loops, resulting in ~2x throughput improvement for medium-to-large inputs (tested on an AMD Zen 2). No change observed on small inputs.

It will also be faster for bytes subclasses as those do not need change detection.
1 parent 77bf4ba commit a966d94

2 files changed, +11 -5 lines changed

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+:meth:`bytes.translate` now allows the compiler to unroll its loop more
+usefully for a 2x speedup in the common no-deletions specified case.

Objects/bytesobject.c

Lines changed: 9 additions & 5 deletions
@@ -2237,11 +2237,15 @@ bytes_translate_impl(PyBytesObject *self, PyObject *table,
         /* If no deletions are required, use faster code */
         for (i = inlen; --i >= 0; ) {
             c = Py_CHARMASK(*input++);
-            if (Py_CHARMASK((*output++ = table_chars[c])) != c)
-                changed = 1;
-        }
-        if (!changed && PyBytes_CheckExact(input_obj)) {
-            Py_SETREF(result, Py_NewRef(input_obj));
+            *output++ = table_chars[c];
+        }
+        /* Check if anything changed (for returning original object) */
+        /* We save this check until the end so that the compiler will */
+        /* unroll the loop above leading to MUCH faster code. */
+        if (PyBytes_CheckExact(input_obj)) {
+            if (memcmp(PyBytes_AS_STRING(input_obj), output_start, inlen) == 0) {
+                Py_SETREF(result, Py_NewRef(input_obj));
+            }
         }
         PyBuffer_Release(&del_table_view);
         PyBuffer_Release(&table_view);
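
For readers who want to try the idea outside CPython, here is a minimal standalone sketch of the two loop shapes in the diff above. The function names `translate_checked` and `translate_deferred` are hypothetical, and plain `unsigned char` buffers stand in for the bytes objects. This is an illustration under those assumptions, not the CPython implementation, and it makes no performance claim of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the old loop shape: every byte is
 * compared against its translation inside the loop. */
static int translate_checked(const unsigned char *in, unsigned char *out,
                             size_t len, const unsigned char table[256])
{
    int changed = 0;
    assert(in != NULL && out != NULL && table != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if ((out[i] = table[c]) != c) {
            changed = 1;
        }
    }
    return changed;
}

/* Hypothetical helper mirroring the new loop shape: the loop body is a
 * plain table lookup with no conditional, and change detection is one
 * memcmp over the whole buffer afterwards. */
static int translate_deferred(const unsigned char *in, unsigned char *out,
                              size_t len, const unsigned char table[256])
{
    assert(in != NULL && out != NULL && table != NULL);
    for (size_t i = 0; i < len; i++) {
        out[i] = table[in[i]];
    }
    return memcmp(in, out, len) != 0;
}
```

Both helpers write the same output and return the same changed/unchanged answer. A caller that never needs that answer (the bytes-subclass case in the commit message) could skip the `memcmp` entirely, which the per-byte version cannot do without a second loop variant.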
