@tarcieri

Note: version bumped to v0.5.0-pre to denote breaking change (not for release)

Perhaps the first and foremost use case for a crate like this (or `subtle` or `ctutils`) is comparing byte slices in constant time. However, the existing codegen for this is bad because it goes a byte at a time, converting each byte to a `u32` or `u64`, then emitting predication instructions (or using bitwise masking) on each individual byte.

Instead, this removes the `CmovEq` impl for `[T]` and replaces it with an optimized impl of `CmovEq` for `[u8]`, reusing the code for the optimized `CmovEq` impl for arrays added in #1353.

This approach walks the slice in word-sized chunks, converting each chunk to a word-sized integer (`u32` or `u64`) and using the `CmovEq` impl on those types, which should result in much more efficient code.
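For readers who want a picture of the chunking idea, here is a rough, standalone sketch. It is not the code from this PR: `ct_eq_bytes_sketch` is a hypothetical name, the word size is fixed at `u64`, it accumulates differences with XOR/OR instead of calling the crate's `CmovEq` impl, and treating a length mismatch as "not equal" is an assumption made only for this example. Plain Rust like this also carries no guarantee that the compiler keeps it constant-time, which is the reason the crate relies on predication instructions.

```rust
/// Hypothetical sketch: compare two byte slices one 64-bit word at a time.
/// Illustrative only; not the crate's `CmovEq` implementation.
fn ct_eq_bytes_sketch(a: &[u8], b: &[u8]) -> bool {
    // Assumption for this sketch: lengths are public, so an early return is fine.
    if a.len() != b.len() {
        return false;
    }

    const WORD: usize = core::mem::size_of::<u64>();
    let mut diff: u64 = 0;

    let mut a_chunks = a.chunks_exact(WORD);
    let mut b_chunks = b.chunks_exact(WORD);

    // Full words: load each chunk into a u64 and accumulate any differing bits.
    for (x, y) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        let mut xb = [0u8; WORD];
        let mut yb = [0u8; WORD];
        xb.copy_from_slice(x);
        yb.copy_from_slice(y);
        diff |= u64::from_ne_bytes(xb) ^ u64::from_ne_bytes(yb);
    }

    // Tail: fewer than WORD bytes remain, so handle them one byte at a time.
    for (x, y) in a_chunks.remainder().iter().zip(b_chunks.remainder()) {
        diff |= u64::from(x ^ y);
    }

    diff == 0
}
```

With this shape, a 19-byte input costs two word comparisons plus three byte comparisons, rather than 19 per-byte operations.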

With this change, all of the slice chunking code is now in the `slice` module, which lets us move the vendored copies of `[T]::as_chunks(_mut)` there, get rid of a `utils` module, and rename it back to `macros` (though that's perhaps a misnomer as it contains only one macro).

A small change to the `Cmov` impl added in #1354: it panics if the input sizes aren't equal, using the same panic message as `copy_from_slice`.
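As an illustration of the length check, the sketch below shows a conditional move over byte slices that panics on mismatched lengths before touching any data. Everything here is hypothetical: `cmov_bytes_sketch`, its argument order, and the panic text are made up for the example, and the mask-based selection is a stand-in for the crate's `Cmov` impl, with no guarantee the compiler keeps it branch-free.

```rust
/// Hypothetical sketch: overwrite `dst` with `src` when `condition` is non-zero.
/// Illustrative only; not the crate's `Cmov` implementation.
fn cmov_bytes_sketch(dst: &mut [u8], src: &[u8], condition: u8) {
    // Mismatched lengths are a caller bug, so panic up front
    // (the wording of this message is an assumption for the sketch).
    assert_eq!(
        dst.len(),
        src.len(),
        "source slice length does not match destination slice length"
    );

    // 0x00 when condition == 0, 0xFF otherwise.
    let mask = 0u8.wrapping_sub((condition != 0) as u8);

    for (d, s) in dst.iter_mut().zip(src) {
        // Selects `*s` when mask is 0xFF and leaves `*d` unchanged when mask is 0x00.
        *d ^= mask & (*d ^ *s);
    }
}
```

Panicking rather than silently truncating mirrors how `copy_from_slice` treats mismatched lengths.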

@tarcieri tarcieri force-pushed the ctutils/optimized-cmoveq-for-byte-slices branch 5 times, most recently from c844f00 to e3d1c1d Compare January 16, 2026 18:06
@tarcieri tarcieri force-pushed the ctutils/optimized-cmoveq-for-byte-slices branch from e3d1c1d to b795962 Compare January 16, 2026 18:43
@tarcieri tarcieri merged commit 19e042a into master Jan 16, 2026
117 checks passed
@tarcieri tarcieri deleted the ctutils/optimized-cmoveq-for-byte-slices branch January 16, 2026 18:48