Skip to content

crypto: skcipher - add per-tfm data_unit_size for batched requests#853

Open
blktests-ci[bot] wants to merge 4 commits into
linus-master_basefrom
series/1097344=>linus-master
Open

crypto: skcipher - add per-tfm data_unit_size for batched requests#853
blktests-ci[bot] wants to merge 4 commits into
linus-master_basefrom
series/1097344=>linus-master

Conversation

@blktests-ci
Copy link
Copy Markdown

@blktests-ci blktests-ci Bot commented May 19, 2026

Pull request for series with
subject: crypto: skcipher - add per-tfm data_unit_size for batched requests
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1097344

Leonid Ravich added 4 commits May 19, 2026 12:09
Add a per-tfm data_unit_size and an algorithm capability flag that
together allow a caller to submit several data units in a single
skcipher request.  The IV passed in the request applies to the first
data unit; the algorithm advances the tweak between data units
according to the mode specification (e.g., LE128 multiply for XTS per
IEEE 1619).

This mirrors the data_unit_size concept already exposed by
struct blk_crypto_config for inline encryption hardware, but at the
software skcipher layer.  The first user is dm-crypt, which today
issues one request per sector and so pays a per-sector cost in
request allocation, IV generation, callback dispatch, and completion
handling.  Allowing the cipher to consume a whole bio per request
removes that overhead for drivers that can chain across data units
internally.

The data_unit_size lives on struct crypto_skcipher rather than on
struct skcipher_request because it does not change between requests
for any plausible consumer: dm-crypt picks one sector size per
mapped target at table load time; fscrypt would pick one per master
key.  Anchoring it to the tfm also lets the driver validate it once
at setkey() time and avoids per-request initialisation hazards on
mempool-recycled requests.

Capability is advertised with CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT
in cra_flags (type-specific high-byte range, mirroring the
CRYPTO_AHASH_ALG_* convention).  This makes the capability visible
in /proc/crypto and lets templates OR it into their derived
algorithms.

crypto_skcipher_set_data_unit_size() returns -EOPNOTSUPP if the
algorithm does not advertise the flag, and accepts 0 (the default)
unconditionally so callers can re-disable batching cheaply.

crypto_skcipher_encrypt()/decrypt() reject requests whose cryptlen
is not a multiple of the configured data_unit_size with -EINVAL.
The check is gated on data_unit_size != 0 so it costs nothing for
the common single-data-unit case.

No in-tree algorithm advertises the flag yet; subsequent patches
add the generic xts() template, arm64, and x86 producers as well
as the dm-crypt consumer.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
Teach the generic xts() template to consume cryptlen larger than one
data unit when the caller has configured a non-zero data_unit_size on
the tfm.  Each data unit is processed with its own IV, derived from
the caller-supplied IV by treating it as a 128-bit little-endian
counter and adding the data-unit index.  This matches the
sector-indexed XTS used by dm-crypt's plain64 IV mode and by typical
inline-encryption hardware.

The single-data-unit body is unchanged and is now reached via a thin
xts_crypt_multi() dispatcher that skips straight to the body when
data_unit_size is zero (the legacy default), so existing users see
no extra cost.

Advertise CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags only when
the inner cipher is synchronous.  An async inner cipher would require
a per-DU completion chain which is out of scope for the slow software
template; consumers that need multi-DU on async hardware will use one
of the arch-specific drivers added later in this series.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
Add a self-comparison test that runs whenever an skcipher algorithm
advertises CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags.  The test
encrypts the same random plaintext two ways:

  1. as one batched request with data_unit_size set, and
  2. as N back-to-back single-data-unit requests with IVs derived from
     the original IV by adding the data-unit index (treated as a
     128-bit little-endian counter, matching the convention documented
     in crypto_skcipher_set_data_unit_size()).

Both encrypts must produce byte-identical ciphertext, otherwise the
algorithm's multi-DU implementation is inconsistent with its single-DU
behaviour.  Iterates over a fixed set of typical data unit sizes
(512, 1024, 2048, 4096) which cover the dm-crypt sector-size range.

The test is gated on ivsize == 16 (XTS, the only multi-DU consumer in
the kernel today) and on the algorithm advertising the capability,
so it costs nothing for the existing fleet of skcipher drivers.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
When the underlying skcipher driver advertises support for multiple
data units in a single request (CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT),
configure the cipher with cc->sector_size as data_unit_size and
submit one request per bio instead of one request per sector.  This
removes per-sector overhead in the crypto API hot path: request
allocation, callback dispatch, completion handling, and SG setup.

The optimisation is enabled automatically at table load when all
of the following hold:

 - the cipher is non-aead (i.e. skcipher);
 - tfms_count is 1 (interleaved per-sector keys would break batching);
 - the IV mode is plain or plain64 (the only modes whose generator
   produces a sequential 64-bit little-endian counter that the cipher
   can extend by adding the data-unit index, matching the convention
   documented in crypto_skcipher_set_data_unit_size());
 - the iv_gen_ops->post() hook is unset (lmk and tcw use it; both are
   already excluded by the IV-mode test, but the explicit check makes
   the assumption durable against future IV modes);
 - dm-integrity is not stacked (no integrity tag or integrity IV);
 - the cipher driver advertises multi-data-unit support.

A new CRYPT_MULTI_DATA_UNIT cipher_flag, set once at construction
time, gates the multi-data-unit path.  The existing per-sector path
in crypt_convert_block_skcipher() is unchanged; the new
crypt_convert_block_skcipher_multi() is reached from a small dispatch
in crypt_convert() and shares the same backlog/-EBUSY/-EINPROGRESS
flow control with the per-sector path.

Heap-allocated scatterlists are stashed in dm_crypt_request and freed
in crypt_free_req_skcipher() to avoid races between the synchronous-
success free path and async-completion reuse from the request pool.
On -ENOMEM during scatterlist allocation, the bio is requeued via
BLK_STS_DEV_RESOURCE rather than failed, matching the behaviour of
the existing -ENOMEM path for crypto request allocation.

Verified end-to-end with a byte-equivalence test: encrypted output of
plain64 dm-crypt with the multi-data-unit path matches output of the
single-data-unit path bit-for-bit over a 256 MB device.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
@blktests-ci
Copy link
Copy Markdown
Author

blktests-ci Bot commented May 19, 2026

Upstream branch: 70eda68
series: https://patchwork.kernel.org/project/linux-block/list/?series=1097344
version: 1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants