crypto: skcipher - add per-tfm data_unit_size for batched requests #853
Open
blktests-ci[bot] wants to merge 4 commits into
added 4 commits
May 19, 2026 12:09
Add a per-tfm data_unit_size and an algorithm capability flag that together allow a caller to submit several data units in a single skcipher request. The IV passed in the request applies to the first data unit; the algorithm advances the tweak between data units according to the mode specification (e.g., LE128 multiply for XTS per IEEE 1619). This mirrors the data_unit_size concept already exposed by struct blk_crypto_config for inline encryption hardware, but at the software skcipher layer.

The first user is dm-crypt, which today issues one request per sector and so pays a per-sector cost in request allocation, IV generation, callback dispatch, and completion handling. Allowing the cipher to consume a whole bio per request removes that overhead for drivers that can chain across data units internally.

The data_unit_size lives on struct crypto_skcipher rather than on struct skcipher_request because it does not change between requests for any plausible consumer: dm-crypt picks one sector size per mapped target at table load time; fscrypt would pick one per master key. Anchoring it to the tfm also lets the driver validate it once at setkey() time and avoids per-request initialisation hazards on mempool-recycled requests.

Capability is advertised with CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags (type-specific high-byte range, mirroring the CRYPTO_AHASH_ALG_* convention). This makes the capability visible in /proc/crypto and lets templates OR it into their derived algorithms.

crypto_skcipher_set_data_unit_size() returns -EOPNOTSUPP if the algorithm does not advertise the flag, and accepts 0 (the default) unconditionally so callers can re-disable batching cheaply.

crypto_skcipher_encrypt()/decrypt() reject requests whose cryptlen is not a multiple of the configured data_unit_size with -EINVAL. The check is gated on data_unit_size != 0 so it costs nothing for the common single-data-unit case.

No in-tree algorithm advertises the flag yet; subsequent patches add the generic xts() template, arm64, and x86 producers as well as the dm-crypt consumer.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
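The setter and length-check rules described above can be modelled in user space. This is only a sketch under stated assumptions: `struct model_tfm`, the `model_*` functions, and the bit value of `MODEL_ALG_MULTI_DATA_UNIT` are hypothetical stand-ins, not the kernel definitions from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit value: the patch only states that the real flag,
 * CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT, sits in the type-specific high
 * byte of cra_flags. */
#define MODEL_ALG_MULTI_DATA_UNIT 0x01000000u

/* User-space stand-in for the fields of struct crypto_skcipher that
 * matter here (assumed layout, for illustration only). */
struct model_tfm {
	uint32_t cra_flags;
	unsigned int data_unit_size;
};

/* Mirrors the described setter: 0 is always accepted, a non-zero size
 * requires the capability flag. */
static int model_set_data_unit_size(struct model_tfm *tfm, unsigned int size)
{
	if (size == 0) {
		tfm->data_unit_size = 0;
		return 0;
	}
	if (!(tfm->cra_flags & MODEL_ALG_MULTI_DATA_UNIT))
		return -EOPNOTSUPP;
	tfm->data_unit_size = size;
	return 0;
}

/* Mirrors the described encrypt/decrypt entry check: only enforced
 * when batching is enabled on the tfm. */
static int model_check_cryptlen(const struct model_tfm *tfm,
				unsigned int cryptlen)
{
	if (tfm->data_unit_size != 0 &&
	    cryptlen % tfm->data_unit_size != 0)
		return -EINVAL;
	return 0;
}
```

With a 512-byte data unit configured, a 4096-byte request passes the check while a 4100-byte request is rejected; with the default of 0 any length passes.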
Teach the generic xts() template to consume cryptlen larger than one data unit when the caller has configured a non-zero data_unit_size on the tfm. Each data unit is processed with its own IV, derived from the caller-supplied IV by treating it as a 128-bit little-endian counter and adding the data-unit index. This matches the sector-indexed XTS used by dm-crypt's plain64 IV mode and by typical inline-encryption hardware.

The single-data-unit body is unchanged and is now reached via a thin xts_crypt_multi() dispatcher that skips straight to the body when data_unit_size is zero (the legacy default), so existing users see no extra cost.

Advertise CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags only when the inner cipher is synchronous. An async inner cipher would require a per-DU completion chain which is out of scope for the slow software template; consumers that need multi-DU on async hardware will use one of the arch-specific drivers added later in this series.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
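The per-data-unit IV derivation described above can be sketched as a byte-wise add with carry. The function name `model_du_iv` is hypothetical, and wrapping modulo 2^128 on overflow is an assumption of this sketch; the patch text does not say how overflow is handled.

```c
#include <stdint.h>

/* Compute out = base + index, treating the 16-byte IV as a 128-bit
 * little-endian counter (byte 0 is least significant).
 * Assumption: overflow past 128 bits silently wraps. */
static void model_du_iv(uint8_t out[16], const uint8_t base[16],
			uint64_t index)
{
	unsigned int carry = 0;

	for (int i = 0; i < 16; i++) {
		/* Only the low 8 bytes receive bits from the 64-bit index. */
		unsigned int add =
			(i < 8) ? (unsigned int)((index >> (8 * i)) & 0xff) : 0;
		unsigned int sum = base[i] + add + carry;

		out[i] = (uint8_t)sum;
		carry = sum >> 8;
	}
}
```

The carry propagates into the upper eight bytes, so a base IV whose low 64 bits are all ones advances to a value with byte 8 set to 1 when the index is 1.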
Add a self-comparison test that runs whenever an skcipher algorithm
advertises CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags. The test
encrypts the same random plaintext two ways:
1. as one batched request with data_unit_size set, and
2. as N back-to-back single-data-unit requests with IVs derived from
the original IV by adding the data-unit index (treated as a
128-bit little-endian counter, matching the convention documented
in crypto_skcipher_set_data_unit_size()).
Both encryptions must produce byte-identical ciphertext; otherwise the
algorithm's multi-DU implementation is inconsistent with its single-DU
behaviour. The test iterates over a fixed set of typical data unit sizes
(512, 1024, 2048, 4096), which cover the dm-crypt sector-size range.
The test is gated on ivsize == 16 (XTS, the only multi-DU consumer in
the kernel today) and on the algorithm advertising the capability,
so it costs nothing for the existing fleet of skcipher drivers.
Signed-off-by: Leonid Ravich <lravich@amazon.com>
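The shape of such a self-comparison can be sketched in user space. Everything here is hypothetical: `toy_single` is a trivial XOR stand-in and NOT XTS, `toy_batched` plays the role of a driver's multi-DU path, and `iv_add` repeats the 128-bit little-endian add so the block is self-contained. Only the structure of the check (one batched pass versus N single-unit passes, then a byte compare) follows the description above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* out = base + index as a 128-bit little-endian counter. */
static void iv_add(uint8_t out[16], const uint8_t base[16], uint64_t index)
{
	unsigned int carry = 0;

	for (int i = 0; i < 16; i++) {
		unsigned int add =
			(i < 8) ? (unsigned int)((index >> (8 * i)) & 0xff) : 0;
		unsigned int sum = base[i] + add + carry;

		out[i] = (uint8_t)sum;
		carry = sum >> 8;
	}
}

/* Toy single-data-unit "cipher" (not XTS): XOR with the IV bytes. */
static void toy_single(uint8_t *dst, const uint8_t *src, size_t len,
		       const uint8_t iv[16])
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ iv[i % 16];
}

/* Toy multi-DU path: walks the buffer one data unit at a time. */
static void toy_batched(uint8_t *dst, const uint8_t *src, size_t len,
			const uint8_t iv[16], size_t du)
{
	uint8_t unit_iv[16];

	for (size_t off = 0, n = 0; off < len; off += du, n++) {
		iv_add(unit_iv, iv, n);
		toy_single(dst + off, src + off, du, unit_iv);
	}
}

/* Returns 0 when the batched and per-unit outputs match, -1 on
 * mismatch, bad arguments, or allocation failure. */
static int multi_du_self_compare(const uint8_t *pt, size_t len,
				 const uint8_t iv[16], size_t du)
{
	uint8_t *a = malloc(len);
	uint8_t *b = malloc(len);
	uint8_t unit_iv[16];
	int ret = -1;

	if (!a || !b || du == 0 || len % du != 0)
		goto out;

	toy_batched(a, pt, len, iv, du);
	for (size_t n = 0; n < len / du; n++) {
		iv_add(unit_iv, iv, n);
		toy_single(b + n * du, pt + n * du, du, unit_iv);
	}
	ret = memcmp(a, b, len) ? -1 : 0;
out:
	free(a);
	free(b);
	return ret;
}
```

A caller would run `multi_du_self_compare()` once per data unit size of interest over the same plaintext and IV.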
When the underlying skcipher driver advertises support for multiple data units in a single request (CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT), configure the cipher with cc->sector_size as data_unit_size and submit one request per bio instead of one request per sector. This removes per-sector overhead in the crypto API hot path: request allocation, callback dispatch, completion handling, and SG setup.

The optimisation is enabled automatically at table load when all of the following hold:
- the cipher is non-aead (i.e. skcipher);
- tfms_count is 1 (interleaved per-sector keys would break batching);
- the IV mode is plain or plain64 (the only modes whose generator produces a sequential 64-bit little-endian counter that the cipher can extend by adding the data-unit index, matching the convention documented in crypto_skcipher_set_data_unit_size());
- the iv_gen_ops->post() hook is unset (lmk and tcw use it; both are already excluded by the IV-mode test, but the explicit check makes the assumption durable against future IV modes);
- dm-integrity is not stacked (no integrity tag or integrity IV);
- the cipher driver advertises multi-data-unit support.

A new CRYPT_MULTI_DATA_UNIT cipher_flag, set once at construction time, gates the multi-data-unit path. The existing per-sector path in crypt_convert_block_skcipher() is unchanged; the new crypt_convert_block_skcipher_multi() is reached from a small dispatch in crypt_convert() and shares the same backlog/-EBUSY/-EINPROGRESS flow control with the per-sector path.

Heap-allocated scatterlists are stashed in dm_crypt_request and freed in crypt_free_req_skcipher() to avoid races between the synchronous-success free path and async-completion reuse from the request pool. On -ENOMEM during scatterlist allocation, the bio is requeued via BLK_STS_DEV_RESOURCE rather than failed, matching the behaviour of the existing -ENOMEM path for crypto request allocation.

Verified end-to-end with a byte-equivalence test: encrypted output of plain64 dm-crypt with the multi-data-unit path matches output of the single-data-unit path bit-for-bit over a 256 MB device.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
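The table-load conditions listed in this commit message can be expressed as a single predicate. This is a user-space sketch, not the dm-crypt code: `struct model_crypt_cfg` and its fields are hypothetical names that flatten the relevant state into plain booleans.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical snapshot of the dm-crypt state consulted at table load. */
struct model_crypt_cfg {
	bool is_aead;              /* cipher is an AEAD, not an skcipher   */
	unsigned int tfms_count;   /* number of interleaved tfms/keys      */
	const char *ivmode;        /* e.g. "plain", "plain64", "essiv"     */
	bool has_iv_post_hook;     /* iv_gen_ops->post() is set            */
	bool integrity_stacked;    /* dm-integrity tag or integrity IV     */
	bool alg_multi_data_unit;  /* driver advertises the capability     */
};

/* True only when every condition from the commit message holds. */
static bool model_can_use_multi_data_unit(const struct model_crypt_cfg *c)
{
	if (c->is_aead)
		return false;
	if (c->tfms_count != 1)
		return false;
	if (!c->ivmode ||
	    (strcmp(c->ivmode, "plain") != 0 &&
	     strcmp(c->ivmode, "plain64") != 0))
		return false;
	if (c->has_iv_post_hook)
		return false;
	if (c->integrity_stacked)
		return false;
	return c->alg_multi_data_unit;
}
```

Keeping the post-hook test separate from the IV-mode test reflects the stated intent that the check should stay valid if a future IV mode adds a post() hook.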
Upstream branch: 70eda68
Pull request for series with
subject: crypto: skcipher - add per-tfm data_unit_size for batched requests
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1097344