Fix integer overflow vulnerability in BalanceTrajectory and ComputeSmall#1075
Fix integer overflow vulnerability in BalanceTrajectory and ComputeSmall#1075mhucka wants to merge 5 commits into
BalanceTrajectory and ComputeSmall#1075Conversation
BalanceTrajectoryBalanceTrajectory and ComputeSmall
There was a problem hiding this comment.
Code Review
This pull request updates the calculation of sample repetitions and trajectory balancing by switching from int to uint64_t to prevent potential overflows. Feedback indicates that using uint64_t introduces a risk of Denial of Service (DoS) if num_samples is negative, as the unsigned cast would create an extremely large loop count. It is recommended to use int64_t for these calculations and to implement validation to ensure num_samples is non-negative.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request improves input validation in tfq_noisy_samples.cc by replacing debug assertions with runtime checks for the number of inputs and adding a range check for num_samples. However, the review feedback indicates that the current upper bound check for num_samples is insufficient to prevent integer overflows in downstream calculations involving thread counts and circuit sizes. It is recommended to use int64_t for intermediate calculations, apply a safer ceiling division formula, or enforce a more conservative upper bound on num_samples to protect all downstream 32-bit calculations.
| OP_REQUIRES_OK(context, GetIndividualSample(context, &num_samples)); | ||
| OP_REQUIRES( | ||
| context, | ||
| num_samples >= 0 && num_samples < std::numeric_limits<int>::max(), |
There was a problem hiding this comment.
The check num_samples < std::numeric_limits<int>::max() is insufficient to prevent the integer overflow vulnerability described in the pull request. The expression (num_samples + num_threads - 1) / num_threads (used in ComputeSmall at line 241 and in BalanceTrajectory) can still overflow if num_samples is large but less than INT_MAX. For example, if num_threads is 128, any num_samples > INT_MAX - 127 will cause the addition to overflow into a negative value, leading to the memory leakage issue described.
Furthermore, other calculations such as 2 * num_samples * ncircuits.size() + 2 (line 167) and 4 * (num_samples * ncircuits.size() + num_threads) / num_threads (line 261) are also susceptible to overflow if num_samples is large.
A more robust fix would be to:
- Use
int64_tfor intermediate calculations involvingnum_samplesinComputeSmallandBalanceTrajectory. - Use a safer ceiling division formula:
num_samples / num_threads + (num_samples % num_threads != 0)which avoids the overflow-prone addition. - Apply a more conservative upper bound on
num_samplesat this check (e.g.,INT_MAX / 2) to protect all downstream 32-bit calculations.
Internal Google security scans flagged a memory access vulnerability from a potential integer overflow (b/510431411). The vulnerability originates from the calculation
(num_samples + num_threads - 1) / num_threadsin bothBalanceTrajectory(util_balance_trajectory.cc:78) andComputeSmall(tfq_noisy_samples.cc:231).num_samplesis anint32provided by the user. Providing a value close toINT_MAXoverflows the addition into a negative number when thread counts are added. The resulting negativep_repstrickles down toneeded_samples <= 0, bypassing the core generation loop. Sincetensorflow::allocate_outputmaps raw heap buffers without automatic initialization for numerical outputs, returning it unmodified results in a direct leakage of the uninitialized data.This is admittedly unlikely to happen in realistic usage scenarios, but who knows, maybe someone could create a malicious scenario to force it.
The approach taken here is simply to do a bounds check on the value of
num_samples.