Add self-merge check to Message::MergeFrom#26742
Add self-merge check to Message::MergeFrom#26742rep0z-debug wants to merge 5 commits intoprotocolbuffers:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
We currently have debug checks for this, which I don't think we should upgrade to As for checking Unless we have a strong reason to support |
|
I understand the performance concern, but I can demonstrate that this causes memory safety issues in release builds. CopyFrom already has this check - shouldn't MergeFrom have similar protection? A single pointer comparison seems like reasonable defensive programming. |
|
My point was mainly that returning early on The philosophy of a lot of Protobuf code is to verify correctness with debug checks, but don't enable those checks in release builds. Protobuf assumes your code is well-formed for release builds, which allows it to skip many otherwise redundant checks. Ideally applications should have robust integration testing with these debug checks enabled, which allows you to catch malformed code before it hits production. If you would prefer to have debug checks enabled for release builds, you can build the protobuf library with |
|
I understand, but self-aliasing can happen accidentally in production (e.g., through reference passing, container operations). When it does, the result is memory corruption, not just incorrect behavior. Other Google projects have moved toward more defensive programming for memory safety. Would there not be any middle ground here? For example, upgrading the existing DCHECK to CHECK would catch this without adding new code or branches to the happy path. Also CopyFrom already has this check - wouldn't parity between CopyFrom and MergeFrom make sense? |
|
Code outside of Protobuf which may, even unintentionally, call self-merge is malformed code. Protobuf assumes your code is well-formed, and on hot code paths (which this is), we have decided it is not worth the performance cost to verify again that your code is indeed well-formed. Integration tests are meant to catch these problems. Upgrading to a If you have code which is prone to accidental aliasing, you could have your own runtime |
|
I understand your position, though I respectfully disagree with the tradeoff being made here. Self-aliasing through reference passing, container operations, and callback patterns is a realistic production scenario, not just theoretical misuse. When it occurs, the result is exploitable memory corruption. I believe a CHECK (not supporting self-merge semantics, just failing safely) is the appropriate middle ground between performance and memory safety. The performance cost of a correctly-predicted branch is negligible compared to the security and debugging costs of heap corruption in production. That said, I understand this comes down to project philosophy and you've made the decision clear. I'll close this PR, but I stand by the assessment that this represents a memory safety issue that should be addressed. Thank you for your time reviewing this. |
MergeFrom lacks the self-aliasing check that CopyFrom has (message.cc:375).
In release builds,
msg.MergeFrom(msg)causes heap corruption:This patch:
if (&from == this) return;to Message::MergeFrom (matching CopyFrom)if (&other == this) return;to MessageLite::CheckTypeAndMergeFrom$DCHK$to$CHK$in the generated MergeImpl template