Skip to content

Fix #581: Add CRC32 checksum verification for pushdata/fetchdata data channels#887

Open
OmDoshi13 wants to merge 3 commits into
eBay:stable/v7.xfrom
OmDoshi13:fix/checksum-pushdata-fetchdata-581
Open

Fix #581: Add CRC32 checksum verification for pushdata/fetchdata data channels#887
OmDoshi13 wants to merge 3 commits into
eBay:stable/v7.xfrom
OmDoshi13:fix/checksum-pushdata-fetchdata-581

Conversation

@OmDoshi13
Copy link
Copy Markdown

@OmDoshi13 OmDoshi13 commented May 27, 2026

Summary

Fixes #581 — silent data corruption caused by network-layer corruption between leader and followers going undetected until user read/write time.

  • Root cause: Data left the leader intact but could be corrupted in the network layer. Followers accepted and stored whatever arrived; users only discovered corruption at read/write time.
  • Fix: Attach a CRC32 checksum to every data packet sent over the PushData and FetchData channels. The receiver recomputes the checksum and drops the packet on mismatch, letting Raft retry rather than writing corrupted bytes to disk.

Key implementation details

  • CRC32 computed over the full data payload at send time; verified before any write on the follower
  • FetchData response uses a self-describing magic-byte framing header (FETCH_DATA_RESPONSE_MAGIC) so receivers detect the header without consulting per-node config — safe under hotswap config asymmetry during rolling upgrades
  • data_checksum_enabled defaults to false; operators enable via hotswap after all nodes are running the new binary to avoid old nodes misinterpreting the framing header as block data
  • Added data_checksum_mismatch_cnt metric for observability
  • FlatBuffer verifier added on the receive path to guard against malformed headers
  • Per-entry lsn/dsn/raft_term attached to ResponseEntry for richer diagnostics

Correctness fixes (commit 3)

Three issues identified during review and fixed before merge:

  • F1 — Immediate re-fetch on FetchData checksum mismatch: Previously a mismatch silently dropped the entry and waited for the full data_receive_timeout_ms (10s) before Raft retried. Now mismatched rreqs are collected during the response loop and passed to check_and_fetch_remote_data() immediately after, eliminating the stall.
  • F2 — Unsigned underflow guard in handle_fetch_data_response: A truncated or malformed response could cause total_size -= data_size to wrap to ~4GB and advance raw_data past the buffer boundary in release builds. Added an explicit data_size > total_size guard with early return before each subtraction.
  • F3 — FlatBuffer verifier on PushData receive path: push_req fields were accessed before the buffer was verified in on_push_data_received. A corrupted or crafted push RPC could cause out-of-bounds reads via the unverified FlatBuffer accessor. Added flatbuffers::Verifier call before any field access, matching the pattern already present on the FetchData response path.

Test plan

  • Checksum_Enabled_PushData_Path — happy path with checksums on; writes commit correctly across all replicas
  • Checksum_Enabled_FetchData_Path — drops all push-data on followers forcing fetch path; verifies framing header is parsed correctly and data arrives intact
  • Existing Follower_Fetch_OnActive_ReplicaGroup still passes (no regression)

odoshi13 and others added 2 commits May 25, 2026 13:00
…ta channels (eBay#581)

During SM long-running tests, followers observed hash mismatches and invalid
headers during read-verify. Root cause: PushData and FetchData channels
transmitted raw payloads with no integrity check, allowing in-flight corruption
to be silently written to disk.

Changes:
- Add checksum: uint32 to PushDataRequest and ResponseEntry FlatBuffer schemas
- Add data_checksum_enabled: bool = true (hotswap) to Consensus config so the
  overhead can be toggled off at runtime for performance benchmarking
- Compute CRC32 over the data payload before sending on both push and fetch paths
- Verify CRC32 on receipt; drop and let retry on mismatch rather than writing
  corrupt data to disk
- FetchData response now optionally prepends a FetchDataResponse FlatBuffer
  header carrying per-entry checksums, gated on data_checksum_enabled to ensure
  backward compatibility during rolling upgrades

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Switch framing detection to magic-byte prefix (FETCH_DATA_RESPONSE_MAGIC)
  so receivers self-describe the header without consulting per-node config,
  making detection safe under hotswap config asymmetry between nodes
- Default data_checksum_enabled to false; operators enable via hotswap after
  all nodes are running the new binary to avoid old nodes misinterpreting
  the framing header as block data
- Add FlatBuffer verifier on the receive path to guard against malformed headers
- Fix header memory management: heap-allocate with new/delete[] instead of iobuf
- Attach per-entry lsn/dsn/raft_term to ResponseEntry for richer diagnostics
- Increment data_checksum_mismatch_cnt metric on every mismatch
- Add Checksum_Enabled_PushData_Path and Checksum_Enabled_FetchData_Path tests

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@OmDoshi13 OmDoshi13 changed the base branch from master to stable/v7.x May 27, 2026 12:21
F1: On FetchData checksum mismatch, immediately re-fetch via
check_and_fetch_remote_data() instead of waiting for the full
data_receive_timeout_ms (10s) stall. Mismatched rreqs are collected
in the response loop and re-queued after processing the valid entries.

F2: Guard against total_size unsigned underflow before each
total_size -= data_size subtraction in handle_fetch_data_response.
A truncated or malformed response could wrap total_size to ~4GB and
advance raw_data past the blob boundary in release builds.

F3: Add flatbuffers::Verifier call in on_push_data_received before
any PushDataRequest field is accessed. Without this, a corrupted or
crafted push RPC could cause out-of-bounds reads via the unverified
FlatBuffer accessor, matching the verification already present on the
FetchData response path.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@OmDoshi13 OmDoshi13 marked this pull request as draft May 27, 2026 12:26
@OmDoshi13 OmDoshi13 marked this pull request as ready for review May 27, 2026 12:28
Copy link
Copy Markdown
Collaborator

@xiaoxichen xiaoxichen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OmDoshi13 thanks for taking this and helping with the tech debt, It looks nice in general.

I think your changes can be categorised into 3 parts
A: Corner cases fixes, mainly in your commit 3.
B: Moving FetchDataResponse from raw to Flatbuffer
C: CRC implementation.

I dont have any problem with A and hoping A can be done in a separate PR to allow it be merged sooner.

For B, I saw you added a FETCH_DATA_RESPONSE_MAGIC as well as corresponding logic to support rolling upgrade and hybrid version, that is nice. And yes, It is totally a mistake in the past that skipping flatbuffer instead using raw format...

however, using the magic header is another dead-end, as we have to carry this "compatibility" tax forever, it doesnt help us in next release that removing the magic header and merge back to the native flat-buffer approach.

With this thinking, do you think we can directly using FlatBuffer w/o MAGIC, as flatbuffer is size-prefixed, we can read the first size_bytes, and check if that number makes sense , then proceed to flatbuffer::verify, if that is also passing it is very likely it is new format, then we check CRCs of raw data against what was in flatbuffer.

Basically I am trying to suggest try-and-fallback pattern.Note the raw data should be a homeobject blobs/shards where has a DataHeader with magic uint64_t magic{0x21fdffdba8d68fc6};, , that makes the flatbuffer prefixed size to be 2.6GB, which will very probably fails the basic size check.

    struct DataHeader {
        static constexpr uint8_t data_header_version = 0x02;
        static constexpr uint64_t data_header_magic = 0x21fdffdba8d68fc6; // echo "BlobHeader" | md5sum

        enum class data_type_t : uint32_t { SHARD_INFO = 1, BLOB_INFO = 2 };

        bool valid() const { return ((magic == data_header_magic) && (version <= data_header_version)); }

        uint64_t magic{data_header_magic};
        uint8_t version{data_header_version};
        data_type_t type{data_type_t::BLOB_INFO};
    };

@xiaoxichen
Copy link
Copy Markdown
Collaborator

xiaoxichen commented May 28, 2026

For C:
Lets try to separate the enablement of CRC with the wire-format changes. If in B we moved to Flatbuffer, then checksum would have a default value of zero if not enabled, we probably want to skip the CRC checking if CRC field is zero.

A bit more context on the CRC: once we added length check, we never saw data corruption on wire anymore. Basically it is hard to see bit rot in TCP protocol as TCP protocol already provides its CRC in TCP-Header. Usually the error was caused by early terminating --- sender disconnected before sending full data, receiver received partial data and blindly using it without checking the prefix-size.

With this assumption and observation in production , I think we wont enable the CRC for generic use cases, only use in more durability sensitive environment if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add checksum for pushdata/fetchdata

3 participants