Expand vector-buffers property model to include crashes, writebacks#25639
Draft
blt wants to merge 1 commit into
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This commit expands the property model for disk buffers v2, taking inspiration from the paper [All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications](https://www.usenix.org/conference/osdi14/technical-sessions/presentation/pillai).

I have expanded `Action` to have two new variants, `Writeback` and `Crash`. `Crash` is the simpler of the two: it causes the PBT to simulate a restart of the buffer. The filesystem is updated to have a notion of 'durable' writes; writes that are not durable are dropped after a crash action and restart. `Writeback` lets us simulate OS non-determinism around when mmaps are flushed to disk, directories are synced, etc., without explicit syscalls to force that. As in the ALICE paper, the PBT filesystem is also updated to have a notion of atomicity in writes, either block-sized or sector-sized.

The model remains ideal, that is, it is built such that all writes are durable once a writer ack is sent backward. Because the SUT does not behave this way, the test now fails, and the failure is easy to demonstrate. A minimal failure sequence:

```
[
    WriteRecord(Record { id: 0, size: 0, event_count: 1, encoded_len: 12, archived_len: 64, .. }),
    Crash,
    FlushWrites,
]
```

Currently disk buffers v2 are not crash safe. This was flagged by our antithesis test introduced in PR #25562 and is reproduced here as well, although those faults are hidden behind the cheaper-to-reproduce failure above.

I believe we are missing a few things in the SUT:

* synchronization of directory changes: file names do not become durable until synced
* _either_ bulk commits on the flush interval, with ack handlers not releasing until that commit, _or_ dropping the flush interval entirely so that acks == sync, which is very slow.
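To make the durable/non-durable distinction concrete, here is a minimal sketch of how a test filesystem might react to `Writeback` and `Crash`. It is not the code in this PR: `ModelFs`, the `Write` variant, and the `durable`/`pending` fields are hypothetical names chosen for illustration, and the sketch deliberately ignores write atomicity (block or sector sized) and directory syncing.

```rust
use std::collections::BTreeMap;

/// Hypothetical action set. Only `Writeback` and `Crash` are named in the
/// PR description; `Write` is an illustrative stand-in for buffer writes.
#[derive(Debug, Clone)]
enum Action {
    Write { path: String, data: Vec<u8> },
    Writeback,
    Crash,
}

/// Hypothetical test filesystem that separates durable state from
/// state that has been written but not yet flushed by the OS.
#[derive(Debug, Default)]
struct ModelFs {
    /// Contents that survive a crash.
    durable: BTreeMap<String, Vec<u8>>,
    /// Contents visible to the running process but lost on crash.
    pending: BTreeMap<String, Vec<u8>>,
}

impl ModelFs {
    fn apply(&mut self, action: &Action) {
        match action {
            Action::Write { path, data } => {
                // Appends start from what the process can currently see,
                // and the result stays non-durable until a writeback.
                let mut contents = self
                    .read(path)
                    .map(|bytes| bytes.to_vec())
                    .unwrap_or_default();
                contents.extend_from_slice(data);
                self.pending.insert(path.clone(), contents);
            }
            Action::Writeback => {
                // The OS flushed on its own: everything pending is durable.
                let pending = std::mem::take(&mut self.pending);
                self.durable.extend(pending);
            }
            Action::Crash => {
                // Restart: anything that never became durable is gone.
                self.pending.clear();
            }
        }
    }

    /// What a reader sees: pending contents shadow durable ones.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.pending
            .get(path)
            .or_else(|| self.durable.get(path))
            .map(|bytes| bytes.as_slice())
    }
}
```

Under these assumptions, a `Write` followed directly by `Crash` leaves nothing behind, which mirrors the shape of the minimal failure sequence above: the ideal model treats an acked write as durable, while a filesystem like this one drops it unless a writeback happened first.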
