This repository was archived by the owner on Jul 19, 2018. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 165
Tobin synch event model #2026
Closed
Closed
Tobin synch event model #2026
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is the initial version of a single-case update to handle synchronization tracking for the cmd sequence from GH892. As the code stands it's only primarily useful for this single cmd sequence and isn't catching any variety of memory conflicts. The intention is that if the general idea is sound then I will build on this initial code to expand/generalize the algorithms used here for all memory conflict cases.
This is a simple test positive case based on GH892. For this initial test the pipeline barrier is correctly in place so no errors should be signalled. This is the basic case that will be used to generate some more test cases, both positive and negative.
This is a negative test featuring a RaW dependency with no synch object in-between to regulate the read.
Added a look-up-table for fixed pipe stage and access masks per cmd. Added some functions to generalize adding/validating memory accesses per command and migrated the checks into buffer_validation files. Added CmdCopyImage tracking. Initially treating all image accesses conservatively and assuming they touch full size of image memory. To deal with this and further-such issues that will arise I updated MemoryAccess to have a "precise" bool that indicates when Access details are exact. Only when both Accesses are precise can we issue an error, otherwise just issue a warning.
Integrate CmdCopyBuffer into MemoryAccess recording and validation framework. Refactored recording of read/write memory accesses to use the binding which saves some repeated lines of code.
Update synchronization modeling to handle barrier dependency chains. Initially only handling CmdPipelineBarriers. Create separate synch_commands vector in cmd buffer for quick parsing of just synch commands. For any previous synch commands whose destination synch scope overlaps the current synch command's source synch scope, merge the 1st cmds dest mask into current commands src mask and pull previous barriers into the current synch command. Updated existing PipelineBarrier algorithm for global and buffer barriers to make sure that pipeline masks overlap and only check access masks in that case
Adding positive test UpdateBufferWaRDependencyWithTwoBarrierChain. This features a WaR dependency that's cleared by a chain of two pipeline barriers. This confirms that validation correctly supports the merging of overlapping barriers to correctly account for synchronization chains.
Add tracking for the memory accesses in CmdBlitImage. Initially just adding a read & write access for the src & dest image memory that conservatively references whole binding and is recorded as imprecise so that it only will cause warnings.
Add tracking for the memory access in CmdFillBuffer. This is a single write to a buffer for which we have precise info.
Add tracking for the memory accesses in CmdClear[DS|Color]Image. Initially just adding imprecise write access covering the whole image binding area.
Update CmdClearAttachments to record a write memory access per image attachment that's cleared. This is an imprecise access across the entire image binding for now.
Update global mem access state when CmdClearAttachments is called.
Add tracking for the memory accesses in CmdResolveImage. Initially just adding a read & write access for the src & dest image memory that conservatively references whole binding and is recorded as imprecise so that it only will cause warnings.
Just moving code to prep for adding mem access support to CmdCopyQueryPoolResults. Refactor it to use the common PreValidate* & PostRecord* pattern.
Add the buffer write access for CmdCopyQueryPoolResults. Updated query pool data struct to store number of elements in the query based on query type and then use that number when calculating total data size.
Update the Draw/Dispatch validate/update framework to account for all memory accesses through descriptors. At validate time grab the complete set of active read and write memory bindings from descriptors. Then verify the memory accesses to flag any conflicts. During state update, record all of the accesses into the cmd buffer state. All of these accesses are imprecise and will only result in warnings if there is a potential synch issue.
A PipelineBarrier command that occurs within a renderPass is only for handling a subpass self-dependency. That case is not currently included in the synch model so updating code to only record barriers that are outside of a renderPass for now.
Moved existing code into RecordBarrierMemoryAccess() function. This generalizes to code so it can more easily be multi-purposed in order to record barriers for CmdWaitEvents().
This is a race condition test to check both a RaW and WaR conflict with buffers used in a Draw. A src buffer is copied to a dst buffer, then, without a barrier, the src is used as a storage buffer in a draw and the dst is used as a uniform buffer in the same draw. This results in a RaW conflict for the Dst/Uniform buffer and WaR conflict for the Src/Storage buffer.
Make any memory access race condition conflict messages warning-level for initial release. There are still some holes in modeling the synch primitives in validation so setting callback to warning allows for a soft release where people can ignore or post bugs on incorrect warnings while the synch model is updated in validation. The current warning message is meant to deter developers from sinking too much time into debugging these warnings, and promotes feedback for known-bad warning cases.
Record barriers in CmdWaitEvents using same method used for barriers from CmdPipelineBarriers. This involves merging any previous barriers to handle dependency chains and then recording the barriers into any previous memory accesses that fall into the source synch scope.
Modify ValidateMemoryAccesses() so that it takes map of prev accesses directly. This is preparation to share the function with command buffer submission validation where we'll deal with sets of initial accesses that may cross many command buffers.
At QueueSubmit time, check for any memory conflicts between command buffers in a single submit. Here's the high-level algorithm: 1. While submitting CBs, store vector of CB state for each CB 2. For each CB, create vector of mem accesses that are "live" meaning that no barrier within the CB made them visible 3. Check live vector against all other live mem accesses up to this point in submit. 3a. If there are no potential conflicts, save live access vector into submit access map and continue with next submit 3b. If there are potential conflicts, we need to replay commands up to this point so continue to step 4. 4. Gather all cmds submitted up to this point in a single vector 5. Note any synch commands that occur between the early and late mem access conflict commands 6. Replay mem access commands, including synch commands between the points of interest 7. Following replay, re-check for conflicts and if they still occur, flag an error Also update the race condition warning message between two separate command buffers with additional information about second cmd buffer.
Added TwoCommandBufferUpdateBufferRaWDependencyMissingBarrier test that is an alternate version of UpdateBufferRaWDependencyMissingBarrier test that separates the CmdUpdateBuffer and CmdDrawIndirect commands across two separate command buffers and verifies that the RaW race condition is flagged.
Merging synch barriers may be a bit naive. Note potential bug for future fix.
Add MultiCBDrawWithBufferWaRandRaWSynchChain test. This features three command buffers with mem access dependencies and synch commands that cross CB boundaries. This is a positive test to verify that no errors occur.
Mis-match between size_t and uint32_t was causing some Windows compiler warnings. Use size_t uniformly where vector size() function involved.
Anytime we need to replay cmds during queue submit, we should start with an empty replay cmd vector, since replay always starts from the beginning of a submission.
Collaborator
|
As mentioned offline, this has brutal perf impact. Frame times in |
Contributor
Author
|
Replaced by #2049 |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is building on and replacing #2005
This includes further commits to handle inter-CB memory access conflicts.