Opt rdma flags #3300
Open
randomkang wants to merge 2 commits into
Contributor
Pull request overview
This PR aims to improve RDMA throughput by adjusting the access flags used during memory registration, enabling relaxed ordering to reduce ordering overhead in the data path.
Changes:
- Update the dynamically-loaded `ibv_reg_mr` function pointer signature to take an `int` access mask.
- Register RDMA memory with `IBV_ACCESS_RELAXED_ORDERING` in both the internal block-pool path and the public `RegisterMemoryForRdma` API.
// Register the memory as callback in block_pool
// The thread-safety should be guaranteed by the caller
- ibv_mr* mr = IbvRegMr(g_pd, buf, size, IBV_ACCESS_LOCAL_WRITE);
+ ibv_mr* mr = IbvRegMr(g_pd, buf, size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_RELAXED_ORDERING);
Contributor
Some hardware devices may not support IBV_ACCESS_RELAXED_ORDERING. Could a runtime checking mechanism be implemented?
Contributor
If mr = IbvRegMr(g_pd, buf, size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_RELAXED_ORDERING); fails,
then fall back to mr = IbvRegMr(g_pd, buf, size, IBV_ACCESS_LOCAL_WRITE);
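The try-then-fall-back idea from the two comments above could look roughly like the sketch below. It is only an illustration, not the PR's code: `RegisterFn`, `RegisterResult`, `RegisterWithFallback`, and the two `k...` constants are hypothetical stand-ins so the logic can be read and tested without libibverbs. Real code would call `IbvRegMr` with `g_pd` and the flag values from `<infiniband/verbs.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Stand-in flag values for illustration only. Real code would use
// IBV_ACCESS_LOCAL_WRITE and IBV_ACCESS_RELAXED_ORDERING from
// <infiniband/verbs.h> instead of these constants.
constexpr int kLocalWrite = 1;
constexpr int kRelaxedOrdering = 1 << 20;

// Hypothetical registration callback shaped like IbvRegMr without the
// protection domain: returns a non-null handle on success, nullptr on failure.
using RegisterFn = std::function<void*(void* buf, size_t len, int access)>;

struct RegisterResult {
    void* mr;      // nullptr if both attempts failed
    bool relaxed;  // true if the relaxed-ordering attempt succeeded
};

// Try to register with relaxed ordering first. If that attempt fails,
// retry once with the plain local-write flag only.
RegisterResult RegisterWithFallback(const RegisterFn& reg, void* buf, size_t len) {
    assert(reg);
    void* mr = reg(buf, len, kLocalWrite | kRelaxedOrdering);
    if (mr != nullptr) {
        return {mr, true};
    }
    mr = reg(buf, len, kLocalWrite);
    return {mr, false};
}
```

Returning whether the relaxed flag was accepted lets the caller log the fallback once instead of failing silently. Whether a given device or driver actually rejects the flag, rather than ignoring it, is an assumption of this sketch and would need to be checked against the target hardware.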
uint32_t RegisterMemoryForRdma(void* buf, size_t len) {
- ibv_mr* mr = IbvRegMr(g_pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
+ ibv_mr* mr = IbvRegMr(g_pd, buf, len, IBV_ACCESS_LOCAL_WRITE |IBV_ACCESS_RELAXED_ORDERING);
What problem does this PR solve?
Optimize RDMA performance: the bandwidth of a single QP can be increased from 7 GB/s to 15 GB/s.
What is changed and the side effects?
Changed:
Side effects:
Performance effects:
Breaking backward compatibility:
Check List: