Document try_enqueue implicit producer index limit (#418)#446
Document try_enqueue implicit producer index limit (#418)#446PavelGuzenfeld wants to merge 2 commits intocameron314:masterfrom
Conversation
911509e to
1a7c45f
Compare
|
Hi @cameron314, just a friendly ping on this PR. Would you have a chance to take a look when convenient? Happy to address any feedback. Thank you! |
cameron314
left a comment
There was a problem hiding this comment.
Thanks for opening this PR. This is definitely a gotcha that's bitten quite a few people over the years and improving the documentation around it is definitely worth doing.
tests/unittests/unittests.cpp
Outdated
| } | ||
|
|
||
| { | ||
| // Fix #1: enqueue() (which can allocate) works past the limit |
There was a problem hiding this comment.
To me, this comment reads as if this is testing a fix in the code, whereas it's actually demonstrating a workaround to the API behaving as designed.
| // Fix #1: enqueue() (which can allocate) works past the limit | |
| // Workaround #1: enqueue() (which can allocate) can grow the index |
tests/unittests/unittests.cpp
Outdated
| } | ||
|
|
||
| { | ||
| // Fix #2: larger IMPLICIT_INITIAL_INDEX_SIZE allows more try_enqueue calls |
There was a problem hiding this comment.
| // Fix #2: larger IMPLICIT_INITIAL_INDEX_SIZE allows more try_enqueue calls | |
| // Workaround #2: larger IMPLICIT_INITIAL_INDEX_SIZE allows more try_enqueue calls |
concurrentqueue.h
Outdated
| // Note: This controls the maximum number of elements that can be enqueued by a | ||
| // single implicit producer when using try_enqueue (which does not allocate). | ||
| // The limit is BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE elements; beyond that, | ||
| // the block index needs to grow, which requires allocation, causing try_enqueue | ||
| // to fail. Increase this value (or use enqueue(), which can allocate) if you need | ||
| // more capacity per implicit producer. See also issue #418. |
There was a problem hiding this comment.
This is one limiting factor, but not the only one (BLOCK_SIZE plays a role too, and the number of blocks preallocated, and it's possible to mix try_enqueue and enqueue).
I'm also not sure I like stating BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE elements, since blocks are not necessarily full: if all blocks are filled up to this many elements, then one element is dequeued, a new element still can't be enqueued until the entire first block is emptied.
Also not sure I like referencing github issues in non-test code.
How about:
| // Note: This controls the maximum number of elements that can be enqueued by a | |
| // single implicit producer when using try_enqueue (which does not allocate). | |
| // The limit is BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE elements; beyond that, | |
| // the block index needs to grow, which requires allocation, causing try_enqueue | |
| // to fail. Increase this value (or use enqueue(), which can allocate) if you need | |
| // more capacity per implicit producer. See also issue #418. | |
| // Note: This impacts the maximum number of elements that can be enqueued by a | |
| // single implicit producer when using try_enqueue/try_enqeue_bulk exclusively (which | |
| // cannot allocate), since it limits the number of blocks that the producer can hold to | |
| // store elements. When pre-allocating blocks for use with try-enqueueing, configure | |
| // this initial size to the desired maximum number of blocks per implicit producer. | |
| // Alternately, use the regular enqueue methods, which can grow the index as needed. |
| // memory barrier). | ||
| explicit BlockingConcurrentQueue(size_t capacity = 6 * BLOCK_SIZE) | ||
| : inner(capacity), sema(create<LightweightSemaphore, ssize_t, int>(0, (int)Traits::MAX_SEMA_SPINS), &BlockingConcurrentQueue::template destroy<LightweightSemaphore>) | ||
| : inner(capacity), sema(create<LightweightSemaphore, ssize_t, int>(0, static_cast<int>(Traits::MAX_SEMA_SPINS)), &BlockingConcurrentQueue::template destroy<LightweightSemaphore>) |
There was a problem hiding this comment.
These casts were also introduced in a separate PR and now conflict with master.
concurrentqueue.h
Outdated
| // Note: For implicit producers (no token), the maximum number of elements that | ||
| // can be held is BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE before the block | ||
| // index must grow (which requires allocation and causes try_enqueue to fail). | ||
| // Pre-allocating blocks via the constructor does not increase the index size. | ||
| // Increase IMPLICIT_INITIAL_INDEX_SIZE in your traits, or use enqueue() instead. |
There was a problem hiding this comment.
This method does not accept a token, so it is implicit by definition.
Given the thorough note in the trait docs, I think it's fine to leave a shorter comment here referencing the traits.
| // Note: For implicit producers (no token), the maximum number of elements that | |
| // can be held is BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE before the block | |
| // index must grow (which requires allocation and causes try_enqueue to fail). | |
| // Pre-allocating blocks via the constructor does not increase the index size. | |
| // Increase IMPLICIT_INITIAL_INDEX_SIZE in your traits, or use enqueue() instead. | |
| // Note: If using only try_enqueue/try_enqueue_bulk with pre-allocated blocks, configure | |
| // Traits::IMPLICIT_INITIAL_INDEX_SIZE appropriately to ensure the index has sufficient | |
| // capacity for the number of blocks each producer may need. |
There are also other implicit try-enqueueing methods to which this note would equally apply. It should appear consistently on each.
…ameron314#418) try_enqueue() fails around BLOCK_SIZE * IMPLICIT_INITIAL_INDEX_SIZE elements per implicit producer because the block index (not the blocks) must grow, which requires allocation that try_enqueue refuses to do. Pre-allocating blocks via the constructor does not help. Added documentation comments near IMPLICIT_INITIAL_INDEX_SIZE in the default traits and near the try_enqueue method explaining the limit and workarounds. Added implicit_producer_index_limit unit test demonstrating the issue and verifying both workarounds (enqueue() and larger IMPLICIT_INITIAL_INDEX_SIZE).
- Reword traits doc comment per maintainer's suggestion (no formula, no issue ref) - Shorten try_enqueue note and add it consistently to all implicit try-enqueue methods - Rename "Fix" to "Workaround" in test comments (behavior is by design)
b2f1b31 to
217f4af
Compare
PavelGuzenfeld
left a comment
There was a problem hiding this comment.
Applied all suggested changes and rebased on master.
Summary
ConcurrentQueueDefaultTraits::IMPLICIT_INITIAL_INDEX_SIZEexplaining the relationship betweenBLOCK_SIZE,IMPLICIT_INITIAL_INDEX_SIZE, and the maximum elements per implicit producer when usingtry_enqueuetry_enqueuemethod that pre-allocating blocks via the constructor does not increase the block index sizeimplicit_producer_index_limitunit test demonstrating the limit and both workarounds (useenqueue()or increaseIMPLICIT_INITIAL_INDEX_SIZE)Test plan
implicit_producer_index_limitpasses (8 iterations, all green)Fixes #418