Piggyback routing update on persist response#6173
nadav-govari merged 9 commits into nadav/feature-node-based-routing
5ae7560 to 91fdf4e
let state = IngesterState::load(wal_dir_path, rate_limiter_settings);

let weak_state = state.weak();
let wal_capacity_time_series =
Let's fold the wal_capacity_time_series into the state. It's doable with a small refactor and gets rid of another Arc<Mutex<_>>.
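A minimal sketch of what that refactor could look like, assuming the state already wraps its data in a single shared lock. Everything here except the names `IngesterState`, `weak`, and `wal_capacity_time_series` is hypothetical: `InnerState`, the `Vec<f64>` representation, `open_shard_count`, and the helper methods are illustrative stand-ins, not the actual Quickwit types.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical inner state: the WAL capacity samples live next to the
/// rest of the ingester state instead of behind their own lock.
#[derive(Default)]
struct InnerState {
    /// Stand-in for the existing shard bookkeeping (assumption).
    open_shard_count: usize,
    /// Stand-in representation of the time series (assumption).
    wal_capacity_time_series: Vec<f64>,
}

/// Illustrative handle; cloning shares the same underlying lock.
#[derive(Clone, Default)]
struct IngesterState {
    inner: Arc<Mutex<InnerState>>,
}

impl IngesterState {
    /// Hands out a weak reference, mirroring `state.weak()` in the diff.
    fn weak(&self) -> Weak<Mutex<InnerState>> {
        Arc::downgrade(&self.inner)
    }

    /// Appends one capacity sample under the state lock.
    fn record_wal_capacity(&self, used_fraction: f64) {
        let mut guard = self.inner.lock().expect("state mutex poisoned");
        guard.wal_capacity_time_series.push(used_fraction);
    }

    /// Reads the shard count and the latest sample under a single lock,
    /// which is the benefit of not keeping a second `Arc<Mutex<_>>`.
    fn snapshot(&self) -> (usize, Option<f64>) {
        let guard = self.inner.lock().expect("state mutex poisoned");
        let latest = guard.wal_capacity_time_series.last().copied();
        (guard.open_shard_count, latest)
    }
}
```

With this shape, anything that already holds a weak reference to the state can reach the time series without a second handle being threaded through.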
/// table. For existing nodes, updates their open shard count, including counts of 0, from the
/// CP response while preserving capacity scores if they already exist.
/// New nodes get a default capacity_score of 5.
pub fn merge_from_shards(
I think the comment is off, because regular ingesters also provide routing updates. And then, why do we need to handle the case where not all shards in the routing update are open?
The case was this: if an ingester had open shards before but has none now, no entry would be created in this function, so an update from (for example) 4 open shards to 0 wouldn't happen (the shards would all get filtered out). So the only case that matters here is the one where there are shards on the node but none of them are open.
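The behavior discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: only the name `merge_from_shards` and the default capacity score of 5 come from the diff, while `Shard`, `ShardState`, `NodeEntry`, the `HashMap` table, and the function signature are hypothetical.

```rust
use std::collections::HashMap;

/// Default from the doc comment in the diff ("New nodes get a default capacity_score of 5").
const DEFAULT_CAPACITY_SCORE: usize = 5;

/// Hypothetical shard state; the real type has more variants.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ShardState {
    Open,
    Closed,
}

/// Hypothetical, trimmed-down shard description.
#[derive(Debug, Clone)]
struct Shard {
    leader_id: String,
    state: ShardState,
}

/// Hypothetical routing table entry for one node.
#[derive(Debug, Clone, PartialEq, Eq)]
struct NodeEntry {
    capacity_score: usize,
    open_shard_count: usize,
}

/// Merges shard information into a node-based table (assumed signature).
fn merge_from_shards(table: &mut HashMap<String, NodeEntry>, shards: &[Shard]) {
    // Register every node that owns at least one shard, open or not, so a
    // node whose shards are all closed still ends up with a count of 0
    // instead of being filtered out.
    let mut open_counts: HashMap<&str, usize> = HashMap::new();
    for shard in shards {
        let count = open_counts.entry(shard.leader_id.as_str()).or_insert(0);
        if shard.state == ShardState::Open {
            *count += 1;
        }
    }
    for (node_id, open_shard_count) in open_counts {
        table
            .entry(node_id.to_string())
            // Existing node: overwrite the count, keep the capacity score.
            .and_modify(|entry| entry.open_shard_count = open_shard_count)
            // New node: start from the default capacity score.
            .or_insert(NodeEntry {
                capacity_score: DEFAULT_CAPACITY_SCORE,
                open_shard_count,
            });
    }
}
```

Counting before filtering is the design point: if closed shards were dropped first, a node going from 4 open shards to 0 would never appear in `open_counts`, and its stale count would survive the merge.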
Merged commit 006f951 into nadav/feature-node-based-routing
Description
Addressing #6163 (comment): ingesters had more current shard availability data than the router that was sending the request. In the shard-based table, we would use that information; the new node-based table didn't.
This PR adds support for piggybacking the most up-to-date routing information, including closed shards, on persist responses. It works because:
How was this PR tested?
Unit tests. Will test on a local cluster as well.