Rebalance shards when ingester status changes by ncoiffier-celonis · Pull Request #6185 · quickwit-oss/quickwit

ncoiffier-celonis · 2026-03-02T11:39:44Z

Description

Attempt to fix #6158

Following @guilload's suggestion here, this PR:

gossips the ingester status over chitchat
updates the ingester pool when the ingester status changes
updates the indexer pool too when the ingester status changes (to fix the "no open shard found on ingester" error)
has the control plane rebalance the shards when the ingester status changes
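
As a rough illustration of the flow above, here is a minimal, self-contained sketch. All names in it (`IngesterStatus` variants, `PoolChange`, `IngesterPool`, `on_status_gossiped`, `should_rebalance`) are hypothetical stand-ins chosen for this example and are not the types or functions touched by this PR. It assumes only nodes in a ready state are eligible for new shards.

```rust
use std::collections::BTreeMap;

/// Hypothetical lifecycle states; the real `IngesterStatus` may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum IngesterStatus {
    Initializing,
    Ready,
    Decommissioning,
}

/// Effect of a gossiped status on the set of nodes eligible for new shards.
#[derive(Debug, PartialEq, Eq)]
enum PoolChange {
    Added,
    Removed,
    Unchanged,
}

/// Toy pool (assumption): remembers the last gossiped status of each node.
#[derive(Default)]
struct IngesterPool {
    statuses: BTreeMap<String, IngesterStatus>,
}

impl IngesterPool {
    /// Records a gossiped status and reports whether eligibility changed.
    fn on_status_gossiped(&mut self, node_id: &str, status: IngesterStatus) -> PoolChange {
        let was_ready = self.statuses.get(node_id) == Some(&IngesterStatus::Ready);
        let is_ready = status == IngesterStatus::Ready;
        self.statuses.insert(node_id.to_string(), status);
        match (was_ready, is_ready) {
            (false, true) => PoolChange::Added,
            (true, false) => PoolChange::Removed,
            _ => PoolChange::Unchanged,
        }
    }

    /// Nodes that may currently receive new shards, in sorted order.
    fn ready_nodes(&self) -> Vec<&str> {
        self.statuses
            .iter()
            .filter(|(_, status)| **status == IngesterStatus::Ready)
            .map(|(node_id, _)| node_id.as_str())
            .collect()
    }
}

/// In this sketch, a rebalance is warranted only when eligibility changed.
fn should_rebalance(change: &PoolChange) -> bool {
    *change != PoolChange::Unchanged
}
```

In this sketch, a node moving from `Ready` to `Decommissioning` yields `PoolChange::Removed`, which is the signal a control plane could use to move shards away before the node shuts down.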

With this approach, even with a propagation delay of about 10s before decommissioning, ingesting some documents can still fail if chitchat takes longer than expected to gossip the ingester status to the control plane.

Any feedback is welcome!!

How was this PR tested?

In addition to the unit and integration tests, I ran it against a local cluster with 2 indexers and observed that the number of errors reported in #6158 dropped from a few hundred to zero.

Other approaches

This PR is nearly identical to the branch guilload/ingester-status, rebased on main, with some additional bug fixes:

fix a bug where timeout_after was always 0, so the code did not wait for ingester status propagation
update the ingester pool when IngesterStatus changes (not only the indexer pool)
more unit and integration tests
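
To illustrate why a zero timeout amounts to not waiting, here is a generic polling sketch. It is not the code changed in this PR: `wait_for_propagation`, `poll_interval`, and the `is_propagated` closure are hypothetical names for this example, and only the `timeout_after` parameter name is borrowed from the description above.

```rust
use std::time::{Duration, Instant};

/// Polls `is_propagated` until it returns true or `timeout_after` elapses.
/// Returns whether propagation was observed before the deadline.
fn wait_for_propagation<F>(
    timeout_after: Duration,
    poll_interval: Duration,
    mut is_propagated: F,
) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout_after;
    loop {
        if is_propagated() {
            return true;
        }
        // With `timeout_after == Duration::ZERO`, the deadline has already
        // passed here, so the function gives up after a single check.
        if Instant::now() >= deadline {
            return false;
        }
        std::thread::sleep(poll_interval);
    }
}
```

Under these assumptions, a caller that passes a zero `timeout_after` checks the condition once and returns immediately, which would let decommissioning proceed before the status has had time to reach other nodes.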

guilload and others added 8 commits March 2, 2026 12:22

Gossip ingester status

fd38f7b

Update ingester pool when status changes

616d009

Rebalance shards when IngesterStatus changes

acf74cf

Fix timeout_after being 0, causing to not wait for ingester status propagation

1a1130b

Also refresh the ingester pool when an ingester status has changed

78582a1

Add integration test

41985ec

Make setup_ingester_pool and setup_indexer_pool a bit more uniform

d3983ce

make fix

c2b5ce2

ncoiffier-celonis mentioned this pull request Mar 2, 2026

Exclude decomissioning nodes when opening new shards, using gRPC stream #6166

Open

Instrument rebalance_shards calls

81e493d

ncoiffier-celonis wants to merge 9 commits into quickwit-oss:main from ncoiffier-celonis:ingester-status-rebased
