Configurable index disabling for virtual columns #19004
nozjkoitop wants to merge 2 commits into apache:master
have you tried in versions newer than 29? #15838 was in 30, and #17055/#17125 in 31, which should make it so that expression indexes are only used in cases where there would be less work to do than a full scan (a full scan has to compute the expression value and perform the match for every row). i think that should reduce problems like the one you are seeing. I would be very interested if there are still cases where expression indexes cause a slowdown after those changes, since it would indicate that perhaps the cost estimate needs adjustment.
Thanks for the references, unfortunately this still reproduces for us on 31.
that would be great. there is SqlExpressionBenchmark, which uses a data generator to produce a collection of columns with various types and data distributions; maybe you can add something that looks approximately like the problem you're running into. I would like to see if this is something we can improve without adding a new manual parameter.
added a case with broad, expression-heavy virtual-column filters
hmm, I still show the added benchmark query as faster with indexes, especially for vectorized processing. I didn't pull your branch, just used apache master with the query added, and to test without indexes I modified a line in ExpressionVirtualColumn so that no indexes are used. current master: [benchmark output not preserved] modified so that no indexes are used: [benchmark output not preserved]
Interesting, although I was using lz4 compression + FRONT_CODED_16_V1 encoding. ur results surprised me :) lemme give it another try. Could u share the JMH flags used please?
I'm using Java 21 on an M1 Mac, for additional context.
i think by default everything here would be using lz4, since that parameter is only for complex columns (to measure #16863) and has no impact if not using complex columns. I will try with front coding though, since significantly slower perf there would also be quite interesting to look into.
thanks, i'll rerun it with matching params and let u know about results
ah neat, I do show front-coded as a bit slower for non-vectorized, which is curious, will look into that a bit more. with index: [benchmark output not preserved] without: [benchmark output not preserved]
re front-coding, i think i see what is going on: since we are checking every dictionary value, it would be a lot more chill for front-coding if we used the dictionary iterator instead of calling get. it needs to be exposed on …
ah yea, it is totally that, using the iterator improves the measurement when using the indexes quite a lot. with indexes using iterator: [benchmark output not preserved] thanks for bringing this to attention 👍
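To illustrate why the iterator helps here, below is a minimal toy sketch, not Druid's actual front-coded implementation. It assumes an incremental front-coded layout in which each value in a bucket stores only the suffix that differs from the previous value, so a random `get` has to replay the bucket from its first entry, while an iterator decodes each entry once. The class name `ToyFrontCodedDictionary`, the `decodeSteps` counter, and the bucket layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy model only: each entry stores (shared prefix length, suffix) relative to
// the previous entry; the first entry of every bucket is stored in full.
final class ToyFrontCodedDictionary implements Iterable<String> {
    private final int bucketSize;
    private final List<Integer> prefixLengths = new ArrayList<>();
    private final List<String> suffixes = new ArrayList<>();
    long decodeSteps = 0; // counts single-entry decodes, for comparison

    ToyFrontCodedDictionary(List<String> sortedValues, int bucketSize) {
        this.bucketSize = bucketSize;
        String previous = "";
        for (int i = 0; i < sortedValues.size(); i++) {
            String value = sortedValues.get(i);
            int shared = (i % bucketSize == 0) ? 0 : commonPrefix(previous, value);
            prefixLengths.add(shared);
            suffixes.add(value.substring(shared));
            previous = value;
        }
    }

    private static int commonPrefix(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    // Rebuilds entry `index` from the already-decoded previous entry.
    private String decodeNext(String previous, int index) {
        decodeSteps++;
        return previous.substring(0, prefixLengths.get(index)) + suffixes.get(index);
    }

    int size() {
        return suffixes.size();
    }

    // Random access: must replay the bucket from its first entry up to `index`.
    String get(int index) {
        int bucketStart = index - (index % bucketSize);
        String value = "";
        for (int i = bucketStart; i <= index; i++) {
            value = decodeNext(value, i);
        }
        return value;
    }

    // Sequential access: every entry is decoded exactly once.
    @Override
    public Iterator<String> iterator() {
        return new Iterator<>() {
            private int position = 0;
            private String previous = "";

            @Override
            public boolean hasNext() {
                return position < size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                previous = decodeNext(previous, position++);
                return previous;
            }
        };
    }
}
```

In this toy model, visiting every value through `get` costs on the order of bucket size times more decode steps than one pass with the iterator, which is the kind of overhead an index build that checks every dictionary value would pay.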
opened #19023 to help perf for front-coding + expression indexes (and maybe a few other things)
appreciate the input, that's a clean fix, nice
Description
In some of our environments we observed a significant query performance regression after upgrading to Druid 29.0. This regression appears to be related to the virtual column bitmap indexing optimization introduced in #15585 and #15633.
While the change improves performance for many cases by enabling bitmap index creation for expression-based virtual columns, we found that in certain workloads the index computation becomes unexpectedly expensive. In our deployments, building these indices accounted for more than 90% of the CPU usage on Historical nodes during query execution, leading to severe degradation in overall query latency.
For one representative workload, the average query runtime increased from 10.01 seconds to 150.18 seconds compared to earlier versions. Once expression-based virtual column bitmap index creation was disabled, the regression disappeared and query performance returned to expected levels.
This PR addresses this issue by preventing the optimization from triggering in scenarios where the cost of index computation outweighs its benefit, avoiding major regressions in affected environments.
Added a new query context parameter, maxVirtualColumnsForBitmapIndexing (default: Integer.MAX_VALUE), which sets the virtual-column count threshold beyond which Druid stops using bitmap indexes for filters on virtual columns.
Release note
Added a safeguard to skip virtual-column bitmap indexing when it is likely to be counterproductive, falling back to non-indexed filtering to preserve expected performance.
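The threshold behaviour of maxVirtualColumnsForBitmapIndexing can be sketched roughly as follows. This is an illustrative assumption about how such a gate might look, not the code in this PR: the class `VirtualColumnIndexGate` and both method names are hypothetical, and only the context key and its default of Integer.MAX_VALUE come from the description. The sketch also assumes that a count equal to the threshold still uses indexes.

```java
import java.util.Map;

// Hypothetical helper; the PR's real implementation may differ.
final class VirtualColumnIndexGate {
    // Query context key named in the PR description.
    static final String CONTEXT_KEY = "maxVirtualColumnsForBitmapIndexing";

    // Reads the threshold from the query context, defaulting to
    // Integer.MAX_VALUE (no limit) when the key is absent.
    static int threshold(Map<String, Object> queryContext) {
        Object raw = queryContext.get(CONTEXT_KEY);
        if (raw == null) {
            return Integer.MAX_VALUE;
        }
        if (raw instanceof Number) {
            return ((Number) raw).intValue();
        }
        return Integer.parseInt(raw.toString());
    }

    // Bitmap indexes are used only while the number of virtual columns
    // does not exceed the threshold (assumed semantics).
    static boolean useBitmapIndexes(int virtualColumnCount, Map<String, Object> queryContext) {
        return virtualColumnCount <= threshold(queryContext);
    }
}
```

Under these assumptions, a query with no context override always keeps the indexed path, while setting the parameter to 0 in the query context turns bitmap indexing off for any query that has at least one virtual column.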
This PR has: