HIVE-29506: HiveRelDecorrelator fails to decorrelate plan with Values operator#6360
HIVE-29506: HiveRelDecorrelator fails to decorrelate plan with Values operator#6360kasakrisz wants to merge 3 commits intoapache:masterfrom
Conversation
c1e7433 to
3331420
Compare
| -- To create a HiveValues operator before decorrelation automatic query rewrite with | ||
| -- materialized views is used. | ||
| -- All tables must be transactional to trigger automatic query rewrite. | ||
| create table mv_source (any_col int) stored as orc TBLPROPERTIES ('transactional'='true'); |
There was a problem hiding this comment.
not sure we should be adding new tests with ACID table format, would it work with iceberg as well?
There was a problem hiding this comment.
The bug should occur when decorrelating any plan with a Values node. How about adding a unit test in a new class HiveRelDecorrelatorTest instead?
There was a problem hiding this comment.
I agree with Thomas.
I removed this q test and added a unit test focusing on testing HiveRelDecorrelator only because at this point the table and storage format doesn't matter.
zabetak
left a comment
There was a problem hiding this comment.
Overall LGTM, I just have some small questions.
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
Show resolved
Hide resolved
3331420 to
0b5f961
Compare
|
thomasrebele
left a comment
There was a problem hiding this comment.
LGTM with some minor comments.
|
|
||
|
|
||
| @Test | ||
| public void testDecorrelateCorrelateIsRemovedWhenPlanHasEmptyValues() { |
There was a problem hiding this comment.
RelOptUtil.toString(decorrelatedPlan) and RelOptUtil.toString(base) appear repeatedly. Assigning them to a variable saves a few computations.
|
|
||
| Assert.assertFalse("Plan after decorrelation still has correlate: \n" + RelOptUtil.toString(decorrelatedPlan), | ||
| RelOptUtil.toString(decorrelatedPlan).contains("Correlate")); | ||
| Assert.assertTrue("Plan after decorrelation does not contain Values: \n" + RelOptUtil.toString(decorrelatedPlan), |
There was a problem hiding this comment.
I would recommend to dump the before and after plan into a comment. The plans are reasonably small and they help to understand what's going on.
| .union(false) | ||
| .build(); | ||
|
|
||
| Assert.assertTrue("Plan before decorrelation does not contain correlate: \n" + RelOptUtil.toString(base), |
There was a problem hiding this comment.
I was a bit confused by the messages, until I realized that they apply only in the case of an assertion error.
How about reformulating them to
- "Plan before decorrelation must contain a Correlate node"
- "Plan before decorrelation must contain a Values node"
- "Plan after decorrelation may not contain a Correlate node"
- "Plan after decorrelation must still contain a Values node"
zabetak
left a comment
There was a problem hiding this comment.
LGTM! I left a bunch of nits that we don't necessarily need to address. However, it would be nice to treat some of the Sonar issues related to unused and wildcard imports, line length. Feel free to merge it as you see fit.
|
|
||
|
|
||
| @Test | ||
| public void testDecorrelateCorrelateIsRemovedWhenPlanHasEmptyValues() { |
There was a problem hiding this comment.
Since the whole test is about the decorrelator the prefix testDecorrelate is redundant and combined with the rest of the line it makes the test name hard to read.
There was a problem hiding this comment.
You may want to add Union in the test name since empty values could appear in many different places and apparently the decorrelator is not handling everything.
| Assert.assertTrue("Plan before decorrelation does not contain correlate: \n" + RelOptUtil.toString(base), | ||
| RelOptUtil.toString(base).contains("Correlate")); | ||
| Assert.assertTrue("Plan before decorrelation does not contain Values: \n" + RelOptUtil.toString(base), | ||
| RelOptUtil.toString(base).contains("Values")); | ||
|
|
||
| RelNode decorrelatedPlan = HiveRelDecorrelator.decorrelateQuery(base); | ||
|
|
||
| Assert.assertFalse("Plan after decorrelation still has correlate: \n" + RelOptUtil.toString(decorrelatedPlan), | ||
| RelOptUtil.toString(decorrelatedPlan).contains("Correlate")); | ||
| Assert.assertTrue("Plan after decorrelation does not contain Values: \n" + RelOptUtil.toString(decorrelatedPlan), | ||
| RelOptUtil.toString(decorrelatedPlan).contains("Values")); |
There was a problem hiding this comment.
I find the usual (full) plan comparisons which are used when testing rules and transformations easier to follow. They may require more frequent updates but seeing the plan before and after makes the test easier to follow.



What changes were proposed in this pull request?
Remove the special handling of the
Valuesoperator fromHiveRelDecorrelatorWhy are the changes needed?
The specialization returned
null, signaling the decorrelator algorithm to stop and return the original plan. Consequently,Correlateoperators may remain in the plan, which Hive cannot handle.Does this PR introduce any user-facing change?
No.
How was this patch tested?