[FEATURE] Add Union command in PPL #5240
[FEATURE] Add Union command in PPL #5240srikanthpadakanti wants to merge 1 commit intoopensearch-project:mainfrom
Conversation
PR Reviewer Guide 🔍(Review updated until commit 9ec4314)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 9ec4314 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit 3782336
Suggestions up to commit f05fbdb
Suggestions up to commit 60613ec
Suggestions up to commit 0ff52aa
Suggestions up to commit 12b2eb5
|
7ab57cb to
5fd5e1c
Compare
|
Persistent review updated to latest commit 5fd5e1c |
dcbc60f to
12b2eb5
Compare
|
Persistent review updated to latest commit dcbc60f |
|
Persistent review updated to latest commit 12b2eb5 |
12b2eb5 to
0ff52aa
Compare
|
Persistent review updated to latest commit 0ff52aa |
|
Hello @anasalkouz @mengweieric Please enforce the PR label and review this. Thanks Also, the failing check sql-cli integration test uses a different Gradle version (8.14.2 vs 9.2.0). I believe this has nothing to do with my changes. |
|
Persistent review updated to latest commit 60613ec |
|
Please take a look at the CI failure. |
60613ec to
f05fbdb
Compare
|
Persistent review updated to latest commit f05fbdb |
Took care of it. Please review. Thanks. |
f05fbdb to
3782336
Compare
|
Persistent review updated to latest commit 3782336 |
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com> # Conflicts: # integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java # integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java
3782336 to
9ec4314
Compare
|
Persistent review updated to latest commit 9ec4314 |
|
Hello @mengweieric Can you please review this. |
|
@anasalkouz @mengweieric Can you please review this. |
| | [appendcol command](cmd/appendcol.md) | 3.1 | experimental (since 3.1) | Append the result of a sub-search and attach it alongside the input search results. | | ||
| | [lookup command](cmd/lookup.md) | 3.0 | experimental (since 3.0) | Add or replace data from a lookup index. | | ||
| | [multisearch command](cmd/multisearch.md) | 3.4 | experimental (since 3.4) | Execute multiple search queries and combine their results. | | ||
| | [union command](cmd/union.md) | 3.5 | experimental (since 3.5) | Combine results from multiple datasets using UNION ALL semantics. | |
There was a problem hiding this comment.
We should change it to since 3.7 since 3.6 has been released
| APPEND: 'APPEND'; | ||
| MULTISEARCH: 'MULTISEARCH'; | ||
| UNION: 'UNION'; | ||
| MAXOUT: 'MAXOUT'; |
There was a problem hiding this comment.
MAXOUT is added as a lexer token but not included in searchableKeyWord in the parser grammar. This means any query using maxout as a field name will fail after this PR:
source=t | eval maxout = 1
Other similar option keywords (COUNTFIELD, SHOWCOUNT) are already in searchableKeyWord. We should consider adding | MAXOUT there for consistency
|
|
||
| unionDataset | ||
| : LT_SQR_PRTHS subSearch RT_SQR_PRTHS | ||
| | qualifiedDatasetName |
There was a problem hiding this comment.
qualifiedDatasetName can't handle two common table name patterns:
- Date-suffix (logs-2024.01.01) → lexer produces a single ID_DATE_SUFFIX token, which ident doesn't accept
- Dotted catalog paths (catalog.my_index) → rule only matches the first ident, then fails on DOT (expects COLON)
The existing tableSource handles both. Consider reusing it:
unionDataset
: LT_SQR_PRTHS subSearch RT_SQR_PRTHS
| tableSource
;
|
Please also take a look at the merge conflicts |
Description
Add union command to PPL that implements SQL-style UNION ALL semantics with Calcite-based type coercion. The command supports combining multiple datasets (indices, patterns, aliases, or subsearches) with automatic schema merging and missing fields are filled with NULL, compatible types are coerced to a common supertype (e.g., int+float --> float), and incompatible types fall back to string. Works both as a first command and mid-pipeline, where the upstream result set is implicitly included as the first dataset.
Related Issues
Resolves #5110
#5110
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.