add schema change test for variant and inverted index #61746
eldenmoon wants to merge 1 commit into apache:master
run buildall
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Pull request overview
Adds a new cloud-only regression test to validate schema change behavior for VARIANT columns when multiple inverted indexes exist, and verifies that inverted index files and bloom filter settings remain usable after schema changes.
Changes:
- Introduce test_variant_multi_index_schema_change regression suite (cloud mode only).
- Create/load a VARIANT table with multiple inverted indexes, then run schema changes (MODIFY COLUMN nullability; SET bloom_filter_columns).
- Assert query correctness (MATCH_PHRASE and bloom-filtered predicates) and verify nested index files via BE HTTP API.
    String backendId = tablets[0].BackendId
    String ip = backendIdToBackendIP.get(backendId)
    String port = backendIdToBackendHttpPort.get(backendId)
    check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
check_nested_index_file(...) hard-codes index storage format as "V2", but the table DDL does not set "inverted_index_storage_format" in PROPERTIES. This makes the assertion dependent on cluster defaults (and could flip to V3/V1), causing test flakiness. Consider setting "inverted_index_storage_format" = "V2" explicitly in the CREATE TABLE PROPERTIES (or loosening the assertion to accept the actual format returned by the API).
Suggested change:
-   check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
+   check_nested_index_file(ip, port, tabletId, 2, 3, null)
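The other option named in the comment, pinning the format in the DDL, could look roughly like the following. This is a minimal Groovy sketch: `renderProperties` is a hypothetical helper (not part of the Doris regression framework), and only the `inverted_index_storage_format` key comes from the comment above; the rest of the table's properties are not shown in this thread.

```groovy
// Hypothetical helper (an assumption, not a framework API):
// renders a PROPERTIES clause from a map, preserving insertion order.
String renderProperties(Map<String, String> props) {
    String body = props.collect { k, v -> "\"${k}\" = \"${v}\"" }.join(", ")
    return "PROPERTIES (" + body + ")"
}

// Pin the inverted index storage format instead of relying on the cluster default.
String clause = renderProperties(["inverted_index_storage_format": "V2"])
println clause
```

With the format fixed in the CREATE TABLE statement, the hard-coded "V2" argument in the assertion would match what the table was actually created with, regardless of the cluster default.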
    String backendId = tablets[0].BackendId
    String ip = backendIdToBackendIP.get(backendId)
    String port = backendIdToBackendHttpPort.get(backendId)
    check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
check_nested_index_file(ip, port, tabletId, 2, 3, ...) assumes there will be exactly 2 rowsets after trigger_and_wait_compaction(tableName, "cumulative"). Rowset count after compaction is environment/strategy dependent, and other regression tests compute activeRowsetCount from be_show_tablet_status rather than hard-coding it. To avoid flaky failures, consider deriving the expected rowset count dynamically (or asserting a range / invariant like "< before"), then pass that value into check_nested_index_file.
Suggested change:
-   check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
+   // Derive the active rowset count dynamically to avoid hard-coded assumptions
+   def tabletStatusList = be_show_tablet_status(tableName)
+   int activeRowsetCount = 0
+   for (status in tabletStatusList) {
+       if (status.TabletId == tabletId) {
+           // Use the rowset count reported by the backend for this tablet
+           activeRowsetCount = (status.RowsetCount as int)
+           break
+       }
+   }
+   assertTrue(activeRowsetCount > 0)
+   check_nested_index_file(ip, port, tabletId, activeRowsetCount, 3, "V2")
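The alternative mentioned in the comment, asserting an invariant such as "< before" rather than an exact count, might be sketched as below. `compactionReducedRowsets` is a hypothetical helper, and `before` / `after` are assumed to be rowset counts captured around the compaction call. Whether a strict decrease is guaranteed depends on the compaction policy, so treat the strict comparison as an assumption.

```groovy
// Hypothetical helper (an assumption, not a framework API): checks an invariant
// on rowset counts instead of comparing against a hard-coded value.
boolean compactionReducedRowsets(int before, int after) {
    // At least one rowset must remain, and compaction should have merged some.
    return after > 0 && after < before
}

assert compactionReducedRowsets(5, 2)   // some rowsets were merged
assert !compactionReducedRowsets(5, 5)  // nothing was merged
```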
    String backendId = tablets[0].BackendId
    String ip = backendIdToBackendIP.get(backendId)
    String port = backendIdToBackendHttpPort.get(backendId)
    check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
The expected per-segment indices_count is set to 3, but the table defines two inverted indexes on var and the inserted JSON has two extracted paths (string, array_string). A very similar variant multi-index test asserts indices_count = 4 for this shape (2 indexes × 2 paths). Unless there is a documented reason this schema change produces only 3 indices, this expectation is likely incorrect and will fail. Consider aligning the expected indices_count with the actual number of built indices (e.g., 4) or deriving it from SHOW INDEX/returned JSON instead of hard-coding.
Suggested change:
-   check_nested_index_file(ip, port, tabletId, 2, 3, "V2")
+   check_nested_index_file(ip, port, tabletId, 2, 4, "V2")
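To make the arithmetic behind this comment concrete, and to illustrate deriving the count from the returned JSON instead of hard-coding it, here is a small Groovy sketch. Both helpers are hypothetical, and the sample JSON shape (`rowsets` → `segments` → `indices`) is an assumption for illustration; the actual BE API response may differ, and the 2 × 2 rule reflects the reviewer's reasoning rather than confirmed behavior.

```groovy
import groovy.json.JsonSlurper

// Expected per-segment count under the reviewer's reasoning:
// one entry per (inverted index, extracted path) pair.
int expectedIndicesCount(int indexCount, int pathCount) {
    return indexCount * pathCount
}

// Hypothetical helper: collects the distinct per-segment index counts
// from a response with the ASSUMED shape shown in `sample` below.
Set<Integer> observedIndicesCounts(String json) {
    def parsed = new JsonSlurper().parseText(json)
    return parsed.rowsets.collectMany { rs -> rs.segments.collect { it.indices.size() } } as Set
}

// ASSUMED response shape, for illustration only.
String sample = '''
{"rowsets": [{"segments": [{"segment_id": 0, "indices": [{}, {}, {}, {}]}]}]}
'''

assert expectedIndicesCount(2, 2) == 4             // 2 indexes x 2 paths
assert observedIndicesCounts(sample) == [4] as Set // every segment reports 4 indices
```

Comparing the observed set against a single expected value also catches the case where segments disagree with each other, which a hard-coded per-segment check would report less clearly.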
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)