perf(nft_transfers): convert cross-chain model to view, optimize chain-level merges #9536
Open
Conversation
…n-level merges

Major changes to address worker node crashes during nft.transfers hourly runs:

1. Convert nft_transfers cross-chain union from incremental merge to view
   - Follows sector spell architecture: cross-chain unions should be views
   - Eliminates the massive 21-chain merge into a single delta table
   - Heavy lifting is already done at the chain level; the view just unions them
2. Add block_month to chain-level unique_key for merge partition pruning
   - Tables are partitioned by block_month, but it was missing from unique_key
   - Without it, Trino must scan all partitions during merge
   - With it, the merge only touches partitions containing new data
3. Fix arbitrum hardcoded 7-day incremental predicate
   - Was using a raw SQL predicate instead of the incremental_predicate() macro
   - Now uses the standard macro for consistent behavior across environments
4. Clean up the nft_transfers() macro
   - Remove the unnecessary SELECT * FROM (...) wrapper (extra projection node)
   - Remove the unnecessary DISTINCT from the batch unnest subquery
   - Fix Jinja whitespace control for clean compiled SQL

Co-authored-by: jeff-dude <jeff-dude@users.noreply.github.com>
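A minimal sketch of the chain-level change in point 2 (schema, alias, and the exact key columns here are assumptions for illustration, not copied from the PR's per-chain configs). With `block_month` in `unique_key`, the merge can prune to only the partitions that contain new data:

```sql
{# Hypothetical chain-level transfers model; real configs in this PR differ per chain. #}
{{ config(
    schema = 'nft_ethereum',
    alias = 'transfers',
    materialized = 'incremental',
    file_format = 'delta',
    incremental_strategy = 'merge',
    partition_by = ['block_month'],
    unique_key = ['block_month', 'tx_hash', 'evt_index', 'token_id', 'amount'],
    incremental_predicates = [incremental_predicate('DBT_INTERNAL_DEST.block_time')]
) }}
```

Because `block_month` is both the partition column and part of the merge key, Trino can match incoming rows against only the touched partitions instead of scanning the whole delta table.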
…optimization-ea05 Made-with: Cursor
…TINCT
Addresses review feedback on the nft.transfers optimization PR:
1. Re-add cross-chain unique_combination_of_columns test including blockchain
- Guards against duplicate emission across the 21-chain view
- Uses 7-day window to keep the test cheap on a view
- combination: (blockchain, block_month, tx_hash, evt_index, token_id, amount)
2. Restore DISTINCT on the ERC1155 batch unnest subquery
- Protects --full-refresh path from pathological TransferBatch events
that duplicate (id, value) positions within a single event
- Incremental merge unique_key already dedupes, so steady-state unchanged
Made-with: Cursor
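A sketch of the restored guard described above, using the identifiers visible in the macro diff later in this thread (the real spell SQL selects more columns). `DISTINCT` collapses pathological `TransferBatch` events that repeat the same `(id, value)` position:

```sql
SELECT DISTINCT
      t.evt_block_time
    , t.evt_tx_hash
    , t.evt_index
    , foo.id
    , foo.value
FROM {{ erc1155_batch }} t
CROSS JOIN unnest(zip(t."values", t.ids)) AS foo(value, id)
```

On incremental runs the merge `unique_key` already dedupes these rows, so the `DISTINCT` only matters on the `--full-refresh` path.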
Temporary CI-only change: replace `is_incremental()` with `is_incremental() or true` in the nft_transfers() macro so that the incremental_predicate time filter is applied on every build, including the initial CI full-refresh. This keeps the CI build window short (~3 days per the macro default) across all 21 chain-level transfers tables, instead of scanning full history on first build. REVERT BEFORE MERGE. Bump ci-stamp on cross-chain nft_transfers to 2. Made-with: Cursor
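For context, a hedged sketch of what `incremental_predicate()` is assumed to expand to; the real Spellbook macro may differ, and the ~3-day default window is taken from the description above:

```sql
{# Assumption: simplified stand-in for Spellbook's incremental_predicate macro. #}
{% macro incremental_predicate(column) %}
    {{ column }} >= date_trunc('day', now() - interval '3' day)
{% endmacro %}

{# With `is_incremental() or true`, this filter is applied even on the CI
   full refresh, capping each of the 21 chain builds to a ~3-day window. #}
```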
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Debugging `or true` forces incremental predicate on full refreshes
  - Removed the `or true` debugging artifact from all six `{% if is_incremental() or true %}` guards so the incremental predicate is again skipped during full refreshes and initial builds.
Or push these changes by commenting:
@cursor push 37df1bdf24
Preview (37df1bdf24)
diff --git a/dbt_subprojects/hourly_spellbook/macros/_sector/nft/nft_transfers.sql b/dbt_subprojects/hourly_spellbook/macros/_sector/nft/nft_transfers.sql
--- a/dbt_subprojects/hourly_spellbook/macros/_sector/nft/nft_transfers.sql
+++ b/dbt_subprojects/hourly_spellbook/macros/_sector/nft/nft_transfers.sql
@@ -27,11 +27,11 @@
{% if denormalized == False -%}
INNER JOIN {{ base_transactions }} et ON et.block_number = t.evt_block_number
AND et.hash = t.evt_tx_hash
- {% if is_incremental() or true -%}
+ {% if is_incremental() -%}
AND {{incremental_predicate('et.block_time')}}
{% endif -%}
{%- endif -%}
-{% if is_incremental() or true -%}
+{% if is_incremental() -%}
WHERE {{incremental_predicate('t.evt_block_time')}}
{% endif -%}
@@ -61,11 +61,11 @@
{% if denormalized == False -%}
INNER JOIN {{ base_transactions }} et ON et.block_number = t.evt_block_number
AND et.hash = t.evt_tx_hash
- {% if is_incremental() or true -%}
+ {% if is_incremental() -%}
AND {{incremental_predicate('et.block_time')}}
{% endif -%}
{%- endif -%}
-{% if is_incremental() or true -%}
+{% if is_incremental() -%}
WHERE {{incremental_predicate('t.evt_block_time')}}
{% endif -%}
@@ -96,14 +96,14 @@
, value, id
FROM {{ erc1155_batch }} t
CROSS JOIN unnest(zip(t."values", t.ids)) AS foo(value, id)
- {% if is_incremental() or true -%}
+ {% if is_incremental() -%}
WHERE {{incremental_predicate('t.evt_block_time')}}
{% endif -%}
) t
{%- if denormalized == False %}
INNER JOIN {{ base_transactions }} et ON et.block_number = t.evt_block_number
AND et.hash = t.evt_tx_hash
- {% if is_incremental() or true -%}
+ {% if is_incremental() -%}
AND {{incremental_predicate('et.block_time')}}
{% endif -%}
{%- endif %}
Comment `@cursor review` or `bugbot run` to trigger another review on this PR
Reviewed by Cursor Bugbot for commit 1b3ec08. Configure here.
Reverts the `or true` clauses added in 1b3ec08 now that CI has validated the lineage. Macro returns to standard `{% if is_incremental() -%}` so full-refresh runs scan full history as intended and incremental runs apply the predicate. Bump ci-stamp to 3 to ensure CI rebuilds against the reverted body. Made-with: Cursor


Thank you for contributing to Spellbook 🪄
Description:
Runtime optimization for `nft.transfers` to address repeated worker node crashes during hourly runs (see: Slack #team-curated-internal).

Root cause: The cross-chain `nft_transfers` model was materialized as `incremental` with the `merge` strategy, unioning 21 chains into a single delta table. This is a massive merge operation that overwhelms worker node memory.

Changes:

1. Convert the `nft_transfers` cross-chain union from `incremental` to `view`
   - Drop the `unique_key`, `partition_by`, `incremental_strategy`, `incremental_predicates`, and `file_format` config (not needed for views)
2. Add `block_month` to all 21 chain-level `unique_key` configs
   - Tables are partitioned by `block_month`, but it was missing from `unique_key`
3. Fix the `nft_arbitrum_transfers` hardcoded incremental predicate
   - Was `'DBT_INTERNAL_DEST.block_time >= date_trunc(\'day\', now() - interval \'7\' day)'` instead of the `incremental_predicate()` macro
4. Clean up the `nft_transfers()` macro
   - Remove the unnecessary `SELECT * FROM (...)` wrapper, saving an extra projection node in the query plan
   - Remove the unnecessary `DISTINCT` from the batch unnest subquery: `CROSS JOIN unnest(zip(...))` already produces unique rows per source event

Compiled & validated: all models compile successfully (`dbt compile`).

Quick links for more information:
- Slack Thread
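The converted cross-chain model might look like the sketch below; chain list abbreviated and model names assumed, the actual spell unions all 21 chain-level transfers models:

```sql
{{ config(
    schema = 'nft',
    alias = 'transfers',
    materialized = 'view'
) }}

SELECT * FROM {{ ref('nft_ethereum_transfers') }}
UNION ALL
SELECT * FROM {{ ref('nft_arbitrum_transfers') }}
UNION ALL
SELECT * FROM {{ ref('nft_base_transfers') }}
-- ... remaining chain-level transfers models
```

Since the chain-level models already handle incremental merges, the view adds no write amplification: readers pay only the cost of the union at query time.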