fix(table): trigger cascade race fixes, polling, workflow column flag#4499
fix(table): trigger cascade race fixes, polling, workflow column flag#4499TheodoreSpeaks merged 8 commits intostagingfrom
Conversation
…lready started Stamps fire in chunks of 20 via Promise.all, so queued writes race with the worker's markWorkflowGroupPickedUp (running). When the late queued stamp landed second it overwrote running, and the cell looked stuck in queued for the rest of the run. Skip the stamp when the same execution is already past queued — the worker's authority wins.
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
PR SummaryMedium Risk Overview Reworks Fixes workflow execution edge cases: optimistic run patches for Reviewed by Cursor Bugbot for commit c19ec14. Bugbot is set up for automated code reviews on this repo. Configure here. |
…flow column flag - Polling now refetches only pages that contain in-flight cells instead of every loaded page. Idle pages stay untouched while a cascade runs. - run_column optimistic patch mirrors server eligibility on mode='incomplete': cells with filled outputs no longer flip to queued only to revert seconds later when the server returns 0 triggered. - Hide the Workflow column type behind NEXT_PUBLIC_WORKFLOW_COLUMNS_ENABLED (default false). Existing workflow groups keep rendering.
0e0a55f to
89c758a
Compare
Greptile SummaryThis PR addresses three distinct bugs in the table workflow-column execution path and adds a feature flag to hide the Workflow column type in the new-column dropdown.
Confidence Score: 4/5Safe to merge; the core race fixes in cell-write.ts are well-reasoned and the optimistic patch alignment is correct. The cell-write and optimistic-patch changes are correct and close real bugs. The per-page polling implementation works but the queryKey reference is not memoized: every cache write from a poll tick causes a re-render, which creates a new array reference, which triggers the effect cleanup and re-creates the interval. In normal conditions polling still fires at roughly the right cadence, but the interval resets after each update rather than running continuously. apps/sim/hooks/queries/tables.ts — the queryKey memoization and the missing useMemo import are the only items that need attention. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Scheduler stamps queued] --> B{Same executionId as current?}
B -- No --> C[isNewQueuedStamp = true bypass stale-worker guard]
B -- Yes --> D{current.status == pending?}
D -- Yes --> E[Allow queued to write over pending]
D -- No --> F[Skip - late stamp cell already progressed]
C --> G[Write queued to DB]
E --> G
F --> H[Return skipped]
subgraph Per-page polling loop
I[setInterval tick] --> J{queryClient.isMutating > 0?}
J -- Yes --> K[Skip tick]
J -- No --> L[Walk pages find dirty ones]
L --> M{Any dirty pages?}
M -- No --> N[No-op]
M -- Yes --> O[fetchTableRows for each dirty page]
O --> P[setQueryData splice fresh page]
end
Reviews (1): Last reviewed commit: "fix(table): per-page polling, optimistic..." | Re-trigger Greptile |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit b23ba1b. Configure here.
| if (runMode === 'incomplete') { | ||
| const group = groupsById.get(groupId) | ||
| if (group && areOutputsFilled(group, r)) continue | ||
| } |
There was a problem hiding this comment.
Optimistic patch diverges from server eligibility check
Medium Severity
The client optimistic patch for mode='incomplete' skips cells solely based on areOutputsFilled(group, r), but the server's classifyEligibility skips only when status === 'completed' && areOutputsFilled(group, row) (the completedAndFilled variable). For cancelled or error cells with leftover output values, the client incorrectly skips them (no "queued" feedback) even though the server considers them eligible and will run them. The comment claims it "mirrors server eligibility" but it's missing the exec?.status === 'completed' condition the server requires.
Reviewed by Cursor Bugbot for commit b23ba1b. Configure here.
Large fan-outs (thousands of rows) issue sequential trigger.dev HTTP calls inside scheduleRunsForRows.batchEnqueue. Awaiting that loop held the HTTP response (and the AI tool span) open for ~5 min on a 6k-row table — the user saw an 11-min "running" because the tool didn't return until every job had been enqueued. Run the dispatcher in the background and return immediately; contract response now reports `triggered: null` since the count isn't known synchronously.
Each poll tick brings back a fresh page from the server with all-new row objects, even though most rows haven't changed. setQueryData was replacing the whole page reference, which made every memoized <DataRow> in the page re-render every 1.5s. Now we shallow-compare each fresh row against the cached one and reuse the cached reference when nothing changed; only rows whose data or exec status actually flipped re-render.
Refactoring eligibility into classifyEligibility split the runnable answer into two reasons: 'eligible' (deps satisfied) and 'manual-bypass' (autoRun= false group on a manual run, deps don't apply). isGroupEligible only treated 'eligible' as runnable, so a manual "Run all rows" on a single autoRun=false group filtered every row out and returned triggered: 0. Cells flashed queued from the optimistic patch then went empty when the refetch landed.


Summary
queuedstamp overwriting worker'srunning/completedfor the same execution — cells were getting stuck at queued forever.useInfiniteTableRows: refetch only pages that contain in-flight cells instead of every loaded page.run_columnoptimistic patch onmode='incomplete'mirrors server eligibility (areOutputsFilled) so cells with leftover values from a prior cancelled/error run no longer flash queued.NEXT_PUBLIC_WORKFLOW_COLUMNS_ENABLED(default false). Existing workflow groups still render.Type of Change
Testing
Tested manually. Lint and `bun run check:api-validation:strict` pass.
Checklist