feat: cross-run lineage via reserved run attributes #2153

Draft
rchasman wants to merge 4 commits into vercel:main from rchasman:feat/cross-run-lineage-rootid

Conversation

@rchasman rchasman commented May 29, 2026

Problem

Workflow has no way to relate one run to another. ListWorkflowRunsParams filters by workflowName and status; a run carries no lineage. That's fine for one-shot runs but breaks every multi-run pattern the SDK already encourages:

  • A daisy-chain (a workflow that starts its successor — the documented cron/background pattern) becomes N unrelatable runs. You can cancel one tick, not "the cron"; you can't list the chain. The thing the cron RFC (Crons #1649) calls out — "track active crons and invocations, cancel them from the UI, IDs like cron_abc123" — can't be built on a library alone, because the runs aren't grouped.
  • Same gap for fan-out, retry-as-a-fresh-run, and deploymentId: 'latest' hops — they all orphan.

Solution

Record lineage as reserved run attributes ($rootId, $parentRunId) and make attributes filterable in list(). This follows @pranaygp's suggestion on #1649 to reuse the attributes mechanism (#2088) rather than add a dedicated field — the only missing piece was queryability.

  • A run started with no parent is its own root ($rootId === runId).
  • A run started from inside another run inherits the parent's $rootId, so a chain of any depth stays flat under one root. start() reads the parent from the ambient step context (it's a 'use step').
  • A lineage is a query, list({ attributes: { $rootId } }), not a new entity.
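
The inheritance rules above can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the PR's implementation: `resolveLineage`, `ParentRun`, and the fallback to the parent's own id when it carries no `$rootId` are hypothetical; only the `$rootId` and `$parentRunId` attribute names come from the PR.

```typescript
// Hypothetical shape for illustration; not the SDK's actual types.
interface ParentRun {
  runId: string;
  attributes: Record<string, string>;
}

// Compute the reserved lineage attributes for a run that is about to start.
function resolveLineage(
  runId: string,
  parent?: ParentRun,
  callerAttributes: Record<string, string> = {},
): Record<string, string> {
  const lineage: Record<string, string> = parent
    ? {
        // Inherit the parent's root. Assumption: if the parent has no
        // $rootId recorded, treat the parent itself as the root.
        $rootId: parent.attributes["$rootId"] ?? parent.runId,
        $parentRunId: parent.runId,
      }
    : { $rootId: runId }; // No parent: the run is its own root.

  // Assumption: "merged on top" means caller keys win on conflict.
  return { ...lineage, ...callerAttributes };
}

// A three-tick daisy chain: each tick is started from inside the previous one.
const tick1 = resolveLineage("run_1");
const tick2 = resolveLineage("run_2", { runId: "run_1", attributes: tick1 });
const tick3 = resolveLineage("run_3", { runId: "run_2", attributes: tick2 });
console.log(tick1, tick2, tick3);
```

In this sketch every tick ends up with the first run's id as `$rootId`, while `$parentRunId` records only the immediate edge.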

Design: stays run-centric

Workflow is run-centric — event log, observability, cancellation, version pinning, and IDs are all per-run; the run is the atom (the run-listed UI is a symptom of this). Lineage is deliberately not a new primitive above the run:

  • It's attributes on a run; a lineage is a query, not an entity. Nothing new enters the model.
  • Each tick keeps its own event log, status, id, and version pin. "Cancel the cron" = cancel the runs in the set; there's no cron object. cron_abc123 is just the root run's id.
  • Because lineage rides on attributes — which are already preserved across lifecycle updates — it survives started/completed/failed/cancelled with no extra handling.
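
As a rough sketch of what "cancel the cron" means under this model: select every run in the lineage that has not yet finished, then cancel each one individually. The `RunSummary` shape, the status names, and `runsToCancel` are assumptions made for illustration; the PR does not define them.

```typescript
// Hypothetical run shape and status names, for illustration only.
type RunStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

interface RunSummary {
  runId: string;
  status: RunStatus;
  attributes: Record<string, string>;
}

const TERMINAL_STATUSES: RunStatus[] = ["completed", "failed", "cancelled"];

// Return the ids of the runs that still need cancelling for one lineage:
// every run whose $rootId matches and whose status is not terminal.
function runsToCancel(runs: RunSummary[], rootId: string): string[] {
  return runs
    .filter((run) => run.attributes["$rootId"] === rootId)
    .filter((run) => TERMINAL_STATUSES.indexOf(run.status) === -1)
    .map((run) => run.runId);
}
```

There is still no cron object here: the caller would pass the root run's id and cancel each returned run through whatever per-run cancellation already exists.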

Presentation follows and stays open-ended. A run with no lineage is unaffected, so the existing runs view is unchanged; if a consumer ignores the attributes, nothing breaks. The view can optionally collapse a lineage and filter by $rootId. A tree view can come later for free and also stays run-centric: $parentRunId is already recorded for the edge, so a tree just renders the same runs hierarchically — no containing "parent run" entity, just runs pointing at runs. Flat ships today with zero migration; nested timelines layer on whenever the UI wants them. Flat also keeps grouping O(1) by one id rather than an N-hop parent walk.
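
To illustrate how a tree view could be derived later from the same flat data, the sketch below nests runs by `$parentRunId`. `RunNode` and `buildLineageTree` are hypothetical names, and the input is assumed to be the result of one lineage query.

```typescript
// Hypothetical node shape: a run plus the runs that point at it.
interface RunNode {
  runId: string;
  attributes: Record<string, string>;
  children: RunNode[];
}

// Nest a flat list of runs using the $parentRunId edge.
// Runs with no parent, or whose parent is not in the list, become roots.
function buildLineageTree(
  runs: { runId: string; attributes: Record<string, string> }[],
): RunNode[] {
  const nodes = new Map<string, RunNode>();
  for (const run of runs) {
    nodes.set(run.runId, { runId: run.runId, attributes: run.attributes, children: [] });
  }

  const roots: RunNode[] = [];
  nodes.forEach((node) => {
    const parentId = node.attributes["$parentRunId"];
    const parent = parentId ? nodes.get(parentId) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  });
  return roots;
}
```

The stored data does not change between the flat and nested views; only the rendering does, which is why the tree can be layered on without migration.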

What's included

  • @workflow/world: attributes on CreateWorkflowRunRequest, the run_created and resilient run_started event data, and queue RunInput; an attributes filter on ListWorkflowRunsParams.
  • @workflow/core: start() resolves $rootId/$parentRunId from the ambient step context and merges any caller-provided attributes.
  • @workflow/world-local: apply attributes at creation; list({ attributes }).
  • @workflow/world-testing: an end-to-end daisy-chain test.
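
A minimal in-memory sketch of the attribute filter, assuming it matches runs whose attributes contain every requested key/value (the behaviour the world-local commit message describes). `ListedRun`, `ListParams`, `matchesAttributes`, and `listRuns` are stand-ins for illustration, not the actual `ListWorkflowRunsParams` or world-local code.

```typescript
// Hypothetical stand-ins for the real run and list-params types.
interface ListedRun {
  runId: string;
  workflowName: string;
  attributes: Record<string, string>;
}

interface ListParams {
  workflowName?: string;
  attributes?: Record<string, string>;
}

// True when the run's attributes contain every key/value in the filter.
// An empty filter matches every run.
function matchesAttributes(
  runAttributes: Record<string, string>,
  filter: Record<string, string>,
): boolean {
  return Object.keys(filter).every((key) => runAttributes[key] === filter[key]);
}

// Apply the optional workflowName and attributes filters together.
function listRuns(runs: ListedRun[], params: ListParams = {}): ListedRun[] {
  return runs.filter(
    (run) =>
      (params.workflowName === undefined || run.workflowName === params.workflowName) &&
      matchesAttributes(run.attributes, params.attributes ?? {}),
  );
}
```

Because the match is generic over keys, the same filter would serve any attribute, with `$rootId` being just one use of it.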

Testing

  • Unit: propagation (own-root / inherit / deep-chain / caller-merge) and storage + filter + lifecycle preservation.
  • E2E: a real daisy-chain through the runtime + queue + world-local groups under one $rootId, listable via the attribute filter, with an unrelated run excluded.
  • Existing core (start) and world-local (storage, attributes) suites unaffected; affected build tree compiles clean.

Why both $parentRunId and $rootId

$parentRunId alone (the immediate edge) needs an N-hop walk to group a lineage. A reserved $rootId gives O(1) grouping in one list() filter — a run inherits its parent's $rootId, a root's is itself. Both are recorded: $parentRunId for the tree edge, $rootId for the group key.
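
The difference can be made concrete with a small sketch: finding the group key by walking `$parentRunId` edges costs one lookup per ancestor, whereas `$rootId` is already on the run. `LineageRun` and `rootByWalk` are hypothetical, and the in-memory `Map` stands in for what would be one store read per hop.

```typescript
// Hypothetical run shape for illustration.
interface LineageRun {
  runId: string;
  attributes: Record<string, string>;
}

// Find the root by following $parentRunId edges, counting hops.
// Each hop here is a Map lookup; against a real store it would be a read.
function rootByWalk(
  runsById: Map<string, LineageRun>,
  runId: string,
): { rootId: string; hops: number } {
  let current = runId;
  let hops = 0;
  for (;;) {
    const parentId = runsById.get(current)?.attributes["$parentRunId"];
    if (!parentId) {
      return { rootId: current, hops };
    }
    // Guard against malformed data that would otherwise loop forever.
    if (hops > runsById.size) {
      throw new Error("cycle detected in $parentRunId edges");
    }
    current = parentId;
    hops += 1;
  }
}

// With the reserved attribute, the group key is a single property read.
function rootByAttribute(run: LineageRun): string {
  return run.attributes["$rootId"] ?? run.runId;
}
```

For a run N levels deep, the walk takes N hops before it can even issue the grouping query, while the attribute yields the same key directly.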

Scope and caveats

This is a working prototype opened for design direction, not a finished feature:

  1. world-local only. The shared schema and start() are done, but persistence/filter is wired only in @workflow/world-local. world-vercel and world-postgres have their own run-creation, lifecycle, and list paths that each need the same wiring.
  2. Extra read on the hot path. start() reads the parent's $rootId with one world.runs.get per nested start. The zero-I/O shape is to thread the root through the step context (WorkflowMetadata + step invoke payload). Left as a documented follow-up.
  3. Design fork. Reserved attributes is one shape (this PR); a dedicated column is another. Went with attributes per the Crons #1649 discussion.
  4. Versioning across deployments for long-lived lineages (pinning vs deploymentId: 'latest' per hop) is a real RFC question, not decided here.
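
Caveat 2 can be illustrated by contrasting the two shapes. In this sketch, `lineageViaRead` models the current approach (one lookup of the parent run per nested start), and `lineageViaContext` models the proposed follow-up. The `StepContext.rootId` field is purely hypothetical, since the PR leaves that threading undone, and the synchronous `getRun` callback is a simplification of `world.runs.get`.

```typescript
// Hypothetical shapes for illustration only.
interface StoredRun {
  runId: string;
  attributes: Record<string, string>;
}

interface StepContext {
  runId: string;
  rootId: string; // Assumed field: the root id threaded through the context.
}

// Current shape: look up the parent run to learn its $rootId (one read).
function lineageViaRead(
  getRun: (runId: string) => StoredRun | undefined,
  parentRunId: string,
): Record<string, string> {
  const parent = getRun(parentRunId);
  return {
    $rootId: parent?.attributes["$rootId"] ?? parentRunId,
    $parentRunId: parentRunId,
  };
}

// Follow-up shape: the context already carries the root, so no read is needed.
function lineageViaContext(ctx: StepContext): Record<string, string> {
  return { $rootId: ctx.rootId, $parentRunId: ctx.runId };
}
```

Both functions produce the same attributes when the context is populated correctly; the only difference is whether the hot path touches storage.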

A changeset is included (minor for the three touched packages, world-vercel/postgres gap noted).

changeset-bot Bot commented May 29, 2026

🦋 Changeset detected

Latest commit: e0b62e9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 21 packages:

| Name | Type |
| --- | --- |
| @workflow/world | Minor |
| @workflow/core | Minor |
| @workflow/world-local | Minor |
| @workflow/cli | Patch |
| @workflow/vitest | Patch |
| @workflow/web-shared | Patch |
| @workflow/web | Patch |
| @workflow/world-postgres | Patch |
| @workflow/world-testing | Patch |
| @workflow/world-vercel | Patch |
| @workflow/builders | Patch |
| @workflow/next | Patch |
| @workflow/nitro | Patch |
| workflow | Minor |
| @workflow/astro | Patch |
| @workflow/nest | Patch |
| @workflow/rollup | Patch |
| @workflow/sveltekit | Patch |
| @workflow/vite | Patch |
| @workflow/nuxt | Patch |
| @workflow/ai | Major |

vercel Bot commented May 29, 2026

@rchasman is attempting to deploy a commit to the Vercel Labs Team on Vercel.

A member of the Team first needs to authorize it.

@vercel vercel Bot left a comment

Additional Suggestion:

In the resilient-start path, rootId is omitted from the run_started event's eventData, causing cross-run lineage to be lost when run_created fails and the run is bootstrapped from the queue.

@pranaygp

since Run is serialized, returning a run already gets handled in observability and you can click the run ID to go straight to the next run ID

now that #2088 has been shipped in v5 too, you can also use attributes to have parent and child runs link to each other by setting the run IDs as you want. We could have a reserved $parentRunId attribute too for lineage tracking

wdyt @rchasman ?

rchasman added 4 commits May 28, 2026 22:07
Relate runs without a new primitive: `start()` records lineage as reserved
run attributes (`$rootId`, `$parentRunId`) rather than a dedicated column.
A run started with no parent is its own root (`$rootId === runId`); a run
started from inside another run inherits the parent's `$rootId`, so a chain
of any depth stays flat under one root. `start()` reads the parent from the
ambient step context.

- world: `attributes` on `CreateWorkflowRunRequest`, the run_created and
  resilient run_started event data, and queue `RunInput`; an `attributes`
  filter on `ListWorkflowRunsParams` so a lineage is queryable.
- core: `start()` resolves `$rootId`/`$parentRunId` and merges any
  caller-provided attributes on top.

A lineage is a query (`list({ attributes: { $rootId } })`), not an entity,
so the model stays run-centric: a cron is just a workflow that starts its
own successor, and `cron_abc123` is the root run's id.

Signed-off-by: Roey D. Chasman <rchasman@gmail.com>
…ibutes

Apply attributes provided at run creation (run_created and the resilient
run_started path), and support `list({ attributes })`, matching runs whose
attributes contain every requested key/value. Because lineage rides on
attributes — which are already preserved across lifecycle updates — it
survives started/completed/failed/cancelled with no extra handling.

Signed-off-by: Roey D. Chasman <rchasman@gmail.com>
Add a daisy-chain fixture (each tick starts its successor), a `/runs`
endpoint that filters by the `$rootId` attribute plus a `listRuns` fetcher,
and an e2e test asserting the chain runs through the real runtime and queue
and groups under one lineage, with an unrelated run excluded.

Signed-off-by: Roey D. Chasman <rchasman@gmail.com>
world-vercel and world-postgres are intentionally out of scope here.

Signed-off-by: Roey D. Chasman <rchasman@gmail.com>
rchasman force-pushed the feat/cross-run-lineage-rootid branch from bb30012 to e0b62e9 on May 29, 2026 05:07
rchasman changed the title from "feat: cross-run lineage via propagated rootId" to "feat: cross-run lineage via reserved run attributes" on May 29, 2026
@rchasman

Re-spun onto attributes per your suggestion. start() records $rootId and $parentRunId as reserved attributes (#2088); the one addition that makes it useful is an attributes filter on ListWorkflowRunsParams, so a lineage is queryable: list({ attributes: { $rootId } }).

Good call. Going through attributes turned out cleaner than the column: lineage is preserved across lifecycle updates for free (attributes already carry forward), and the same filter generalizes to any attribute, not just lineage.

I record both $parentRunId (the edge you mentioned, and the basis for a tree view later) and $rootId (the group key, so grouping is one filter instead of an N-hop walk).

Also fixed the resilient-start path the review bot flagged: attributes now flow through run_started too.

Proven end-to-end with a real daisy-chain; unit and e2e tests are green, and existing suites are unaffected. It is still world-local only, and each nested start costs one runs.get that could instead be threaded through the step context for zero I/O. Both are noted in the PR.
