DRAFT: RFC for DO tracing spans #27599

vy-ton · 2026-01-13T16:56:06Z

This draft PR captures a proposal for all DO-related spans and attributes we should have for public traces.

- Assume tracing context propagation exists
- Ideally, docs could be generated from code for public spans/attributes

github-actions · 2026-01-13T16:57:04Z

This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:

| Pattern | Owners |
| --- | --- |
| `/src/content/docs/workers/observability/` | `@irvinebroque`, `@mikenomitch`, `@nevikashah`, `@cloudflare/pcx-technical-writing` |

github-actions · 2026-01-13T17:18:09Z

Preview URL: https://08b20790.preview.developers.cloudflare.com
Preview Branch URL: https://do-spans.preview.developers.cloudflare.com

Files with changes (up to 15)

| Original Link | Updated Link |
| --- | --- |
| https://developers.cloudflare.com/workers/observability/traces/spans-and-attributes/ | https://do-spans.preview.developers.cloudflare.com/workers/observability/traces/spans-and-attributes/ |

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

joshthoward · 2026-01-13T18:14:11Z

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

+- `cloudflare.durable_object.response.rows_read`
+- `cloudflare.durable_object.response.rows_written`
+- `cloudflare.durable_object.response.bytes_written`
+- `cloudflare.durable_object.response.sql_duration_ms`


We need to understand whether this will always be 0, because it is a synchronous operation. If it is always 0, I do not think we should include it.

cc @jmorrell-cloudflare
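
To make the "always 0" concern concrete, here is a minimal sketch. It assumes the duration is read from a clock that only advances when I/O completes; that is an assumption for illustration, since the thread does not say which clock the runtime would use. `IoGatedClock` and `timed` are hypothetical names.

```python
class IoGatedClock:
    """Hypothetical clock that only advances when I/O completes."""

    def __init__(self) -> None:
        self._now_ms = 0.0

    def now_ms(self) -> float:
        return self._now_ms

    def complete_io(self, elapsed_ms: float) -> None:
        # Elapsed time becomes observable only at an I/O boundary.
        self._now_ms += elapsed_ms


def timed(clock: IoGatedClock, operation) -> float:
    """Return the duration of `operation` as seen through `clock`."""
    start = clock.now_ms()
    operation()
    return clock.now_ms() - start


clock = IoGatedClock()

# A purely synchronous operation: no I/O completes while it runs.
sync_duration_ms = timed(clock, lambda: sum(range(1000)))

# An operation that waits on I/O (modelled here as 12 ms).
io_duration_ms = timed(clock, lambda: clock.complete_io(12.0))

print(sync_duration_ms, io_duration_ms)
```

Under that assumption, a duration attribute on a synchronous operation carries no information, which is the case the comment above wants ruled out before the attribute is documented.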

joshthoward · 2026-01-13T18:15:28Z

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

 #### `durable_object_subrequest`

+- `cloudflare.durable_object.startup_duration_ms`
+- `cloudflare.durable_object.constructor_invoked`


Maybe it's also worth adding constructor_time_ms? This would depend on the time resolution that we can get for potentially synchronous operations. Some constructors might make outbound requests and this would end up being very useful.

Do you want both? Or does constructor_time_ms=0 indicate that the constructor did not run?
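
The ambiguity in that question can be sketched as follows. This is an illustration only: `constructor_time_ms` is the name floated in this thread, not a shipped attribute, and `constructor_attributes` is a hypothetical helper. Keeping the boolean and omitting the timing when the constructor did not run lets a reader tell "did not run" apart from "ran in under one clock tick".

```python
from typing import Optional

PREFIX = "cloudflare.durable_object."


def constructor_attributes(constructor_ms: Optional[float]) -> dict:
    """Build span attributes; None means the constructor did not run."""
    attrs = {PREFIX + "constructor_invoked": constructor_ms is not None}
    if constructor_ms is not None:
        # Hypothetical attribute name taken from the review comment.
        attrs[PREFIX + "constructor_time_ms"] = constructor_ms
    return attrs


# Object was already in memory: the constructor did not run.
warm = constructor_attributes(None)

# Constructor ran, but faster than the clock resolution.
fast_cold = constructor_attributes(0.0)

print(warm)
print(fast_cold)
```

With only `constructor_time_ms`, both cases above would report 0 and be indistinguishable.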

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

shrima-cf · 2026-01-13T21:06:31Z

How should I read this PR? Is the goal just to list all the spans available? Will there be documentation explaining what each span means and its expected duration (similar to what we have in https://gitlab.cfdata.org/cloudflare/ew/edgeworker/-/blob/master/src/edgeworker/scheduling/jaeger-spans.c%2B%2B)?

vy-ton · 2026-01-13T21:46:11Z

> How should I read this PR? Is the goal just to list all the spans available? Will there be documentation explaining what each span means and its expected duration (similar to what we have in https://gitlab.cfdata.org/cloudflare/ew/edgeworker/-/blob/master/src/edgeworker/scheduling/jaeger-spans.c%2B%2B)?

@shrima-cf We will certainly add explanations as part of the release. I'd love to have all spans/attributes generated from code somehow to avoid syncing issues.

Right now, read this PR as capturing all the DO-related spans and attributes we want to add.

justin-mp · 2026-01-15T09:43:15Z

I find it odd that we're adding timing attributes to spans because spans themselves are supposed to represent the time it took something to run.

In particular, I think things in the programming model that the programmer has control over should be spans. Specifically, constructor_time_ms, sql_duration_ms, and output_gate_lock_held should really be spans (which would then have different names). Generally when you're doing performance engineering, you need to go down to the primitives, and these are the primitives we give users. I could argue that queue time and the like could also be spans. Then a user knows a request is blocked because of queuing, which might be due to load.

Additionally, things like CPU time are OK as an attribute because that is a dimension orthogonal to what you get out of a span, but having wall time as an attribute makes no sense, as that's exactly what the span's duration measures.
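
A minimal sketch of the child-span modelling argued for above, with no tracing library involved. The `Span` class, the child span names `constructor` and `sql_exec`, and the `cpu_time_ms` attribute are hypothetical; the point is only that wall time falls out of each span's start and end, while CPU time has to travel as an attribute.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    name: str
    start_ms: float
    end_ms: float
    attributes: Dict[str, float] = field(default_factory=dict)
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> float:
        # Wall time is derived from the span itself, not stored as an attribute.
        return self.end_ms - self.start_ms


# CPU time is not derivable from start/end, so it stays an attribute.
request = Span("durable_object_subrequest", 0.0, 40.0,
               attributes={"cpu_time_ms": 6.0})
request.children.append(Span("constructor", 0.0, 15.0))
request.children.append(Span("sql_exec", 18.0, 30.0))

child_durations = {child.name: child.duration_ms for child in request.children}

# Time in the parent not covered by any child, for example queuing.
unaccounted_ms = request.duration_ms - sum(child_durations.values())

print(child_durations, unaccounted_ms)
```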

jmorrell-cloudflare · 2026-01-15T15:09:22Z

> I find it odd that we're adding timing attributes to spans because spans themselves are supposed to represent the time it took something to run.
>
> having wall time as an attribute makes no sense as that's exactly what the span's duration measures.

@justin-mp I pretty strongly disagree with those assertions. Creating many spans for every possible timing is one of the most common tracing anti-patterns IMO. It makes querying across many operations far harder than it needs to be, and complicates the waterfall visualization unnecessarily. You need to design your data for how you want to query it.

> Generally when you're doing performance engineering, you need to go down to the primitives, and these are the primitives we give users.

Attributes are also a core part of the span model, and there is no reason you can't put timing information there. You should think of spans as capturing all of the information about a specific operation. For performance engineering, profiling is usually the better tool.

I address this attributes-vs-child-spans tradeoff in my guest chapter in Observability Engineering:

Let's say in your system you've just shipped a new subsystem that prioritizes payload parsing for enterprise users, and you want to see the impact of that change on tail latencies across all of the regions where you have systems deployed. If all of those attributes are present on the wide event, then this is straightforward:

    SELECT
      P99(payload_parse.duration_ms)
    WHERE
      main = true AND
      service.name = "api-service"
    GROUP BY
      user.type,
      cloud.region

However this may seem like we are duplicating data available in the child spans. Surely we can accomplish the same thing by wrapping the payload parsing method in its own span?
This is a valid approach, but now if we want to query that data alongside any other data that we've captured in our wide event, querying has become much more complicated, forcing the use of JOINs:

    SELECT
      P99(parse_span.duration_ms)
    FROM spans AS main_span
    JOIN spans AS parse_span ON main_span.trace_id = parse_span.trace_id
    WHERE
      main_span.main = true AND
      main_span.service.name = "api-service" AND
      parse_span.name = "payload-parse"
    GROUP BY main_span.user.type, main_span.cloud.region

Mature observability tooling is capable of running these types of queries at the cost of additional processing, however we should keep in mind how we will be using this data. We want to prioritize quick, iterative exploration, often while responding to active incidents, and we want any engineer on our team to be able to easily navigate our observability tooling.
Viewed through this lens, identifying a few important timings and adding them to the wide event is well worth the slight data duplication.
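
The two query shapes in the excerpt can be sketched over in-memory data. This is an illustration under stated assumptions: the sample spans, the nearest-rank `p99` helper, and both function names are made up, and grouping is by `user.type` only to keep it short. Both functions return the same answer; the second needs a lookup keyed by `trace_id`, which stands in for the JOIN.

```python
import math
from collections import defaultdict


def p99(values):
    """Nearest-rank 99th percentile."""
    ordered = sorted(values)
    return ordered[math.ceil(0.99 * len(ordered)) - 1]


# Each trace has a main span carrying the timing as an attribute,
# plus a child span that measures the same work.
spans = [
    {"trace_id": "t1", "name": "request", "main": True,
     "user.type": "enterprise", "payload_parse.duration_ms": 4.0},
    {"trace_id": "t1", "name": "payload-parse", "main": False, "duration_ms": 4.0},
    {"trace_id": "t2", "name": "request", "main": True,
     "user.type": "free", "payload_parse.duration_ms": 9.0},
    {"trace_id": "t2", "name": "payload-parse", "main": False, "duration_ms": 9.0},
    {"trace_id": "t3", "name": "request", "main": True,
     "user.type": "enterprise", "payload_parse.duration_ms": 7.0},
    {"trace_id": "t3", "name": "payload-parse", "main": False, "duration_ms": 7.0},
]


def p99_from_attributes(spans):
    """Single pass: everything needed is on the main span."""
    groups = defaultdict(list)
    for span in spans:
        if span["main"]:
            groups[span["user.type"]].append(span["payload_parse.duration_ms"])
    return {key: p99(values) for key, values in groups.items()}


def p99_from_child_spans(spans):
    """Two passes: join child spans back to their main span by trace_id."""
    user_type_by_trace = {s["trace_id"]: s["user.type"] for s in spans if s["main"]}
    groups = defaultdict(list)
    for span in spans:
        if span["name"] == "payload-parse":
            groups[user_type_by_trace[span["trace_id"]]].append(span["duration_ms"])
    return {key: p99(values) for key, values in groups.items()}


print(p99_from_attributes(spans))
print(p99_from_child_spans(spans))
```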

lambrospetrou · 2026-01-15T17:50:54Z

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

+- `cloudflare.durable_object.response.rows_read`
+- `cloudflare.durable_object.response.rows_written`
+- `cloudflare.durable_object.response.bytes_written`


Are these added after the returned cursor is iterated, so after the operation function returns, or before the cursor is iterated, in which case rows_read will be zero?

Similarly, could we also have "bytes_read"?

> Are these added after the returned cursor is iterated, so after the operation function returns, or before the cursor is iterated, in which case rows_read will be zero?

For usefulness, I would expect after the cursor is iterated. But we need to confirm.

Added bytes_read
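
The timing question above can be illustrated with a lazy cursor. This is a sketch, not the storage API: `CountingCursor` and `exec_query` are hypothetical, and it assumes rows are counted only as the caller consumes them. Under that assumption, a value captured when the operation returns is 0, and the useful value exists only after iteration.

```python
from typing import Iterable, Iterator, Tuple


class CountingCursor:
    """Hypothetical lazy cursor that counts rows as they are consumed."""

    def __init__(self, rows: Iterable[Tuple]) -> None:
        self._rows = iter(rows)
        self.rows_read = 0

    def __iter__(self) -> Iterator[Tuple]:
        return self

    def __next__(self) -> Tuple:
        row = next(self._rows)  # raises StopIteration when exhausted
        self.rows_read += 1
        return row


def exec_query() -> CountingCursor:
    # Returns immediately; no rows have been read yet.
    return CountingCursor([(1, "a"), (2, "b"), (3, "c")])


cursor = exec_query()
rows_read_at_return = cursor.rows_read

results = list(cursor)
rows_read_after_iteration = cursor.rows_read

print(rows_read_at_return, rows_read_after_iteration)
```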

shrima-cf · 2026-01-16T20:58:21Z

@vy-ton In addition to spans, there are also tags that hold useful information. Alex added a couple for the input/output gate spans in cloudflare/workerd#5827.
Are you planning on adding these to the documentation as well?

vy-ton · 2026-01-22T20:18:28Z

> @vy-ton In addition to spans, there are also tags that hold useful information. Alex added a couple for the input/output gate spans in cloudflare/workerd#5827. Are you planning on adding these to the documentation as well?

I would not expose the span attributes in cloudflare/workerd#5827 - those seem pretty internal-only to me.

vy-ton · 2026-01-22T20:19:42Z

src/content/docs/workers/observability/traces/spans-and-attributes.mdx

+- `cloudflare.durable_object.output_gate_lock_hold_ms`
+- `cloudflare.durable_object.output_gate_lock_wait_ms`
+
+#### For [handlers](/workers/observability/traces/spans-and-attributes/#handlers) invoked on a Durable Object such as RPC or fetch(), these attributes exist:


@lambrospetrou you mentioned having primary/replica context, what other attributes would you expect here?

DO tracing spans

a3f2d20

github-actions bot added the size/s label Jan 13, 2026

github-actions bot assigned irvinebroque, mikenomitch and nevikashah Jan 13, 2026

github-actions bot added the product:workers Related to Workers product label Jan 13, 2026

RPC tracing

e542e47

joshthoward reviewed Jan 13, 2026

joshthoward reviewed Jan 13, 2026

joshthoward reviewed Jan 13, 2026

vy-ton requested a review from jmorrell-cloudflare January 13, 2026 18:55

vy-ton commented Jan 13, 2026

vy-ton added 2 commits January 13, 2026 15:14

address Josh's feedback

db8bc2a

clarify sql vs kv

2042ac0

lambrospetrou reviewed Jan 15, 2026

address initial comments

08b2079

vy-ton commented Jan 22, 2026

vy-ton changed the title ~~DRAFT: DO tracing spans~~ DRAFT: RFC for DO tracing spans Jan 22, 2026

9 participants
