Skip to content

OpenTelemetry: metrics for key lookup and remote document fetches#771

Merged
dahlia merged 16 commits into
fedify-dev:mainfrom
dahlia:otel/key-remote-doc-fetches
May 20, 2026
Merged

OpenTelemetry: metrics for key lookup and remote document fetches#771
dahlia merged 16 commits into
fedify-dev:mainfrom
dahlia:otel/key-remote-doc-fetches

Conversation

@dahlia
Copy link
Copy Markdown
Member

@dahlia dahlia commented May 18, 2026

Summary

Resolves #738 (sub-task of #316). Fedify already produces spans (activitypub.fetch_key, activitypub.fetch_document, activitypub.lookup_object) and the signature-scoped activitypub.signature.key_fetch.duration histogram, but operators cannot answer aggregate questions like “how often are key lookups served from cache?”, “which remote host is responsible for slow document fetches?”, or “what share of lookupObject() calls resolve to an Actor?” with the existing telemetry. This PR adds six new instruments and the wiring to emit them across Fedify's loader and lookup paths.

New instruments, all defined on FederationMetrics in packages/fedify/src/federation/metrics.ts except activitypub.object.lookup, which lives in packages/vocab/src/lookup.ts to avoid a circular dependency on @fedify/fedify:

  • activitypub.key.lookup (counter) and activitypub.key.lookup.duration (Histogram) cover every fetchKey()/fetchKeyDetailed() call, including the three signature verification paths. The shared classifyFetchError() classifier maps outcomes to hit, fetched, not_found (with the HTTP status code), invalid (a successful HTTP response that did not parse into a usable key), network_error, and error. The existing activitypub.signature.key_fetch.duration histogram stays in place; it keeps the activitypub.signature.kind dimension that is the right way to slice signature verification latency by HTTP/Linked Data/Object Integrity.
  • activitypub.document.fetch (counter) and activitypub.document.fetch.duration (Histogram) cover every Fedify-wrapped document or context loader invocation, including the authenticated loader. The instrumentDocumentLoader() wrapper records the requested URL's hostname only, runs failures through classifyFetchError() (which now also recognises a bare TypeError and AbortError as network_error, the shapes native fetch() raises on DNS/connect/TLS/abort failures before any response is observed), and rethrows the original error so caller behavior is unchanged.
  • activitypub.document.cache (counter) records hit/miss from inside kvCache() itself, where the cache decision is actually made. Preloaded JSON-LD contexts and rule-less calls do not consult the KV cache and therefore do not emit a measurement.
  • activitypub.object.lookup (counter) records every lookupObject() call, classified as actor, object, or other (the last covers both null resolutions and thrown failures, since the measurement is emitted in a finally block). It has no companion duration histogram on purpose: the underlying document loader already records activitypub.document.fetch.duration and a second duration here would double-count latency for the same wire-level fetch.

All instruments share a bounded activitypub.lookup.kind and activitypub.lookup.result attribute taxonomy, plus activitypub.remote.host (hostname only, never the full URL or path), http.response.status_code (only when an HTTP response was observed), and activitypub.cache.enabled where Fedify can confidently report cache presence. The TypeScript types LookupKind, LookupResult, DocumentFetchKind, DocumentFetchResult, and KeyLookupResult enforce the bounded value sets at compile time, so impossible combinations like recordDocumentFetch with kind: "public_key" fail in the type-checker rather than landing as a stray attribute series at runtime. The host-extraction helper for lookupObject() is strict: it parses URLs, handles acct:user@host URIs in both string and URL forms, and bare @user@host/user@host handles, but rejects anything that contains path/query/fragment/whitespace separators so adversarial inputs cannot inflate cardinality on activitypub.remote.host.

Key IDs, actor IDs, object IDs, JSON-LD context URLs, full URLs, and fediverse handles are deliberately excluded from every new metric attribute. They remain on the corresponding spans for trace-level investigation.

The document and context loader wrappers short-circuit to the inner loader when no meterProvider is configured on createFederation(). That preserves strict reference identity for Context.documentLoader, Context.contextLoader, and the authenticated document loader, so existing tests asserting ctx.documentLoader === userLoader continue to pass. Every other metric, including activitypub.key.lookup and activitypub.object.lookup when reached through Context.lookupObject(), follows Fedify's standard “fall back to the global MeterProvider“ behavior. A > [!NOTE] block in docs/manual/opentelemetry.md documents this distinction so it is not a surprise.

The authenticatedDocumentLoaderFactory wrapper also now forwards the optional second DocumentLoaderFactoryOptions argument so user-supplied authenticated factories keep receiving their per-call allowPrivateAddress/userAgent overrides; a regression test in packages/fedify/src/federation/middleware.test.ts pins that behavior.

Tests live alongside each piece of plumbing: helper-level coverage in packages/fedify/src/federation/metrics.test.ts, cache hit/miss in packages/fedify/src/utils/kv-cache.test.ts, key-lookup outcomes (hit, fetched, 404, TypeError) in packages/fedify/src/sig/key.test.ts, the HTTP signature verifier's meterProvider forwarding in packages/fedify/src/sig/http.test.ts, Federation-level smoke tests in packages/fedify/src/federation/middleware.test.ts, and lookupObject() actor classification (including the strict host extraction for acct: URIs and adversarial handle suffixes) in packages/vocab/src/lookup.test.ts. The metric table, per-metric attribute breakdown, the new note about opt-in vs global-fallback semantics, and the CHANGES.md entry for Version 2.3.0 are all included.

Test plan

  • mise check passes
  • mise test:deno passes
  • mise test:node passes
  • mise run docs:build succeeds against the new docs/manual/opentelemetry.md additions
  • Spot-check on a real application: configure a MeterProvider, drive a fediverse fetch, and confirm the new instruments appear with the documented attributes and bounded values (cache hit/miss on a repeated lookupObject, kind=actor vs kind=object split, not_found row with http.response.status_code on a 404 key fetch)

dahlia added 9 commits May 18, 2026 18:48
Introduce five new OpenTelemetry instruments in FederationMetrics, with
bounded attribute taxonomies so operators get aggregate visibility into
public-key lookups and remote JSON-LD document fetches without exploding
metric cardinality:

  - activitypub.key.lookup (counter)
  - activitypub.key.lookup.duration (histogram)
  - activitypub.document.fetch (counter)
  - activitypub.document.fetch.duration (histogram)
  - activitypub.document.cache (counter)

The taxonomy is encoded as the LookupKind and LookupResult union types,
together with the narrower DocumentFetchKind / DocumentFetchResult /
KeyLookupResult aliases so the type system rejects impossible
combinations (e.g. recordDocumentFetch with kind public_key, or a key
lookup ending in cache miss).  recordKeyLookup, recordDocumentFetch, and
recordDocumentCache take a remoteUrl: URL rather than a host string and
derive activitypub.remote.host through the existing getRemoteHost()
helper, so callers cannot accidentally smuggle key IDs, actor URLs, or
full URLs into a metric attribute.  classifyFetchError centralises the
result classification, mapping FetchError responses to not_found
(404 / 410) or error with their status code and treating FetchError
without a Response, bare TypeError (the shape native fetch() raises on
DNS / connect / TLS failures), and AbortError as network_error.  Counter
and histogram are always emitted together on document fetches so
aggregate rate and latency views stay in sync.

Only the instrument and helper plumbing lands in this commit; call-site
wiring in fetchKey, the document loader, kvCache, and the Federation
middleware follows in subsequent commits.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Extend `kvCache()` with optional `meterProvider` and `kind`
parameters so it can emit one `activitypub.document.cache` measurement
each time it consults the KV store: `result=miss` when it falls through
to the inner loader and `result=hit` when the KV store returns a cached
`RemoteDocument`.  The metric is recorded through `recordDocumentCache()`
from Step 1, which derives `activitypub.remote.host` from a `URL` and
keeps the bounded `activitypub.lookup.kind` / `activitypub.lookup.result`
attribute taxonomy.  `kind` defaults to `"object"` so the generic
document-loader case does not need a new option, and context loaders
opt in by passing `kind: "context"`.  Preloaded-context short-circuits
and rule-less calls do not consult the KV cache and therefore do not
emit a measurement.

The next commit wires this option through the Federation middleware's
document and context loader factories.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Add `instrumentDocumentLoader()` to the federation metrics module so any
`DocumentLoader` Fedify wraps can record one
`activitypub.document.fetch` + one `activitypub.document.fetch.duration`
measurement per call.  The wrapper

  - takes a meter provider, a bounded `DocumentFetchKind`, and an
    optional `cacheEnabled` flag,
  - parses the requested URL once and derives
    `activitypub.remote.host` via `getRemoteHost()`, omitting the
    attribute when the URL does not parse,
  - records `result=fetched` on success and runs
    `classifyFetchError()` on rejection (rethrowing the original
    error), so non-2xx HTTP responses surface as `not_found` or
    `error` with `http.response.status_code`, and native fetch
    failures (`TypeError`, `AbortError`, `FetchError` without a
    response) surface as `network_error`, and
  - short-circuits to the inner loader untouched when no meter
    provider is supplied, so opt-out is the default and applications
    without OpenTelemetry pay nothing.

`classifyFetchError()`'s return type is narrowed from `LookupResult`
to `DocumentFetchResult` since it never returns `hit` or `miss`; that
also makes it directly assignable to both `DocumentFetchAttributes`
and `KeyLookupAttributes`.

The Federation middleware wiring that uses this wrapper follows in
the next commit.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Add an optional `meterProvider` to `FetchKeyOptions` and instrument the
shared `fetchKeyWithResult()` helper so every `fetchKey()` and
`fetchKeyDetailed()` call now records exactly one
`activitypub.key.lookup` + `activitypub.key.lookup.duration`
measurement, classified by outcome:

  - cached valid key or cached negative entry → result=hit
  - fresh fetch + parseable key → result=fetched
  - fresh fetch + no usable key (resolveFetchedKey returns null)
    → result=invalid
  - documentLoader threw → classified via `classifyFetchError()`,
    surfacing `http.response.status_code` on `FetchError` rows and
    mapping 404 / 410 to not_found, FetchError without a response /
    TypeError / AbortError to network_error
  - any unexpected throw upstream → result=error (default outcome
    is never overridden before the failing line)

Each measurement carries `activitypub.lookup.kind=public_key`,
`activitypub.cache.enabled` (true when `options.keyCache` is set), the
key URL's hostname under `activitypub.remote.host`, and the HTTP status
code when an HTTP response was observed.  The outcome is set only on
the successful return paths, so a throw from `onCachedUnavailable` or
`onFetchError` leaves the default `error` outcome rather than
mislabelling a partial failure as `hit`.

The HTTP signature verifier (`sig/http.ts`) had two `fetchKeyDetailed`
call sites — the draft-cavage path and the RFC 9421 path — that only
forwarded `tracerProvider` and not `meterProvider`.  Both now thread
`meterProvider` through, and the verifier-path test asserts the new
`activitypub.key.lookup` measurement lands on the configured test
provider.  The Linked Data Signatures and Object Integrity Proofs
paths already forward the full `options` object, so spreading
`options.meterProvider` is enough there.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Layer the new OpenTelemetry instrumentation onto the loader factories
that `FederationImpl` constructs.  The default `documentLoaderFactory`
now produces a `kvCache()` configured with `meterProvider` and
`kind: "object"`, and a separate default `contextLoaderFactory` does
the same with `kind: "context"`.  Every factory's output is wrapped
in `instrumentDocumentLoader()` so each call records exactly one
`activitypub.document.fetch` plus `activitypub.document.fetch.duration`,
and the `kvCache()` layer emits `activitypub.document.cache` hit/miss.
The authenticated document loader factory is also wrapped, with
`cacheEnabled: false` since it intentionally bypasses the cache.

`activitypub.cache.enabled` is set to `true` only for the built-in
factories where Fedify can guarantee `kvCache()` is in the path; for
user-supplied factories caching is opaque, so the attribute is omitted
rather than recorded as a confident `true`/`false`.  When no
`meterProvider` is configured, `instrumentDocumentLoader()` short-
circuits and returns the inner loader unchanged, so `ctx.documentLoader`
and `ctx.contextLoader` keep strict reference equality with
user-supplied loaders for callers that do not use OpenTelemetry.

The historical fallback where omitting `contextLoaderFactory` reuses
the (possibly user-customised) `documentLoaderFactory` is preserved;
the wrapping just relabels the metric attributes to `kind=context` at
the call site.

The `authenticatedDocumentLoaderFactory` wrapper now also forwards
the optional `DocumentLoaderFactoryOptions` second argument so
user-supplied authenticated factories still receive their per-call
`allowPrivateAddress` / `userAgent` overrides.

Three new tests in `middleware.test.ts` cover:

  - `ctx.documentLoader` records `kind=object` + omitted
    `cache.enabled` when the user supplies a custom factory.
  - `ctx.contextLoader` records `kind=context`.
  - `ctx.documentLoader` and `ctx.contextLoader` are strict-equal to
    the user-supplied loader when no `meterProvider` is configured.

A fourth test verifies that a user-supplied
`authenticatedDocumentLoaderFactory` still receives its second
`DocumentLoaderFactoryOptions` argument.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Record a new `activitypub.object.lookup` counter at every
`lookupObject()` call so operators can see how many lookups resolved
to an Actor, to a non-actor Object, or to nothing — without having
to mix that classification into the loader-level
`activitypub.document.fetch` counter (which would over-count
unique remote fetches).

The counter takes `activitypub.lookup.kind` ∈ `{actor, object, other}`
and an optional `activitypub.remote.host` (hostname only).  The
host-extraction helper handles `URL` instances, plain `https://…`
strings, `acct:user@host` URIs (both `URL` and string forms), and bare
fediverse handles (`@user@host`, `user@host`).  Anything that does not
reduce cleanly to an authority — paths, query strings, fragments, or
whitespace mixed in with the handle suffix — is dropped from the
metric attribute rather than recorded, so adversarial inputs cannot
inflate cardinality on `activitypub.remote.host`.

The metric is opt-in: `lookupObject()` only emits when its caller
passes a `meterProvider`.  `Context.lookupObject()` threads
`this.meterProvider` through so Federation-managed call sites get the
counter automatically; direct callers of `lookupObject()` that do not
set up OpenTelemetry pay nothing.

The instrument lives in `@fedify/vocab` (the same package as
`lookupObject` itself), so the recording site does not introduce a
circular dependency on `@fedify/fedify`.  The counter is created
lazily per `MeterProvider` via a `WeakMap` cache that mirrors the
`getFederationMetrics` pattern.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Extend `docs/manual/opentelemetry.md` with the new instruments
introduced for issue fedify-dev#738.  The "Instrumented metrics" table gains
six rows for `activitypub.key.lookup`, `activitypub.key.lookup.duration`,
`activitypub.document.fetch`, `activitypub.document.fetch.duration`,
`activitypub.document.cache`, and `activitypub.object.lookup`, and
each receives its own attribute subsection.

The prose explains the shared bounded `activitypub.lookup.kind` and
`activitypub.lookup.result` taxonomy, when `activitypub.cache.enabled`
and `http.response.status_code` are recorded, what is deliberately
excluded from the metric attributes for cardinality discipline, and
how the new `activitypub.key.lookup.duration` complements the
signature-scoped `activitypub.signature.key_fetch.duration` histogram
that already covers the HTTP / Linked Data / Object Integrity slice.

`CHANGES.md` gets a 2.3.0 entry under `@fedify/fedify` summarising
the new instruments, the attribute discipline, and the relationship
to the existing signature-scoped key-fetch histogram, with a link
reference to issue fedify-dev#738.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
Add a `[!NOTE]` block to the "Explicit MeterProvider configuration"
section that calls out which metrics are opt-in (only emit when an
explicit `meterProvider` is configured) and which follow the standard
"fall back to the global `MeterProvider`" behavior.

Only `activitypub.document.fetch[.duration]` and
`activitypub.document.cache` are opt-in inside Fedify, because the
`instrumentDocumentLoader()` wrapper short-circuits to the original
loader when no meter provider is supplied so that
`ctx.documentLoader === userLoader` continues to hold and existing
reference-identity tests do not break.

Every other metric — including `activitypub.key.lookup[.duration]`
and `activitypub.object.lookup` when reached through
`Context.lookupObject()` — uses Fedify's standard global-provider
fallback because the recording helpers do not short-circuit and
`Context.meterProvider` resolves to the global default when nothing
was configured.  A trailing sentence clarifies that direct
`lookupObject()` calls from `@fedify/vocab` (bypassing the Federation
context) still require an explicit `LookupObjectOptions.meterProvider`
since they have no Federation-level provider to fall back on.

fedify-dev#316
fedify-dev#738

Assisted-by: Claude Code:claude-opus-4-7
Assisted-by: Codex:gpt-5.5
@dahlia dahlia added this to the Fedify 2.3 milestone May 18, 2026
@dahlia dahlia self-assigned this May 18, 2026
@dahlia dahlia added type/enhancement component/federation Federation object related labels May 18, 2026
@dahlia dahlia added component/signatures OIP or HTTP/LD Signatures related component/otel OpenTelemetry integration labels May 18, 2026
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

@codex review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds bounded OpenTelemetry instruments and helpers for public-key lookups, remote JSON‑LD document fetches and cache hit/miss, and object/actor lookups; threads a MeterProvider through federation middleware and signature/key-fetch paths; extends kvCache and lookupObject to emit opt-in metrics; adds tests and docs.

Changes

OpenTelemetry metrics for federation

Layer / File(s) Summary
Metric taxonomy and FederationMetrics instrumentation
packages/fedify/src/federation/metrics.ts
Defines LookupKind/LookupResult and attribute interfaces; FederationMetrics gains counters/histograms and recording methods for key lookup, document fetch/duration, and document cache.
Error classification and document loader instrumentation
packages/fedify/src/federation/metrics.ts
classifyFetchError(error) maps thrown values to bounded DocumentFetchResult (+ optional status code). instrumentDocumentLoader(loader, options) wraps DocumentLoader calls to measure duration and record result/cache/host attributes, rethrowing errors unchanged.
Federation middleware loader factory instrumentation
packages/fedify/src/federation/middleware.ts
createFederation() wraps built-in and user-supplied documentLoaderFactory/contextLoaderFactory/authenticatedDocumentLoaderFactory with instrumentDocumentLoader, preserves prior fallback semantics, forwards per-call options, and threads meterProvider into lookupObject calls.
Key lookup instrumentation
packages/fedify/src/sig/key.ts
FetchKeyOptions adds optional meterProvider; refactors fetchKeyWithResult to measure lookup duration, classify outcome, and record activitypub.key.lookup/duration via recordKeyLookup in a finally block.
Signature verification meterProvider forwarding
packages/fedify/src/sig/http.ts
verifyRequestDraft() and verifyRequestRfc9421() now pass meterProvider into fetchKeyDetailed() so key lookup metrics are emitted during verification.
Cache metrics in kvCache
packages/fedify/src/utils/kv-cache.ts
KvCacheParameters now accepts optional meterProvider and kind; kvCache records activitypub.document.cache hit/miss metrics (with parsed remote host when possible) using recordDocumentCache.
Object lookup instrumentation
packages/vocab/src/lookup.ts
LookupObjectOptions gains optional meterProvider; lookupObject classifies resolved value as actor/object/other and records activitypub.object.lookup counter with activitypub.lookup.kind and optional activitypub.remote.host.
Tests & integration
packages/fedify/src/federation/metrics.test.ts, packages/fedify/src/federation/middleware.test.ts, packages/fedify/src/sig/key.test.ts, packages/fedify/src/sig/http.test.ts, packages/fedify/src/utils/kv-cache.test.ts, packages/vocab/src/lookup.test.ts
Adds extensive tests for metric recording, error classification, loader instrumentation behavior, federation factory wrapping, key lookup scenarios (hit/miss/404/network), cache hit/miss metrics, signature forwarding, and lookupObject counter behavior.
Documentation and changelog
docs/manual/opentelemetry.md, CHANGES.md
Documents new instruments, attribute taxonomy, bounded cardinality rules, and notes that document-loader–related metrics are emitted only when meterProvider is explicitly provided; adds changelog entry and missing footnote.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • #316 — The changes implement OpenTelemetry metrics for key/document/object lookups and caching, which closely match the objectives described in this issue.

Suggested labels

type/enhancement

Suggested reviewers

  • sij411
  • 2chanhaeng
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.06% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'OpenTelemetry: metrics for key lookup and remote document fetches' directly and clearly describes the main change—adding new OpenTelemetry metrics for key lookups and document fetches.
Description check ✅ Passed The description is comprehensive and well-related to the changeset, explaining the motivation, new instruments, attributes, implementation details, and testing plan.
Linked Issues check ✅ Passed All coding objectives from issue #738 are met: activitypub.key.lookup and duration metrics implemented, activitypub.document.fetch and duration metrics added, activitypub.document.cache counter included, activitypub.object.lookup counter added, bounded attribute taxonomies enforced via TypeScript types, full identifiers excluded from metrics, and comprehensive documentation and tests provided.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing OpenTelemetry metrics for key lookup and document fetches as specified in issue #738; no out-of-scope modifications were introduced.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/vocab/src/lookup.test.ts`:
- Around line 857-858: Teardown currently calls fetchMock.removeRoutes() but
does not fully restore the global fetch after using fetchMock.spyGlobal();
update the test teardown to call fetchMock.hardReset() (in addition to
removeRoutes()) so the global fetch spy is removed and state is not leaked
across tests; locate the teardown where fetchMock.removeRoutes() is invoked and
append a call to fetchMock.hardReset().
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f9497b6f-7cfb-448f-b2b2-f5b50d666c4d

📥 Commits

Reviewing files that changed from the base of the PR and between 6cc0266 and 3df79b5.

📒 Files selected for processing (14)
  • CHANGES.md
  • docs/manual/opentelemetry.md
  • packages/fedify/src/federation/metrics.test.ts
  • packages/fedify/src/federation/metrics.ts
  • packages/fedify/src/federation/middleware.test.ts
  • packages/fedify/src/federation/middleware.ts
  • packages/fedify/src/sig/http.test.ts
  • packages/fedify/src/sig/http.ts
  • packages/fedify/src/sig/key.test.ts
  • packages/fedify/src/sig/key.ts
  • packages/fedify/src/utils/kv-cache.test.ts
  • packages/fedify/src/utils/kv-cache.ts
  • packages/vocab/src/lookup.test.ts
  • packages/vocab/src/lookup.ts

Comment thread packages/vocab/src/lookup.test.ts Outdated
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive set of OpenTelemetry metrics to monitor ActivityPub operations, including public key lookups, remote JSON-LD document fetches, KV cache performance, and object classification. The implementation adds instrumentation to core loaders and signature verification paths while ensuring that document and context loader metrics remain opt-in to preserve reference identity for existing tests. The review feedback identifies a security improvement opportunity in the authenticated document loader to mitigate SSRF risks by strictly blocking private network access for indirect resource fetches, such as JSON-LD contexts, even when direct fetches are relaxed.

Comment thread packages/fedify/src/federation/middleware.ts
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Nice work!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@codecov
Copy link
Copy Markdown

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 96.48241% with 14 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
packages/vocab/src/lookup.ts 89.06% 3 Missing and 4 partials ⚠️
packages/fedify/src/federation/middleware.ts 93.65% 0 Missing and 4 partials ⚠️
packages/fedify/src/utils/kv-cache.ts 91.30% 1 Missing and 1 partial ⚠️
packages/fedify/src/sig/key.ts 98.38% 0 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
packages/fedify/src/federation/metrics.ts 98.89% <100.00%> (+0.67%) ⬆️
packages/fedify/src/sig/http.ts 79.97% <100.00%> (+0.02%) ⬆️
packages/fedify/src/sig/key.ts 88.42% <98.38%> (+1.45%) ⬆️
packages/fedify/src/utils/kv-cache.ts 85.71% <91.30%> (+1.09%) ⬆️
packages/fedify/src/federation/middleware.ts 96.26% <93.65%> (-0.05%) ⬇️
packages/vocab/src/lookup.ts 96.38% <89.06%> (-2.99%) ⬇️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

The new `lookupObject() records activitypub.object.lookup counter`
test block installs a `fetchMock.spyGlobal()` spy but previously only
called `fetchMock.removeRoutes()` in teardown, leaving the global
`fetch` spy installed after the block finished.  Call
`fetchMock.hardReset()` after `removeRoutes()` so the global `fetch`
is restored, matching the pattern the file's other `spyGlobal()`
blocks already use.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

/gemini review

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/vocab/src/lookup.test.ts`:
- Around line 701-703: The fetch-mock teardown is currently conditional and can
leak if a test fails; immediately register unconditional cleanup right after
calling fetchMock.spyGlobal() (e.g., via t.after(...) or by wrapping the test
body in try/finally) so that fetchMock.hardReset() / fetchMock.restore() always
run; update the locations where fetchMock.spyGlobal() is used (see the block
containing fetchMock.spyGlobal(), fetchMock.removeRoutes(), fetchMock.get(...)
and the similar occurrence around lines 857-858) to ensure teardown is
registered before any assertions or setup that might throw.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2de61636-5c5c-4900-a54f-0bee571b7869

📥 Commits

Reviewing files that changed from the base of the PR and between 3df79b5 and 1ddf69a.

📒 Files selected for processing (1)
  • packages/vocab/src/lookup.test.ts

Comment thread packages/vocab/src/lookup.test.ts Outdated
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces comprehensive OpenTelemetry metrics for ActivityPub operations, including public key lookups, remote JSON-LD document fetches, KV-backed cache hits/misses, and lookupObject() classifications. It adds new counters and histograms to track latency and success rates while carefully excluding high-cardinality data to prevent cardinality explosion. The implementation includes a new instrumentDocumentLoader utility and updates to the core federation engine, signature verification, and vocabulary lookup logic. Feedback was provided regarding several violations of the repository's style guide in the documentation, specifically concerning the incorrect use of spaces around em dashes in narrative text.

Comment thread docs/manual/opentelemetry.md Outdated
Comment thread docs/manual/opentelemetry.md Outdated
Comment thread docs/manual/opentelemetry.md Outdated
Comment thread docs/manual/opentelemetry.md
Comment thread docs/manual/opentelemetry.md Outdated
Comment thread docs/manual/opentelemetry.md Outdated
Comment thread docs/manual/opentelemetry.md Outdated
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🚀

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

dahlia added 2 commits May 18, 2026 21:50
The new `lookupObject() records activitypub.object.lookup counter`
test block calls `fetchMock.spyGlobal()` at the top and pairs it with
`fetchMock.removeRoutes(); fetchMock.hardReset();` at the bottom, but
the cleanup was unconditional only on the happy path: if any
`await t.step(...)` propagated a rejection, the global `fetch` spy
stayed installed and leaked into later tests.

Wrap the block body in `try { ... } finally { ... }` so the spy is
always torn down, even when an assertion inside a step fails.
`t.after()` was not used because Deno's `TestContext` does not expose
an `after` hook and the fixture wrapper is built around the
Deno-native context shape.

Existing test steps were re-indented one level deeper to fit inside
the try block; no test logic changed.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
Seven em-dashes in narrative paragraphs of the OpenTelemetry guide
used surrounding spaces, which the project style guide explicitly
forbids in running prose ("em dashes in narrative text should not
have surrounding spaces").  Gemini's code-assist bot flagged all
seven on the PR.

Rather than just removing the spaces around each em-dash, rewrite
the punctuation so the prose does not use em-dashes at all.  Em
dashes read as LLM-style in this codebase, and the replacements
(commas, semicolons, colons, parentheses, or a sentence split) all
read better in context.

No semantic content changed.

fedify-dev#771 (comment)
fedify-dev#771 (comment)
fedify-dev#771 (comment)
fedify-dev#771 (comment)
fedify-dev#771 (comment)
fedify-dev#771 (comment)
fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry metrics for public key lookups, remote JSON-LD document fetches, and object lookups to enhance observability of cache hits and fetch performance. New instruments like activitypub.key.lookup and activitypub.document.fetch track success rates and latencies across the federation layer. The implementation includes instrumentation for document loaders and key fetchers, along with updated documentation and extensive test coverage. Feedback suggests that the authenticated document loader factory should be modified to respect per-call options such as allowPrivateAddress and userAgent instead of relying solely on federation-level defaults.

Comment thread packages/fedify/src/federation/middleware.ts Outdated
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The built-in default `authenticatedDocumentLoaderFactory` previously
ignored its second `factoryOpts: DocumentLoaderFactoryOptions`
argument and always built the authenticated loader with the
constructor-level `allowPrivateAddress` / `userAgent`.  That was
inconsistent with the regular `documentLoaderFactory` and
`contextLoaderFactory` defaults a few lines up, which already use
`opts?.allowPrivateAddress ?? allowPrivateAddress` and
`opts?.userAgent ?? userAgent` to honor per-call overrides.

Mirror that pattern in the authenticated default so the constructor
defaults remain the fallback but per-call overrides now win.  The
user-supplied auth factory branch already forwards `factoryOpts`
directly via `userAuthFactory(identity, factoryOpts)` (added earlier
in this PR), so both branches now behave consistently.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces comprehensive OpenTelemetry metrics for ActivityPub operations, including public key lookups, remote JSON-LD document fetches, and object classification within lookupObject(). It adds new counters and histograms to track latency, cache performance, and resolution results while maintaining strict cardinality bounds on attributes. The implementation includes an opt-in mechanism for loader metrics to preserve reference identity for existing tests. Documentation has been updated to reflect these changes, and extensive unit and integration tests have been added across the federation and vocabulary packages. I have no feedback to provide as there were no review comments to assess.

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 18, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/fedify/src/sig/key.test.ts Outdated
Comment thread packages/vocab/src/lookup.ts Outdated
dahlia added 2 commits May 20, 2026 01:13
The `activitypub.object.lookup` measurement's `kind` was assigned
after `lookupObjectInternal` returned, but only past several lines
that can throw, including `result.toJsonLd(options)` inside the
`span.addEvent("activitypub.object.fetched", ...)` payload.  If any
of those threw, `kind` stayed at its initial `"other"` value, so the
`finally` block recorded `kind=other` even when `lookupObjectInternal`
had returned a non-null Actor or Object.

Classify the result with `kind = isActor(result) ? "actor" : "object"`
immediately after `lookupObjectInternal` returns (before the span
attributes and span event work).  A thrown error past that point now
still records the correct `kind` value on the metric.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
The test was named "fetchKey() omits the meter provider has no
effect on behavior", which reads as if the subject is "fetchKey()
omits the meter provider" and the verb is "has no effect" (it is
not a complete clause).

Rename it to "fetchKey() works when meterProvider is omitted",
which matches the test's actual purpose: a sanity check that
omitting the new optional `meterProvider` keeps `fetchKey()`
functional.  No test logic changed.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 19, 2026

/gemini review

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 19, 2026

@codex review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces comprehensive OpenTelemetry instrumentation for Fedify, adding new metrics for public key lookups, remote JSON-LD document fetches, document cache lookups, and object resolution. It includes the necessary metric definitions, helper functions for recording these metrics, and updates to the document loader and key lookup logic to emit these events. The changes are accompanied by extensive tests and updated documentation. The review comment regarding the exclusion of 'invalid' from 'DocumentFetchResult' is valid and should be addressed to ensure type safety.

Comment thread packages/fedify/src/federation/metrics.ts Outdated
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

`DocumentFetchResult` was previously defined as `Exclude<LookupResult,
"hit" | "miss">`, which still left `"invalid"` in the type union.  But
the document-loader instrumentation never produces `"invalid"`: the
success path always emits `"fetched"`, and `classifyFetchError()` only
returns `"not_found"`, `"network_error"`, or `"error"`.  `"invalid"`
is reserved for the key-lookup family, where the parser can decide
that a successful HTTP response still does not contain a usable key,
which the existing `activitypub.document.fetch` docs already call
out.

Narrow `DocumentFetchResult` to
`Exclude<LookupResult, "hit" | "miss" | "invalid">` so the unreachable
variant becomes a compile-time error if any future code path tries to
record `result: "invalid"` on the document-fetch family.  JSDoc
updated to mention the third exclusion and why.

No runtime behavior changes; this is a type-safety tightening.

fedify-dev#771 (comment)

Assisted-by: Claude Code:claude-opus-4-7
@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 19, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces OpenTelemetry metrics for public key lookups, remote JSON-LD document fetches, and object lookups—improving observability of cache hits and fetch latency. The implementation spans the federation middleware, signature verification, and KV cache utilities, ensuring high-cardinality identifiers are excluded to prevent metric inflation. I have no feedback to provide as no review comments were submitted.

@dahlia
Copy link
Copy Markdown
Member Author

dahlia commented May 19, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. What shall we delve into next?

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@dahlia dahlia merged commit 553b59b into fedify-dev:main May 20, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/federation Federation object related component/otel OpenTelemetry integration component/signatures OIP or HTTP/LD Signatures related

Development

Successfully merging this pull request may close these issues.

OpenTelemetry metrics for key lookup and remote document fetches

2 participants