diff --git a/CHANGES.md b/CHANGES.md index d3f47fe30..1a095e50e 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -151,6 +151,49 @@ To be released. `http.response.status_code` and a richer `activitypub.lookup.result` taxonomy. [[#316], [#738], [#771]] + - Added OpenTelemetry metrics for the WebFinger and actor handle + discovery paths so operators can graph aggregate discovery rate, + latency, and outcome mix without sampling spans: + + - `webfinger.lookup` (counter) and `webfinger.lookup.duration` + (histogram) cover outgoing `lookupWebFinger()` calls. + - `webfinger.handle` (counter) and `webfinger.handle.duration` + (histogram) cover incoming WebFinger requests handled by + `Federation.fetch()`. + - `activitypub.actor.discovery` (counter) and + `activitypub.actor.discovery.duration` (histogram) cover + `getActorHandle()` actor handle discovery. + + Each family carries a bounded result attribute + (`webfinger.lookup.result`, `webfinger.handle.result`, or + `activitypub.actor.discovery.result`) so operators can slice + discovery failures by terminal outcome (found / not\_found / + invalid / network\_error / error for outgoing lookups; + resolved / invalid / not\_found / tombstoned / error for incoming + requests; resolved / not\_found / error for actor discovery). + `webfinger.resource.scheme` is bucketed to a small allow list + (`acct`, `http`, `https`, `mailto`, or `other`) so an + attacker-controlled query string cannot inflate metric + cardinality; `activitypub.remote.host` records the URL hostname + only. Full resource URIs, lookup URLs, and handle strings are + deliberately excluded; they remain on the corresponding spans + (`webfinger.lookup`, `webfinger.handle`, + `activitypub.get_actor_handle`) for trace-level investigation. + + `lookupWebFinger()` and `getActorHandle()` follow the opt-in + `lookupObject()` pattern: omitting the new `meterProvider` option + emits no measurement. Applications that pass a `meterProvider` + to `createFederation()` get the inbound `webfinger.handle` family + and the federation-bound `Context.lookupWebFinger()` family wired + up automatically. Direct `getActorHandle()` calls remain opt-in: + pass `meterProvider` through `GetActorHandleOptions` to enable + the discovery metrics, and the option is forwarded into the + nested WebFinger lookups so one discovery emits both the + discovery measurement and the underlying `webfinger.lookup` + measurements (one for the actor ID host, plus a second for the + alias host when cross-origin verification runs). + [[#316], [#739], [#772]] + - Replaced Fedify's internal federation routing with *@fedify/uri-template* for stricter RFC 6570 URI Template expansion and matching. The deprecated `Router` export from *@fedify/fedify* remains @@ -163,6 +206,7 @@ To be released. [#736]: https://github.com/fedify-dev/fedify/issues/736 [#737]: https://github.com/fedify-dev/fedify/issues/737 [#738]: https://github.com/fedify-dev/fedify/issues/738 +[#739]: https://github.com/fedify-dev/fedify/issues/739 [#740]: https://github.com/fedify-dev/fedify/issues/740 [#742]: https://github.com/fedify-dev/fedify/issues/742 [#748]: https://github.com/fedify-dev/fedify/pull/748 @@ -175,6 +219,7 @@ To be released. [#769]: https://github.com/fedify-dev/fedify/pull/769 [#770]: https://github.com/fedify-dev/fedify/pull/770 [#771]: https://github.com/fedify-dev/fedify/pull/771 +[#772]: https://github.com/fedify-dev/fedify/pull/772 ### @fedify/fixture diff --git a/docs/manual/opentelemetry.md b/docs/manual/opentelemetry.md index ed66d95bc..34dc02489 100644 --- a/docs/manual/opentelemetry.md +++ b/docs/manual/opentelemetry.md @@ -334,6 +334,12 @@ Fedify records the following OpenTelemetry metrics: | `activitypub.document.fetch.duration` | Histogram | `ms` | Measures remote JSON-LD document loader invocation duration. | | `activitypub.document.cache` | Counter | `{lookup}` | Counts KV-backed document loader cache lookups, classified as `hit` or `miss`. | | `activitypub.object.lookup` | Counter | `{lookup}` | Counts `lookupObject()` calls, classified by whether the resolved value is an Actor. | +| `activitypub.actor.discovery` | Counter | `{discovery}` | Counts `getActorHandle()` actor handle discovery attempts. | +| `activitypub.actor.discovery.duration` | Histogram | `ms` | Measures `getActorHandle()` discovery duration. | +| `webfinger.lookup` | Counter | `{lookup}` | Counts outgoing WebFinger lookups performed by `lookupWebFinger()`. | +| `webfinger.lookup.duration` | Histogram | `ms` | Measures outgoing WebFinger lookup duration. | +| `webfinger.handle` | Counter | `{request}` | Counts inbound WebFinger requests handled by `Federation.fetch()`. | +| `webfinger.handle.duration` | Histogram | `ms` | Measures inbound WebFinger request handling duration. | | `fedify.http.server.request.count` | Counter | `{request}` | Counts inbound HTTP requests handled by `Federation.fetch()`. | | `fedify.http.server.request.duration` | Histogram | `ms` | Measures inbound HTTP request duration in `Federation.fetch()`. | | `fedify.queue.task.enqueued` | Counter | `{task}` | Counts inbox, outbox, and fanout tasks Fedify enqueued. | @@ -606,6 +612,99 @@ Fedify records the following OpenTelemetry metrics: classification and `activitypub.document.fetch[.duration]` for the loader-level rate and latency. +`activitypub.actor.discovery` and `activitypub.actor.discovery.duration` +: `activitypub.actor.discovery.result` is always present and is one of: + + - `resolved`: `getActorHandle()` returned a handle. + - `not_found`: WebFinger did not yield a usable `acct:` alias and + the `preferredUsername` fallback could not run (the call threw + the `Actor does not have enough information…` `TypeError`). + - `error`: any other thrown exception bubbled up from the + discovery (including `TypeError`s from a malformed alias URL or + an invalid `preferredUsername`). + + `activitypub.remote.host` records `actor.id.hostname` when known + and is omitted otherwise. Actor IDs and handle strings are + deliberately excluded so attacker-controlled actor data cannot + inflate metric cardinality. Per-WebFinger-call failure detail + (HTTP status, parse failure, network failure, etc.) lives on + [`webfinger.lookup`](#instrumented-metrics) and is not duplicated + here; the meter provider passed to `getActorHandle()` is also + forwarded to the nested WebFinger lookups, so one discovery emits + both an `activitypub.actor.discovery` measurement and one or two + `webfinger.lookup` measurements. When cross-origin actor handle + verification runs, the second lookup goes to a different host + than the first, so the two `webfinger.lookup` measurements may + record different `activitypub.remote.host` values. + +`webfinger.lookup` and `webfinger.lookup.duration` +: `webfinger.lookup.result` is always present and is one of: + + - `found`: a `ResourceDescriptor` was returned to the caller. + - `not_found`: the remote responded with HTTP `404 Not Found` or + `410 Gone`; recorded together with `http.response.status_code`. + - `invalid`: the remote responded with content Fedify could not + parse (JSON parse failure), the redirect chain exceeded + `maxRedirection`, the remote redirected to a different + protocol, the `Location` header itself was unparseable, or the + queried `acct:` resource was malformed. + - `network_error`: no HTTP response was observed. `fetch()` + threw, `validatePublicUrl()` rejected the URL (including + redirects to private addresses), or an `AbortError` cancelled + the request. + - `error`: the remote returned a non-2xx HTTP response that is + neither `404` nor `410`, or any other unexpected failure + bubbled up from the lookup. + + `webfinger.resource.scheme` is always present and bucketed to a + small allow-list (`acct`, `http`, `https`, `mailto`); resources + that carry any other scheme are recorded as `other` so that an + attacker-controlled remote cannot inflate cardinality by + redirecting to an unusual scheme. The corresponding span + attribute (`webfinger.resource.scheme` on the `webfinger.lookup` + span) still records the raw scheme for trace-level investigation. + `activitypub.remote.host` records the hostname of the latest URL + Fedify attempted, so an operator can see who actually returned a + failure even after one or more redirects; it is omitted only when + the resource itself was malformed before any URL could be built. + `http.response.status_code` is recorded only when an HTTP response + was observed (including non-2xx errors and redirects that exceeded + `maxRedirection`). Full resource URIs, lookup URLs, and remote + paths are deliberately excluded; they remain on the + `webfinger.lookup` span for trace-level investigation. + +`webfinger.handle` and `webfinger.handle.duration` +: `webfinger.handle.result` is always present and is one of: + + - `resolved`: Fedify returned a `200 OK` response with a JRD. + - `invalid`: Fedify returned `400 Bad Request` because the queried + `resource` parameter was missing or unparseable. + - `not_found`: Fedify returned `404 Not Found` because no actor + dispatcher matched the queried resource, the actor identifier + was not recognised, or the queried `acct:` host did not match + the server. + - `tombstoned`: Fedify returned `410 Gone` because the actor + dispatcher resolved to a `Tombstone`. + - `error`: the handler threw before producing a response, or a + custom `onNotFound` callback returned a status code outside + the `{200, 400, 404, 410}` set. + + `webfinger.resource.scheme` is bucketed to the same allow-list as + on `webfinger.lookup` (`acct`, `http`, `https`, `mailto`, or + `other`) and is omitted when the request had no `resource` + parameter. `http.response.status_code` is always recorded except + when the handler threw before constructing a response. The + queried resource string itself is deliberately not a metric + attribute (it is attacker-controlled); the full resource remains + on the `webfinger.handle` span for trace-level investigation. + These metrics complement + [`fedify.http.server.request.count`](#instrumented-metrics) and + [`fedify.http.server.request.duration`](#instrumented-metrics): + the HTTP metrics carry the bounded `fedify.endpoint=webfinger` + bucket, while these WebFinger-specific metrics expose + discovery-oriented outcome buckets (`tombstoned`, `not_found`, + etc.) and the queried scheme. + `fedify.http.server.request.count` and `fedify.http.server.request.duration` : `http.request.method` and `fedify.endpoint` are always present. `http.request.method` is normalized to one of the standard HTTP methods @@ -742,6 +841,7 @@ for ActivityPub: | `activitypub.delivery.attempt` | int | The zero-based delivery attempt number for a queued outgoing activity. | `0` | | `activitypub.delivery.permanent_failure` | boolean | Whether an outgoing delivery failure will be abandoned instead of retried. | `true` | | `activitypub.processing.result` | string | Lifecycle outcome of an inbox or outbox activity: `queued`, `processed`, `retried`, `rejected`, or `abandoned`. | `"retried"` | +| `activitypub.actor.discovery.result` | string | Terminal outcome of `getActorHandle()`: `resolved`, `not_found`, or `error`. | `"resolved"` | | `activitypub.actor.id` | string | The URI of the actor object. | `"https://example.com/actor/1"` | | `activitypub.actor.key.cached` | boolean | Whether the actor's public keys are cached. | `true` | | `activitypub.actor.type` | string[] | The qualified URI(s) of the actor type(s). | `["https://www.w3.org/ns/activitystreams#Person"]` | @@ -789,8 +889,10 @@ for ActivityPub: | `object_integrity_proofs.key_id` | string | The public key ID of the object integrity proof. | `"https://example.com/actor/1#main-key"` | | `object_integrity_proofs.signature` | string | The integrity proof of the object in hexadecimal. | `"73a74c990beabe6e59cc68f9c6db7811b59cbb22fd12dcffb3565b651540efe9"` | | `url.full` | string | The full URL being fetched by the document loader. | `"https://example.com/actor/1"` | +| `webfinger.handle.result` | string | Terminal outcome of an incoming WebFinger request: `resolved`, `invalid`, `not_found`, `tombstoned`, or `error`. | `"resolved"` | +| `webfinger.lookup.result` | string | Terminal outcome of an outgoing WebFinger lookup: `found`, `not_found`, `invalid`, `network_error`, or `error`. | `"found"` | | `webfinger.resource` | string | The queried resource URI. | `"acct:fedify@hollo.social"` | -| `webfinger.resource.scheme` | string | The scheme of the queried resource URI. | `"acct"` | +| `webfinger.resource.scheme` | string | The scheme of the queried resource URI. Metric attribute is bucketed to `acct`, `http`, `https`, `mailto`, or `other`. | `"acct"` | [attributes]: https://opentelemetry.io/docs/specs/otel/common/#attribute [OpenTelemetry Semantic Conventions]: https://opentelemetry.io/docs/specs/semconv/ diff --git a/packages/fedify/src/federation/metrics.test.ts b/packages/fedify/src/federation/metrics.test.ts index e98675209..1c89cd441 100644 --- a/packages/fedify/src/federation/metrics.test.ts +++ b/packages/fedify/src/federation/metrics.test.ts @@ -13,6 +13,7 @@ import { recordKeyLookup, recordOutboxActivity, recordOutboxEnqueue, + recordWebFingerHandle, } from "./metrics.ts"; const noopQueue: MessageQueue = { @@ -317,6 +318,78 @@ test("recordDocumentCache() records hit and miss as a counter", () => { ); }); +test("recordWebFingerHandle() records counter and duration with all attributes", () => { + const [meterProvider, recorder] = createTestMeterProvider(); + recordWebFingerHandle(meterProvider, { + durationMs: 17, + result: "resolved", + scheme: "acct", + statusCode: 200, + }); + + const counters = recorder.getMeasurements("webfinger.handle"); + assertEquals(counters.length, 1); + assertEquals(counters[0].type, "counter"); + assertEquals(counters[0].value, 1); + assertEquals(counters[0].attributes["webfinger.handle.result"], "resolved"); + assertEquals(counters[0].attributes["webfinger.resource.scheme"], "acct"); + assertEquals(counters[0].attributes["http.response.status_code"], 200); + + const durations = recorder.getMeasurements("webfinger.handle.duration"); + assertEquals(durations.length, 1); + assertEquals(durations[0].type, "histogram"); + assertEquals(durations[0].value, 17); + assertEquals(durations[0].attributes["webfinger.handle.result"], "resolved"); + assertEquals(durations[0].attributes["webfinger.resource.scheme"], "acct"); + assertEquals(durations[0].attributes["http.response.status_code"], 200); +}); + +test("recordWebFingerHandle() records each non-resolved result with the matching status code", () => { + const [meterProvider, recorder] = createTestMeterProvider(); + for ( + const [result, statusCode] of [ + ["invalid", 400], + ["not_found", 404], + ["tombstoned", 410], + ] as const + ) { + recordWebFingerHandle(meterProvider, { + durationMs: 1, + result, + scheme: "acct", + statusCode, + }); + } + const counters = recorder.getMeasurements("webfinger.handle"); + assertEquals(counters.length, 3); + assertEquals( + counters.map((m) => m.attributes["webfinger.handle.result"]), + ["invalid", "not_found", "tombstoned"], + ); + assertEquals( + counters.map((m) => m.attributes["http.response.status_code"]), + [400, 404, 410], + ); +}); + +test("recordWebFingerHandle() omits optional attributes when not provided", () => { + const [meterProvider, recorder] = createTestMeterProvider(); + recordWebFingerHandle(meterProvider, { + durationMs: 0, + result: "error", + }); + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals(counter?.attributes["webfinger.handle.result"], "error"); + assertEquals( + "webfinger.resource.scheme" in (counter?.attributes ?? {}), + false, + ); + assertEquals( + "http.response.status_code" in (counter?.attributes ?? {}), + false, + ); +}); + test("classifyFetchError() classifies FetchError with 404 as not_found", () => { const response = new Response("", { status: 404 }); const error = new FetchError( diff --git a/packages/fedify/src/federation/metrics.ts b/packages/fedify/src/federation/metrics.ts index 448c6d5fa..ff0c994fd 100644 --- a/packages/fedify/src/federation/metrics.ts +++ b/packages/fedify/src/federation/metrics.ts @@ -345,6 +345,70 @@ export interface SignatureVerificationExtraAttributes { failureReason?: HttpSignatureMetricFailureReason; } +/** + * The terminal classification of an incoming WebFinger request handled by + * Fedify, recorded as the `webfinger.handle.result` attribute on the + * `webfinger.handle` counter and `webfinger.handle.duration` histogram. + * + * - `resolved`: Fedify returned a `200 OK` response with a JRD. + * - `invalid`: Fedify returned `400 Bad Request` because the queried + * `resource` parameter was missing or unparseable. + * - `not_found`: Fedify returned `404 Not Found` because no actor + * dispatcher matched the queried resource, the actor identifier was + * not recognised, or the queried `acct:` host did not match the + * server. + * - `tombstoned`: Fedify returned `410 Gone` because the actor + * dispatcher resolved to a {@link import("@fedify/vocab").Tombstone}. + * - `error`: the handler threw before producing a response. + * @since 2.3.0 + */ +export type WebFingerHandleResult = + | "resolved" + | "invalid" + | "not_found" + | "tombstoned" + | "error"; + +/** + * The bounded value set recorded as `webfinger.resource.scheme` on the + * `webfinger.handle` counter and `webfinger.handle.duration` histogram. + * The set covers the schemes WebFinger / fediverse clients legitimately use + * (RFC 7565 + ActivityPub). Anything outside that set is bucketed as + * `other` at the call site so attacker-controlled query strings cannot + * inflate metric cardinality. + * @since 2.3.0 + */ +export type WebFingerResourceScheme = + | "acct" + | "http" + | "https" + | "mailto" + | "other"; + +/** + * Attributes accepted by {@link recordWebFingerHandle}. + * @since 2.3.0 + */ +export interface WebFingerHandleAttributes { + /** The terminal handling outcome. */ + result: WebFingerHandleResult; + /** Elapsed handler duration in milliseconds. */ + durationMs: number; + /** + * The scheme of the queried resource URI, restricted to the bounded + * {@link WebFingerResourceScheme} set so the metric attribute stays + * cardinality-safe. Omitted when Fedify could not extract a scheme + * (no resource parameter, unparseable URI, or thrown exception before + * parsing). + */ + scheme?: WebFingerResourceScheme; + /** + * The HTTP response status code Fedify produced. Omitted only when the + * handler threw before constructing a response. + */ + statusCode?: number; +} + class FederationMetrics { readonly deliverySent: Counter; readonly deliveryPermanentFailure: Counter; @@ -369,6 +433,8 @@ class FederationMetrics { readonly documentFetch: Counter; readonly documentFetchDuration: Histogram; readonly documentCache: Counter; + readonly webFingerHandle: Counter; + readonly webFingerHandleDuration: Histogram; constructor(meterProvider: MeterProvider) { const meter = meterProvider.getMeter(metadata.name, metadata.version); @@ -608,6 +674,41 @@ class FederationMetrics { "classification.", unit: "{lookup}", }); + this.webFingerHandle = meter.createCounter("webfinger.handle", { + description: + "Incoming WebFinger requests handled by Fedify, classified by " + + "terminal outcome.", + unit: "{request}", + }); + this.webFingerHandleDuration = meter.createHistogram( + "webfinger.handle.duration", + { + description: + "Duration of incoming WebFinger request handling in Fedify.", + unit: "ms", + advice: { + // Reuse the OpenTelemetry HTTP server semantic-conventions buckets + // since WebFinger requests follow the same 5 ms to 10 s range as + // the rest of Fedify's request path. + explicitBucketBoundaries: [ + 5, + 10, + 25, + 50, + 75, + 100, + 250, + 500, + 750, + 1000, + 2500, + 5000, + 7500, + 10000, + ], + }, + }, + ); } recordDelivery( @@ -820,6 +921,20 @@ class FederationMetrics { } this.documentCache.add(1, attributes); } + + recordWebFingerHandle(attrs: WebFingerHandleAttributes): void { + const attributes: Attributes = { + "webfinger.handle.result": attrs.result, + }; + if (attrs.scheme != null) { + attributes["webfinger.resource.scheme"] = attrs.scheme; + } + if (attrs.statusCode != null) { + attributes["http.response.status_code"] = attrs.statusCode; + } + this.webFingerHandle.add(1, attributes); + this.webFingerHandleDuration.record(attrs.durationMs, attributes); + } } function buildActivityLifecycleAttributes( @@ -1010,6 +1125,23 @@ export function recordDocumentCache( getFederationMetrics(meterProvider).recordDocumentCache(attrs); } +/** + * Records one measurement on `webfinger.handle` (counter) and + * `webfinger.handle.duration` (histogram) for an incoming WebFinger + * request handled by Fedify. Counter and histogram are always recorded + * together, with `webfinger.handle.result` set to one of `resolved`, + * `invalid`, `not_found`, `tombstoned`, or `error`. The queried + * resource string is deliberately excluded; it remains on the + * `webfinger.handle` span for trace-level investigation. + * @since 2.3.0 + */ +export function recordWebFingerHandle( + meterProvider: MeterProvider | undefined, + attrs: WebFingerHandleAttributes, +): void { + getFederationMetrics(meterProvider).recordWebFingerHandle(attrs); +} + /** * Classifies a thrown value from a key or document fetch into the bounded * {@link LookupResult} taxonomy and, when an HTTP response was received, diff --git a/packages/fedify/src/federation/middleware.ts b/packages/fedify/src/federation/middleware.ts index c546cf3d1..b664dc19e 100644 --- a/packages/fedify/src/federation/middleware.ts +++ b/packages/fedify/src/federation/middleware.ts @@ -1671,6 +1671,12 @@ export class FederationImpl webFingerLinksDispatcher: this.webFingerLinksDispatcher, onNotFound, tracer, + // Use the raw field, not the `meterProvider` getter, so an + // application that omits `meterProvider` in createFederation() + // does not get the WebFinger metrics enabled implicitly via + // the global meter provider fallback. The metric helpers are + // opt-in by design. + meterProvider: this._meterProvider, }); case "nodeInfoJrd": return await handleNodeInfoJrd(request, context); @@ -2552,6 +2558,12 @@ export class ContextImpl implements Context { ...options, userAgent: options.userAgent ?? this.federation.userAgent, tracerProvider: options.tracerProvider ?? this.tracerProvider, + // Default from the federation's raw field, not the + // `meterProvider` getter, so omitting `meterProvider` from + // createFederation() does not implicitly enable the + // `webfinger.lookup` metric through the global meter provider + // fallback. The metric helper is opt-in by design. + meterProvider: options.meterProvider ?? this.federation._meterProvider, allowPrivateAddress: this.federation.allowPrivateAddress, }); } diff --git a/packages/fedify/src/federation/webfinger.test.ts b/packages/fedify/src/federation/webfinger.test.ts index 8e8878c50..554a5e666 100644 --- a/packages/fedify/src/federation/webfinger.test.ts +++ b/packages/fedify/src/federation/webfinger.test.ts @@ -1,7 +1,7 @@ -import { test } from "@fedify/fixture"; +import { createTestMeterProvider, test } from "@fedify/fixture"; import type { Actor } from "@fedify/vocab"; import { Image, Link, Person, Tombstone } from "@fedify/vocab"; -import { assertEquals } from "@std/assert"; +import { assertEquals, assertNotEquals } from "@std/assert"; import type { ActorAliasMapper, ActorDispatcher, @@ -613,3 +613,197 @@ test("handleWebFinger()", async (t) => { assertEquals(result, expectedWithCustomLinks); }); }); + +test("handleWebFinger() records webfinger.handle counter and duration", async (t) => { + const url = new URL("https://example.com/.well-known/webfinger"); + const actorDispatcher: ActorDispatcher = (ctx, identifier) => { + if (identifier === "gone") { + return new Tombstone({ id: ctx.getActorUri(identifier) }); + } + if (identifier !== "someone") return null; + return new Person({ + id: ctx.getActorUri(identifier), + preferredUsername: "someone", + }); + }; + const onNotFound = () => new Response("Not found", { status: 404 }); + + function createContext(u: URL): RequestContext { + const federation = createFederation({ kv: new MemoryKvStore() }); + return createRequestContext({ + federation, + url: u, + data: undefined, + getActorUri(identifier) { + return new URL(`${u.origin}/users/${identifier}`); + }, + async getActor(handle) { + const actor = await actorDispatcher( + this as RequestContext, + handle, + ); + return actor instanceof Tombstone ? null : actor; + }, + parseUri(uri) { + if (uri == null) return null; + if (uri.protocol === "acct:") return null; + if (!uri.pathname.startsWith("/users/")) return null; + const identifier = uri.pathname.split("/").pop()!; + return { type: "actor", identifier }; + }, + }); + } + + await t.step("records result=resolved for a 200 response", async () => { + const u = new URL(url); + u.searchParams.set("resource", "acct:someone@example.com"); + const context = createContext(u); + const [meterProvider, recorder] = createTestMeterProvider(); + const response = await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + meterProvider, + }); + assertEquals(response.status, 200); + + const counter = recorder.getMeasurement("webfinger.handle"); + assertNotEquals(counter, undefined); + assertEquals(counter?.type, "counter"); + assertEquals(counter?.value, 1); + assertEquals(counter?.attributes["webfinger.handle.result"], "resolved"); + assertEquals(counter?.attributes["webfinger.resource.scheme"], "acct"); + assertEquals(counter?.attributes["http.response.status_code"], 200); + + const duration = recorder.getMeasurement("webfinger.handle.duration"); + assertNotEquals(duration, undefined); + assertEquals(duration?.type, "histogram"); + assertEquals(duration?.attributes["webfinger.handle.result"], "resolved"); + }); + + await t.step("records result=invalid for a 400 response", async () => { + const context = createContext(new URL(url)); + const [meterProvider, recorder] = createTestMeterProvider(); + const response = await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + meterProvider, + }); + assertEquals(response.status, 400); + + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals(counter?.attributes["webfinger.handle.result"], "invalid"); + assertEquals(counter?.attributes["http.response.status_code"], 400); + assertEquals( + "webfinger.resource.scheme" in (counter?.attributes ?? {}), + false, + "missing resource has no scheme attribute", + ); + }); + + await t.step("records result=not_found for a 404 response", async () => { + const u = new URL(url); + u.searchParams.set("resource", "acct:absent@example.com"); + const context = createContext(u); + const [meterProvider, recorder] = createTestMeterProvider(); + const response = await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + meterProvider, + }); + assertEquals(response.status, 404); + + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals(counter?.attributes["webfinger.handle.result"], "not_found"); + assertEquals(counter?.attributes["http.response.status_code"], 404); + assertEquals(counter?.attributes["webfinger.resource.scheme"], "acct"); + }); + + await t.step("records result=tombstoned for a 410 response", async () => { + const u = new URL(url); + u.searchParams.set("resource", "acct:gone@example.com"); + const context = createContext(u); + const [meterProvider, recorder] = createTestMeterProvider(); + const response = await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + meterProvider, + }); + assertEquals(response.status, 410); + + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals(counter?.attributes["webfinger.handle.result"], "tombstoned"); + assertEquals(counter?.attributes["http.response.status_code"], 410); + }); + + await t.step( + "records result=not_found when onNotFound returns 200", + async () => { + // A user-provided `onNotFound` callback can legally return any + // status code, including 200 with a fallback page. The metric + // must still classify the request as `not_found` because the + // lookup did not actually resolve an actor. + const u = new URL(url); + u.searchParams.set("resource", "acct:absent@example.com"); + const context = createContext(u); + const [meterProvider, recorder] = createTestMeterProvider(); + const response = await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound: () => new Response("custom fallback", { status: 200 }), + meterProvider, + }); + assertEquals(response.status, 200); + + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals( + counter?.attributes["webfinger.handle.result"], + "not_found", + ); + assertEquals(counter?.attributes["http.response.status_code"], 200); + }, + ); + + await t.step( + "buckets unknown resource schemes as 'other' to keep metric cardinality bounded", + async () => { + const u = new URL(url); + // An attacker-controlled `resource` value with an unusual scheme + // must not inflate the `webfinger.resource.scheme` attribute set. + u.searchParams.set("resource", "ssh:nobody@example.com"); + const context = createContext(u); + const [meterProvider, recorder] = createTestMeterProvider(); + await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + meterProvider, + }); + const counter = recorder.getMeasurement("webfinger.handle"); + assertEquals(counter?.attributes["webfinger.resource.scheme"], "other"); + }, + ); + + await t.step( + "omits measurements when no meterProvider is provided", + async () => { + const u = new URL(url); + u.searchParams.set("resource", "acct:someone@example.com"); + const context = createContext(u); + const [_unused, recorder] = createTestMeterProvider(); + await handleWebFinger(context.request, { + context, + actorDispatcher, + onNotFound, + }); + assertEquals(recorder.getMeasurements("webfinger.handle").length, 0); + assertEquals( + recorder.getMeasurements("webfinger.handle.duration").length, + 0, + ); + }, + ); +}); diff --git a/packages/fedify/src/federation/webfinger.ts b/packages/fedify/src/federation/webfinger.ts index f1536bad2..4c46c1c60 100644 --- a/packages/fedify/src/federation/webfinger.ts +++ b/packages/fedify/src/federation/webfinger.ts @@ -1,7 +1,7 @@ import { Link as LinkObject, Tombstone } from "@fedify/vocab"; import type { Link, ResourceDescriptor } from "@fedify/webfinger"; import { getLogger } from "@logtape/logtape"; -import type { Span, Tracer } from "@opentelemetry/api"; +import type { MeterProvider, Span, Tracer } from "@opentelemetry/api"; import { SpanKind, SpanStatusCode } from "@opentelemetry/api"; import { domainToASCII } from "node:url"; import type { @@ -11,6 +11,11 @@ import type { WebFingerLinksDispatcher, } from "./callback.ts"; import type { RequestContext } from "./context.ts"; +import { + recordWebFingerHandle, + type WebFingerHandleResult, + type WebFingerResourceScheme, +} from "./metrics.ts"; const logger = getLogger(["fedify", "webfinger", "server"]); @@ -69,6 +74,16 @@ export interface WebFingerHandlerParameters { * @since 1.3.0 */ span?: Span; + + /** + * The OpenTelemetry meter provider used to record the `webfinger.handle` + * counter and `webfinger.handle.duration` histogram. When omitted, no + * WebFinger-specific measurements are emitted (the request still + * contributes to `fedify.http.server.request.*` because that metric is + * recorded one layer up in `Federation.fetch`). + * @since 2.3.0 + */ + meterProvider?: MeterProvider; } /** @@ -82,30 +97,116 @@ export async function handleWebFinger( request: Request, options: WebFingerHandlerParameters, ): Promise { - if (options.tracer == null) { - return await handleWebFingerInternal(request, options); - } - return await options.tracer.startActiveSpan( - "webfinger.handle", - { kind: SpanKind.SERVER }, - async (span) => { - try { - const response = await handleWebFingerInternal(request, options); - span.setStatus({ - code: response.ok ? SpanStatusCode.UNSET : SpanStatusCode.ERROR, - }); - return response; - } catch (error) { - span.setStatus({ - code: SpanStatusCode.ERROR, - message: String(error), - }); - throw error; - } finally { - span.end(); - } + const meterProvider = options.meterProvider; + const start = meterProvider == null ? 0 : performance.now(); + const resource = options.context.url.searchParams.get("resource"); + const scheme = computeResourceScheme(resource); + // Track whether the response was produced by the caller's `onNotFound` + // callback so the `webfinger.handle.result` attribute classifies as + // `not_found` regardless of the status code that callback returned. + // Frameworks routinely answer 404s with a 200 OK fallback page; without + // this signal, every such response would skew the metric to `resolved`. + let notFoundResponse: Response | undefined; + const wrappedOptions: WebFingerHandlerParameters = { + ...options, + async onNotFound(req) { + const r = await options.onNotFound(req); + notFoundResponse = r; + return r; }, - ); + }; + let response: Response | undefined; + try { + if (options.tracer == null) { + response = await handleWebFingerInternal(request, wrappedOptions); + } else { + response = await options.tracer.startActiveSpan( + "webfinger.handle", + { kind: SpanKind.SERVER }, + async (span) => { + try { + const inner = await handleWebFingerInternal( + request, + wrappedOptions, + ); + span.setStatus({ + code: inner.ok ? SpanStatusCode.UNSET : SpanStatusCode.ERROR, + }); + return inner; + } catch (error) { + span.setStatus({ + code: SpanStatusCode.ERROR, + message: String(error), + }); + throw error; + } finally { + span.end(); + } + }, + ); + } + return response; + } finally { + if (meterProvider != null) { + recordWebFingerHandle(meterProvider, { + durationMs: Math.max(0, performance.now() - start), + result: classifyWebFingerHandleResult(response, notFoundResponse), + scheme, + statusCode: response?.status, + }); + } + } +} + +// The scheme attribute is recorded on the `webfinger.handle` metric, whose +// resource value comes from an attacker-controlled query string. Buckets are +// whitelisted to the schemes WebFinger / fediverse clients legitimately use +// (RFC 7565 + ActivityPub), with anything else bucketed as `other`. This +// keeps metric cardinality bounded even when a client probes Fedify with +// arbitrary, very long, or control-character-bearing prefixes. +const WEBFINGER_HANDLE_SCHEME_WHITELIST: ReadonlySet< + Exclude +> = new Set(["acct", "http", "https", "mailto"]); + +function isAllowedResourceScheme( + scheme: string, +): scheme is Exclude { + return (WEBFINGER_HANDLE_SCHEME_WHITELIST as ReadonlySet).has(scheme); +} + +function computeResourceScheme( + resource: string | null, +): WebFingerResourceScheme | undefined { + if (resource == null) return undefined; + const colon = resource.indexOf(":"); + if (colon <= 0) return undefined; + const candidate = resource.substring(0, colon).toLowerCase(); + return isAllowedResourceScheme(candidate) ? candidate : "other"; +} + +function classifyWebFingerHandleResult( + response: Response | undefined, + notFoundResponse: Response | undefined, +): WebFingerHandleResult { + if (response == null) return "error"; + // When the response was produced by the caller's `onNotFound`, the + // outcome is `not_found` regardless of the status code the callback + // chose (frameworks may return 200 with a fallback page). + if (notFoundResponse != null && response === notFoundResponse) { + return "not_found"; + } + switch (response.status) { + case 200: + return "resolved"; + case 400: + return "invalid"; + case 404: + return "not_found"; + case 410: + return "tombstoned"; + default: + return "error"; + } } async function handleWebFingerInternal( diff --git a/packages/vocab/src/actor.test.ts b/packages/vocab/src/actor.test.ts index f35e2b6cd..ef986637a 100644 --- a/packages/vocab/src/actor.test.ts +++ b/packages/vocab/src/actor.test.ts @@ -1,4 +1,4 @@ -import { test } from "@fedify/fixture"; +import { createTestMeterProvider, test } from "@fedify/fixture"; import * as fc from "fast-check"; import fetchMock from "fetch-mock"; import { @@ -192,6 +192,202 @@ test({ }, }); +test("getActorHandle() records activitypub.actor.discovery counter", { + permissions: { env: true, read: true }, +}, async (t) => { + fetchMock.spyGlobal(); + try { + const actorId = new URL("https://foo.example.com/@john"); + const actor = new Person({ + id: actorId, + preferredUsername: "john", + }); + + await t.step("records result=resolved on a successful lookup", async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { + body: { subject: "acct:johndoe@foo.example.com" }, + headers: { "Content-Type": "application/jrd+json" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + const handle = await getActorHandle(actor, { meterProvider }); + deepStrictEqual(handle, "@johndoe@foo.example.com"); + + const counters = recorder.getMeasurements( + "activitypub.actor.discovery", + ); + deepStrictEqual(counters.length, 1); + deepStrictEqual(counters[0].type, "counter"); + deepStrictEqual(counters[0].value, 1); + deepStrictEqual( + counters[0].attributes["activitypub.actor.discovery.result"], + "resolved", + ); + deepStrictEqual( + counters[0].attributes["activitypub.remote.host"], + "foo.example.com", + ); + + const durations = recorder.getMeasurements( + "activitypub.actor.discovery.duration", + ); + deepStrictEqual(durations.length, 1); + deepStrictEqual(durations[0].type, "histogram"); + deepStrictEqual( + durations[0].attributes["activitypub.actor.discovery.result"], + "resolved", + ); + ok(typeof durations[0].value === "number" && durations[0].value >= 0); + }); + + await t.step( + "records result=resolved when WebFinger is missing but preferredUsername fallback succeeds", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { status: 404 }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + const handle = await getActorHandle(actor, { meterProvider }); + deepStrictEqual(handle, "@john@foo.example.com"); + const counter = recorder.getMeasurement("activitypub.actor.discovery"); + ok(counter != null); + deepStrictEqual( + counter.attributes["activitypub.actor.discovery.result"], + "resolved", + ); + }, + ); + + await t.step( + "records result=not_found when neither WebFinger nor preferredUsername yields a handle", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { status: 404 }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await rejects( + () => getActorHandle(actorId, { meterProvider }), + TypeError, + ); + const counter = recorder.getMeasurement("activitypub.actor.discovery"); + ok(counter != null); + deepStrictEqual( + counter.attributes["activitypub.actor.discovery.result"], + "not_found", + ); + deepStrictEqual( + counter.attributes["activitypub.remote.host"], + "foo.example.com", + ); + const duration = recorder.getMeasurement( + "activitypub.actor.discovery.duration", + ); + ok(duration != null); + deepStrictEqual( + duration.attributes["activitypub.actor.discovery.result"], + "not_found", + ); + }, + ); + + await t.step( + "records result=error when a malformed WebFinger alias throws TypeError", + async () => { + // The "[" byte makes `new URL("https://[/")` throw `TypeError` + // when getActorHandleInternal attempts to parse the alias host. + // This TypeError is a malformed-remote-data failure, not the + // "actor lacks information" sentinel, so the metric must record + // `error` rather than `not_found`. + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { + body: { + subject: "https://foo.example.com/@john", + aliases: ["acct:john@["], + }, + headers: { "Content-Type": "application/jrd+json" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await rejects( + () => getActorHandle(actorId, { meterProvider }), + TypeError, + ); + const counter = recorder.getMeasurement( + "activitypub.actor.discovery", + ); + ok(counter != null); + deepStrictEqual( + counter.attributes["activitypub.actor.discovery.result"], + "error", + ); + }, + ); + + await t.step( + "propagates meterProvider into the nested webfinger.lookup", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { + body: { subject: "acct:johndoe@foo.example.com" }, + headers: { "Content-Type": "application/jrd+json" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await getActorHandle(actor, { meterProvider }); + const webFingerCounter = recorder.getMeasurement("webfinger.lookup"); + ok(webFingerCounter != null); + deepStrictEqual( + webFingerCounter.attributes["webfinger.lookup.result"], + "found", + ); + }, + ); + + await t.step( + "omits measurements when no meterProvider is provided", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://foo.example.com/.well-known/webfinger?", + { + body: { subject: "acct:johndoe@foo.example.com" }, + headers: { "Content-Type": "application/jrd+json" }, + }, + ); + const [_unused, recorder] = createTestMeterProvider(); + await getActorHandle(actor); + deepStrictEqual( + recorder.getMeasurements("activitypub.actor.discovery").length, + 0, + ); + deepStrictEqual( + recorder.getMeasurements("activitypub.actor.discovery.duration") + .length, + 0, + ); + deepStrictEqual( + recorder.getMeasurements("webfinger.lookup").length, + 0, + ); + }, + ); + } finally { + fetchMock.removeRoutes(); + fetchMock.hardReset(); + } +}); + test("normalizeActorHandle()", () => { deepStrictEqual(normalizeActorHandle("@foo@BAR.COM"), "@foo@bar.com"); deepStrictEqual(normalizeActorHandle("@BAZ@☃-⌘.com"), "@BAZ@☃-⌘.com"); diff --git a/packages/vocab/src/actor.ts b/packages/vocab/src/actor.ts index c820ccb53..0bc9ad4f4 100644 --- a/packages/vocab/src/actor.ts +++ b/packages/vocab/src/actor.ts @@ -1,11 +1,114 @@ import type { GetUserAgentOptions } from "@fedify/vocab-runtime"; import { lookupWebFinger } from "@fedify/webfinger"; -import { SpanStatusCode, trace, type TracerProvider } from "@opentelemetry/api"; +import { + type Attributes, + type Counter, + type Histogram, + type MeterProvider, + SpanStatusCode, + trace, + type TracerProvider, +} from "@opentelemetry/api"; import { domainToASCII, domainToUnicode } from "node:url"; import metadata from "../deno.json" with { type: "json" }; import { getTypeId } from "./type.ts"; import { Application, Group, Organization, Person, Service } from "./vocab.ts"; +/** + * The terminal classification of a {@link getActorHandle} call, recorded as + * the `activitypub.actor.discovery.result` attribute on the + * `activitypub.actor.discovery` counter and + * `activitypub.actor.discovery.duration` histogram. + * + * - `resolved`: a handle was returned to the caller. + * - `not_found`: WebFinger did not yield a usable `acct:` alias and the + * `preferredUsername` fallback could not run. This corresponds to the + * `TypeError("Actor does not have enough information…")` path. + * - `error`: any other thrown exception bubbled up from the lookup. + * + * Per-WebFinger-call failure detail (HTTP status, parse failure, network + * failure, etc.) lives on `webfinger.lookup` and is intentionally not + * duplicated here. + * @since 2.3.0 + */ +export type ActorDiscoveryResult = "resolved" | "not_found" | "error"; + +interface ActorDiscoveryInstruments { + discovery: Counter; + discoveryDuration: Histogram; +} + +const ACTOR_DISCOVERY_HISTOGRAM_BUCKETS: ReadonlyArray = [ + 5, + 10, + 25, + 50, + 75, + 100, + 250, + 500, + 750, + 1000, + 2500, + 5000, + 7500, + 10000, +]; + +const actorDiscoveryInstruments = new WeakMap< + MeterProvider, + ActorDiscoveryInstruments +>(); + +function getActorDiscoveryInstruments( + meterProvider: MeterProvider, +): ActorDiscoveryInstruments { + let instruments = actorDiscoveryInstruments.get(meterProvider); + if (instruments == null) { + const meter = meterProvider.getMeter(metadata.name, metadata.version); + instruments = { + discovery: meter.createCounter("activitypub.actor.discovery", { + description: "Actor handle discovery attempts via getActorHandle(), " + + "classified by terminal outcome.", + unit: "{discovery}", + }), + discoveryDuration: meter.createHistogram( + "activitypub.actor.discovery.duration", + { + description: "Duration of getActorHandle() actor discovery calls.", + unit: "ms", + advice: { + explicitBucketBoundaries: [...ACTOR_DISCOVERY_HISTOGRAM_BUCKETS], + }, + }, + ), + }; + actorDiscoveryInstruments.set(meterProvider, instruments); + } + return instruments; +} + +function getActorDiscoveryRemoteHost( + actor: Actor | URL, +): string | undefined { + const id = actor instanceof URL ? actor : actor.id; + if (id == null) return undefined; + return id.hostname === "" ? undefined : id.hostname; +} + +// Subclass of TypeError that preserves the documented `throws {TypeError}` +// API contract while letting the actor-discovery metric distinguish the +// "actor lacks enough information to derive a handle" terminal path from +// other TypeError-shaped failures that can come from malformed remote data +// (e.g. an alias that does not parse as a URL, or an invalid preferred +// username that breaks normalizeActorHandle). +class ActorHandleNotFoundError extends TypeError { + constructor() { + super("Actor does not have enough information to get the handle."); + this.name = "ActorHandleNotFoundError"; + } +} + /** * Actor types are {@link Object} types that are capable of performing * activities. @@ -101,6 +204,19 @@ export interface GetActorHandleOptions extends NormalizeActorHandleOptions { * @since 1.3.0 */ tracerProvider?: TracerProvider; + + /** + * The OpenTelemetry meter provider used to record the + * `activitypub.actor.discovery` counter and + * `activitypub.actor.discovery.duration` histogram. When set, the same + * meter provider is also forwarded to the nested WebFinger lookups so + * each discovery emits both the actor-discovery measurements and the + * underlying `webfinger.lookup` measurements. If omitted, no + * measurements are emitted (the helper is opt-in to avoid touching the + * global meter provider for callers that do not use OpenTelemetry). + * @since 2.3.0 + */ + meterProvider?: MeterProvider; } /** @@ -145,15 +261,36 @@ export async function getActorHandle( } span.setAttribute("activitypub.actor.type", getTypeId(actor).href); } + const meterProvider = options.meterProvider; + const start = meterProvider == null ? 0 : performance.now(); + let result: ActorDiscoveryResult = "error"; try { - return await getActorHandleInternal(actor, options); + const handle = await getActorHandleInternal(actor, options); + result = "resolved"; + return handle; } catch (error) { + result = error instanceof ActorHandleNotFoundError + ? "not_found" + : "error"; span.setStatus({ code: SpanStatusCode.ERROR, message: String(error), }); throw error; } finally { + if (meterProvider != null) { + const attributes: Attributes = { + "activitypub.actor.discovery.result": result, + }; + const host = getActorDiscoveryRemoteHost(actor); + if (host != null) attributes["activitypub.remote.host"] = host; + const instruments = getActorDiscoveryInstruments(meterProvider); + instruments.discovery.add(1, attributes); + instruments.discoveryDuration.record( + Math.max(0, performance.now() - start), + attributes, + ); + } span.end(); } }, @@ -169,6 +306,7 @@ async function getActorHandleInternal( const result = await lookupWebFinger(actorId, { userAgent: options.userAgent, tracerProvider: options.tracerProvider, + meterProvider: options.meterProvider, }); if (result != null) { const aliases = [...(result.aliases ?? [])]; @@ -184,6 +322,7 @@ async function getActorHandleInternal( alias, options.userAgent, options.tracerProvider, + options.meterProvider, ) ) { continue; @@ -202,9 +341,7 @@ async function getActorHandleInternal( options, ); } - throw new TypeError( - "Actor does not have enough information to get the handle.", - ); + throw new ActorHandleNotFoundError(); } async function verifyCrossOriginActorHandle( @@ -212,8 +349,13 @@ async function verifyCrossOriginActorHandle( alias: string, userAgent: GetUserAgentOptions | string | undefined, tracerProvider: TracerProvider | undefined, + meterProvider: MeterProvider | undefined, ): Promise { - const response = await lookupWebFinger(alias, { userAgent, tracerProvider }); + const response = await lookupWebFinger(alias, { + userAgent, + tracerProvider, + meterProvider, + }); if (response == null) return false; for (const alias of response.aliases ?? []) { if (new URL(alias).href === actorId) return true; diff --git a/packages/vocab/src/lookup.test.ts b/packages/vocab/src/lookup.test.ts index 807da59e0..1acf4624c 100644 --- a/packages/vocab/src/lookup.test.ts +++ b/packages/vocab/src/lookup.test.ts @@ -811,6 +811,42 @@ test("lookupObject() records activitypub.object.lookup counter", { }, ); + await t.step( + "propagates meterProvider into the nested webfinger.lookup", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + { + subject: "acct:johndoe@example.com", + links: [ + { + rel: "self", + href: "https://example.com/person", + type: "application/activity+json", + }, + ], + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupObject("@johndoe@example.com", { + documentLoader: mockDocumentLoader, + contextLoader: mockDocumentLoader, + meterProvider, + }); + const webFingerCounter = recorder.getMeasurement("webfinger.lookup"); + ok(webFingerCounter != null); + deepStrictEqual( + webFingerCounter.attributes["webfinger.lookup.result"], + "found", + ); + deepStrictEqual( + webFingerCounter.attributes["activitypub.remote.host"], + "example.com", + ); + }, + ); + await t.step( "extracts host from a URL acct: instance", async () => { diff --git a/packages/vocab/src/lookup.ts b/packages/vocab/src/lookup.ts index da9159c42..e648eff02 100644 --- a/packages/vocab/src/lookup.ts +++ b/packages/vocab/src/lookup.ts @@ -286,6 +286,7 @@ async function lookupObjectInternal( const jrd = await lookupWebFinger(identifier, { userAgent: options.userAgent, tracerProvider: options.tracerProvider, + meterProvider: options.meterProvider, allowPrivateAddress: "allowPrivateAddress" in options && options.allowPrivateAddress === true, signal: options.signal, diff --git a/packages/webfinger/src/lookup.test.ts b/packages/webfinger/src/lookup.test.ts index 33e93ffc3..21cdfe59f 100644 --- a/packages/webfinger/src/lookup.test.ts +++ b/packages/webfinger/src/lookup.test.ts @@ -1,7 +1,7 @@ -import { test } from "@fedify/fixture"; +import { createTestMeterProvider, test } from "@fedify/fixture"; import { withTimeout } from "es-toolkit"; import fetchMock from "fetch-mock"; -import { deepStrictEqual } from "node:assert/strict"; +import { deepStrictEqual, ok } from "node:assert/strict"; import type { ResourceDescriptor } from "./jrd.ts"; import { lookupWebFinger } from "./lookup.ts"; @@ -15,6 +15,21 @@ test({ deepStrictEqual(await lookupWebFinger(new URL("acct:johndoe")), null); deepStrictEqual(await lookupWebFinger("acct:johndoe@"), null); deepStrictEqual(await lookupWebFinger(new URL("acct:johndoe@")), null); + // Per RFC 7565, the acct: authority is bare `host`: no path, + // query, or fragment is allowed. Reject such inputs rather than + // forwarding them to a remote WebFinger lookup. + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com/exploit"), + null, + ); + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com?x=1"), + null, + ); + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com#frag"), + null, + ); }); await t.step("connection refused", async () => { @@ -29,81 +44,117 @@ test({ }); fetchMock.spyGlobal(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { status: 404 }, - ); + // Wrap the rest of the outer test in try/finally so the global + // `fetch` spy is torn down even if a t.step assertion below + // throws. Matches the cleanup pattern of the metrics test + // further down in this file. + try { + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { status: 404 }, + ); - await t.step("not found", async () => { - deepStrictEqual(await lookupWebFinger("acct:johndoe@example.com"), null); - deepStrictEqual(await lookupWebFinger("https://example.com/foo"), null); - }); + await t.step("not found", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + null, + ); + deepStrictEqual(await lookupWebFinger("https://example.com/foo"), null); + }); - const expected: ResourceDescriptor = { - subject: "acct:johndoe@example.com", - links: [], - }; - fetchMock.removeRoutes(); - fetchMock.get( - "https://example.com/.well-known/webfinger?resource=acct%3Ajohndoe%40example.com", - { body: expected }, - ); + const expected: ResourceDescriptor = { + subject: "acct:johndoe@example.com", + links: [], + }; + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=acct%3Ajohndoe%40example.com", + { body: expected }, + ); - await t.step("acct", async () => { - deepStrictEqual( - await lookupWebFinger("acct:johndoe@example.com"), - expected, + await t.step("acct", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + expected, + ); + }); + + const expected2: ResourceDescriptor = { + subject: "https://example.com/foo", + links: [], + }; + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=https%3A%2F%2Fexample.com%2Ffoo", + { body: expected2 }, ); - }); - const expected2: ResourceDescriptor = { - subject: "https://example.com/foo", - links: [], - }; - fetchMock.removeRoutes(); - fetchMock.get( - "https://example.com/.well-known/webfinger?resource=https%3A%2F%2Fexample.com%2Ffoo", - { body: expected2 }, - ); + await t.step("https", async () => { + deepStrictEqual( + await lookupWebFinger("https://example.com/foo"), + expected2, + ); + }); - await t.step("https", async () => { - deepStrictEqual( - await lookupWebFinger("https://example.com/foo"), - expected2, + const mailtoExpected: ResourceDescriptor = { + subject: "mailto:juliet@example.com", + links: [], + }; + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=mailto%3Ajuliet%40example.com", + { body: mailtoExpected }, ); - }); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { body: "not json" }, - ); + await t.step("mailto", async () => { + // RFC 7033 permits any URI as a WebFinger resource, and RFC 7565 + // explicitly references `mailto:` as an example. The opaque-path + // host extraction (after the last `@`) applies to `mailto:` just + // like `acct:`. + deepStrictEqual( + await lookupWebFinger("mailto:juliet@example.com"), + mailtoExpected, + ); + }); - await t.step("invalid response", async () => { - deepStrictEqual(await lookupWebFinger("acct:johndoe@example.com"), null); - }); + const mailtoQueryExpected: ResourceDescriptor = { + subject: "mailto:juliet@example.com?subject=Hi", + links: [], + }; + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=mailto%3Ajuliet%40example.com%3Fsubject%3DHi", + { body: mailtoQueryExpected }, + ); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://localhost/.well-known/webfinger?", - { - subject: "acct:test@localhost", - links: [ - { - rel: "self", - type: "application/activity+json", - href: "https://localhost/actor", - }, - ], - }, - ); + await t.step("mailto with hfields", async () => { + // RFC 6068 §2 allows `mailto:` URIs to carry `?hfields=...` + // header fields and fragment identifiers. Unlike `acct:`, + // those components are part of the grammar, so the lookup + // must accept them and forward the full resource URI to the + // WebFinger endpoint. + deepStrictEqual( + await lookupWebFinger("mailto:juliet@example.com?subject=Hi"), + mailtoQueryExpected, + ); + }); - await t.step("private address", async () => { - deepStrictEqual(await lookupWebFinger("acct:test@localhost"), null); - deepStrictEqual( - await lookupWebFinger("acct:test@localhost", { - allowPrivateAddress: true, - }), + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { body: "not json" }, + ); + + await t.step("invalid response", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + null, + ); + }); + + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://localhost/.well-known/webfinger?", { subject: "acct:test@localhost", links: [ @@ -115,217 +166,702 @@ test({ ], }, ); - }); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { - status: 302, - headers: { Location: "/.well-known/webfinger2" }, - }, - ); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger2", - { body: expected }, - ); + await t.step("private address", async () => { + deepStrictEqual(await lookupWebFinger("acct:test@localhost"), null); + deepStrictEqual( + await lookupWebFinger("acct:test@localhost", { + allowPrivateAddress: true, + }), + { + subject: "acct:test@localhost", + links: [ + { + rel: "self", + type: "application/activity+json", + href: "https://localhost/actor", + }, + ], + }, + ); + }); - await t.step("redirection", async () => { - deepStrictEqual( - await lookupWebFinger("acct:johndoe@example.com"), - expected, + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "/.well-known/webfinger2" }, + }, + ); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger2", + { body: expected }, ); - }); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { - status: 302, - headers: { Location: "/.well-known/webfinger" }, - }, - ); + await t.step("redirection", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + expected, + ); + }); - await t.step("infinite redirection", async () => { - const result = await withTimeout( - () => lookupWebFinger("acct:johndoe@example.com"), - 2000, + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "/.well-known/webfinger" }, + }, ); - deepStrictEqual(result, null); - }); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { - status: 302, - headers: { Location: "ftp://example.com/" }, - }, - ); + await t.step("infinite redirection", async () => { + const result = await withTimeout( + () => lookupWebFinger("acct:johndoe@example.com"), + 2000, + ); + deepStrictEqual(result, null); + }); - await t.step("redirection to different protocol", async () => { - deepStrictEqual(await lookupWebFinger("acct:johndoe@example.com"), null); - }); + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "ftp://example.com/" }, + }, + ); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { - status: 302, - headers: { Location: "https://localhost/" }, - }, - ); + await t.step("redirection to different protocol", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + null, + ); + }); - await t.step("redirection to private address", async () => { - deepStrictEqual(await lookupWebFinger("acct:johndoe@example.com"), null); - }); + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "https://localhost/" }, + }, + ); - fetchMock.removeRoutes(); - let redirectCount = 0; - fetchMock.get( - "begin:https://example.com/.well-known/webfinger", - () => { - redirectCount++; - if (redirectCount < 3) { - return { - status: 302, - headers: { - Location: `/.well-known/webfinger?redirect=${redirectCount}`, + await t.step("redirection to private address", async () => { + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + null, + ); + }); + + fetchMock.removeRoutes(); + let redirectCount = 0; + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + () => { + redirectCount++; + if (redirectCount < 3) { + return { + status: 302, + headers: { + Location: `/.well-known/webfinger?redirect=${redirectCount}`, + }, + }; + } + return { body: expected }; + }, + ); + + await t.step("custom maxRedirection", async () => { + // Test with maxRedirection: 1 (should fail; mock has 2 redirects) + redirectCount = 0; + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com", { + maxRedirection: 1, + }), + null, + ); + + // Test with maxRedirection: 2 (should succeed; mock has exactly 2 + // redirects, and `maxRedirection: N` follows up to N redirects) + redirectCount = 0; + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com", { + maxRedirection: 2, + }), + expected, + ); + + // Test with maxRedirection: 3 (should succeed) + redirectCount = 0; + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com", { + maxRedirection: 3, + }), + expected, + ); + + // Test with default maxRedirection: 5 (should succeed) + redirectCount = 0; + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com"), + expected, + ); + }); + + // Regression: `maxRedirection: 1` must allow exactly one 302 to be + // followed. An earlier implementation incremented the counter before + // the `>=` check, so `maxRedirection: 1` rejected the first redirect + // instead of following it. The expected semantics is "follow up to + // N redirects". + await t.step( + "maxRedirection: 1 follows exactly one redirect", + async () => { + // Mock with a single redirect: 302 → 200. Under the corrected + // semantics, `maxRedirection: 1` follows it and reaches the body. + fetchMock.removeRoutes(); + let count = 0; + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + () => { + count++; + return count < 2 + ? { + status: 302, + headers: { Location: "/.well-known/webfinger?after=1" }, + } + : { body: expected }; + }, + ); + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com", { + maxRedirection: 1, + }), + expected, + ); + + // Mock with two redirects. `maxRedirection: 1` rejects the + // second redirect. + fetchMock.removeRoutes(); + count = 0; + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + () => { + count++; + return count < 3 + ? { + status: 302, + headers: { + Location: `/.well-known/webfinger?after=${count}`, + }, + } + : { body: expected }; }, - }; - } - return { body: expected }; + ); + deepStrictEqual( + await lookupWebFinger("acct:johndoe@example.com", { + maxRedirection: 1, + }), + null, + ); + }, + ); + + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + () => + new Promise((resolve) => { + const timeoutId = setTimeout(() => { + resolve({ body: expected }); + }, 1000); + + return () => clearTimeout(timeoutId); + }), + ); + + await t.step("request cancellation", async () => { + // Test cancelling a request immediately using AbortController + const controller = new AbortController(); + const promise = lookupWebFinger("acct:johndoe@example.com", { + signal: controller.signal, + }); + + // Abort the request right after starting it + controller.abort(); + deepStrictEqual(await promise, null); + }); + + fetchMock.removeRoutes(); + let redirectCount2 = 0; + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + () => { + redirectCount2++; + if (redirectCount2 === 1) { + return { + status: 302, + headers: { Location: "/.well-known/webfinger2" }, + }; + } + return new Promise((resolve) => { + const timeoutId = setTimeout(() => { + resolve({ body: expected }); + }, 1000); + + return () => clearTimeout(timeoutId); + }); + }, + ); + + await t.step("cancellation during redirection", async () => { + // Test cancelling a request during redirection process + const controller = new AbortController(); + const promise = lookupWebFinger("acct:johndoe@example.com", { + signal: controller.signal, + }); + + // Cancel during the delayed second request after redirection + setTimeout(() => controller.abort(), 100); + deepStrictEqual(await promise, null); + }); + + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + () => + new Promise((resolve) => { + const timeoutId = setTimeout(() => { + resolve({ body: expected }); + }, 500); + + return () => clearTimeout(timeoutId); + }), + ); + + await t.step("cancellation with immediate abort", async () => { + // Test starting a request with an already aborted AbortController + const controller = new AbortController(); + controller.abort(); + + // Use a signal that was already aborted before starting the request + const result = await lookupWebFinger("acct:johndoe@example.com", { + signal: controller.signal, + }); + deepStrictEqual(result, null); + }); + + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { body: expected }, + ); + + await t.step("successful request with signal", async () => { + // Test successful request with a normal AbortController signal + const controller = new AbortController(); + const result = await lookupWebFinger("acct:johndoe@example.com", { + signal: controller.signal, + }); + deepStrictEqual(result, expected); + }); + } finally { + fetchMock.removeRoutes(); + fetchMock.hardReset(); + } + }, +}); + +test("lookupWebFinger() records webfinger.lookup counter and duration", { + sanitizeOps: false, + sanitizeResources: false, +}, async (t) => { + fetchMock.spyGlobal(); + try { + const expected: ResourceDescriptor = { + subject: "acct:johndoe@example.com", + links: [], + }; + + await t.step( + "records result=found for a successful acct lookup", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=acct%3Ajohndoe%40example.com", + { body: expected }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + const result = await lookupWebFinger("acct:johndoe@example.com", { + meterProvider, + }); + deepStrictEqual(result, expected); + + const counters = recorder.getMeasurements("webfinger.lookup"); + deepStrictEqual(counters.length, 1); + deepStrictEqual(counters[0].type, "counter"); + deepStrictEqual(counters[0].value, 1); + deepStrictEqual( + counters[0].attributes["webfinger.lookup.result"], + "found", + ); + deepStrictEqual( + counters[0].attributes["webfinger.resource.scheme"], + "acct", + ); + deepStrictEqual( + counters[0].attributes["activitypub.remote.host"], + "example.com", + ); + deepStrictEqual( + counters[0].attributes["http.response.status_code"], + 200, + ); + + const durations = recorder.getMeasurements("webfinger.lookup.duration"); + deepStrictEqual(durations.length, 1); + deepStrictEqual(durations[0].type, "histogram"); + deepStrictEqual( + durations[0].attributes["webfinger.lookup.result"], + "found", + ); + deepStrictEqual( + durations[0].attributes["webfinger.resource.scheme"], + "acct", + ); + ok(typeof durations[0].value === "number" && durations[0].value >= 0); + }, + ); + + await t.step( + "records scheme=https for an https resource lookup", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=https%3A%2F%2Fexample.com%2Ffoo", + { body: { subject: "https://example.com/foo", links: [] } }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("https://example.com/foo", { meterProvider }); + const counters = recorder.getMeasurements("webfinger.lookup"); + deepStrictEqual(counters.length, 1); + deepStrictEqual( + counters[0].attributes["webfinger.resource.scheme"], + "https", + ); + deepStrictEqual( + counters[0].attributes["webfinger.lookup.result"], + "found", + ); }, ); - await t.step("custom maxRedirection", async () => { - // Test with maxRedirection: 2 (should fail) - redirectCount = 0; + await t.step("records result=not_found with status 404", async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { status: 404 }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + const result = await lookupWebFinger("acct:johndoe@example.com", { + meterProvider, + }); + deepStrictEqual(result, null); + + const counters = recorder.getMeasurements("webfinger.lookup"); + deepStrictEqual(counters.length, 1); deepStrictEqual( - await lookupWebFinger("acct:johndoe@example.com", { - maxRedirection: 2, - }), - null, + counters[0].attributes["webfinger.lookup.result"], + "not_found", + ); + deepStrictEqual( + counters[0].attributes["http.response.status_code"], + 404, + ); + deepStrictEqual( + counters[0].attributes["activitypub.remote.host"], + "example.com", ); - // Test with maxRedirection: 3 (should succeed) - redirectCount = 0; + const durations = recorder.getMeasurements("webfinger.lookup.duration"); + deepStrictEqual(durations.length, 1); deepStrictEqual( - await lookupWebFinger("acct:johndoe@example.com", { - maxRedirection: 3, - }), - expected, + durations[0].attributes["webfinger.lookup.result"], + "not_found", ); + }); - // Test with default maxRedirection: 5 (should succeed) - redirectCount = 0; + await t.step("records result=not_found with status 410", async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { status: 410 }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); deepStrictEqual( - await lookupWebFinger("acct:johndoe@example.com"), - expected, + counter.attributes["webfinger.lookup.result"], + "not_found", ); + deepStrictEqual(counter.attributes["http.response.status_code"], 410); }); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - () => - new Promise((resolve) => { - const timeoutId = setTimeout(() => { - resolve({ body: expected }); - }, 1000); - - return () => clearTimeout(timeoutId); - }), + await t.step( + "records result=error for non-2xx, non-404/410 HTTP responses", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { status: 500 }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "error", + ); + deepStrictEqual(counter.attributes["http.response.status_code"], 500); + }, ); - await t.step("request cancellation", async () => { - // Test cancelling a request immediately using AbortController - const controller = new AbortController(); - const promise = lookupWebFinger("acct:johndoe@example.com", { - signal: controller.signal, - }); - - // Abort the request right after starting it - controller.abort(); - deepStrictEqual(await promise, null); - }); - - fetchMock.removeRoutes(); - let redirectCount2 = 0; - fetchMock.get( - "begin:https://example.com/.well-known/webfinger", - () => { - redirectCount2++; - if (redirectCount2 === 1) { - return { - status: 302, - headers: { Location: "/.well-known/webfinger2" }, - }; - } - return new Promise((resolve) => { - const timeoutId = setTimeout(() => { - resolve({ body: expected }); - }, 1000); - - return () => clearTimeout(timeoutId); - }); + await t.step( + "records result=invalid for malformed JSON bodies", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { body: "not json" }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "invalid", + ); + deepStrictEqual(counter.attributes["http.response.status_code"], 200); }, ); - await t.step("cancellation during redirection", async () => { - // Test cancelling a request during redirection process - const controller = new AbortController(); - const promise = lookupWebFinger("acct:johndoe@example.com", { - signal: controller.signal, - }); + await t.step( + "records result=network_error when fetch never reaches the remote", + async () => { + fetchMock.removeRoutes(); + const [meterProvider, recorder] = createTestMeterProvider(); + const result = await lookupWebFinger( + "acct:johndoe@fedify-test.internal", + { meterProvider }, + ); + deepStrictEqual(result, null); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "network_error", + ); + deepStrictEqual( + "http.response.status_code" in counter.attributes, + false, + "no HTTP response means no status code attribute", + ); + deepStrictEqual( + counter.attributes["activitypub.remote.host"], + "fedify-test.internal", + ); + }, + ); - // Cancel during the delayed second request after redirection - setTimeout(() => controller.abort(), 100); - deepStrictEqual(await promise, null); - }); + await t.step( + "records result=invalid for malformed acct: resources", + async () => { + const [meterProvider, recorder] = createTestMeterProvider(); + const result = await lookupWebFinger("acct:johndoe", { meterProvider }); + deepStrictEqual(result, null); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "invalid", + ); + deepStrictEqual( + counter.attributes["webfinger.resource.scheme"], + "acct", + ); + deepStrictEqual( + "activitypub.remote.host" in counter.attributes, + false, + "a malformed acct resource has no usable remote host", + ); + }, + ); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - () => - new Promise((resolve) => { - const timeoutId = setTimeout(() => { - resolve({ body: expected }); - }, 500); - - return () => clearTimeout(timeoutId); - }), + await t.step( + "records result=invalid when the redirect chain exceeds maxRedirection", + async () => { + fetchMock.removeRoutes(); + // The redirect Location drops the original `?resource=...` query + // string, so the second hop's URL no longer contains a `?`. The + // route pattern omits the trailing `?` so it still matches. + fetchMock.get( + "begin:https://example.com/.well-known/webfinger", + { + status: 302, + headers: { Location: "/.well-known/webfinger" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + const result = await withTimeout( + () => + lookupWebFinger("acct:johndoe@example.com", { + meterProvider, + maxRedirection: 3, + }), + 2000, + ); + deepStrictEqual(result, null); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "invalid", + ); + deepStrictEqual(counter.attributes["http.response.status_code"], 302); + deepStrictEqual( + counter.attributes["activitypub.remote.host"], + "example.com", + ); + }, ); - await t.step("cancellation with immediate abort", async () => { - // Test starting a request with an already aborted AbortController - const controller = new AbortController(); - controller.abort(); + await t.step( + "records result=invalid for cross-protocol redirects", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "ftp://example.com/" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "invalid", + ); + deepStrictEqual(counter.attributes["http.response.status_code"], 302); + }, + ); - // Use a signal that was already aborted before starting the request - const result = await lookupWebFinger("acct:johndoe@example.com", { - signal: controller.signal, - }); - deepStrictEqual(result, null); - }); + await t.step( + "records result=network_error when a redirect points to a private address", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "https://localhost/" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "network_error", + ); + deepStrictEqual( + counter.attributes["activitypub.remote.host"], + "localhost", + "remote.host reflects the latest URL we attempted, even after a redirect", + ); + }, + ); - fetchMock.removeRoutes(); - fetchMock.get( - "begin:https://example.com/.well-known/webfinger?", - { body: expected }, + await t.step( + "records result=invalid for malformed Location headers", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "begin:https://example.com/.well-known/webfinger?", + { + status: 302, + headers: { Location: "http://[bad" }, + }, + ); + const [meterProvider, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.lookup.result"], + "invalid", + ); + deepStrictEqual(counter.attributes["http.response.status_code"], 302); + deepStrictEqual( + counter.attributes["activitypub.remote.host"], + "example.com", + ); + }, ); - await t.step("successful request with signal", async () => { - // Test successful request with a normal AbortController signal - const controller = new AbortController(); - const result = await lookupWebFinger("acct:johndoe@example.com", { - signal: controller.signal, - }); - deepStrictEqual(result, expected); - }); + await t.step( + "buckets unknown resource schemes as 'other' to keep metric cardinality bounded", + async () => { + // Lookups whose redirect chain ends on an unusual scheme (or a + // resource the caller passes with a non-fediverse scheme) must + // not leak that scheme into the metric attribute. + fetchMock.removeRoutes(); + const [meterProvider, recorder] = createTestMeterProvider(); + // `ssh:` is not a WebFinger scheme; lookupWebFingerInternal will + // attempt to build a host from the URL, fail, and return null. + // The metric still records, and its scheme attribute must be + // bucketed as `other`. + await lookupWebFinger("ssh://example.com/foo", { meterProvider }); + const counter = recorder.getMeasurement("webfinger.lookup"); + ok(counter != null); + deepStrictEqual( + counter.attributes["webfinger.resource.scheme"], + "other", + ); + }, + ); + await t.step( + "omits measurements when no meterProvider is provided", + async () => { + fetchMock.removeRoutes(); + fetchMock.get( + "https://example.com/.well-known/webfinger?resource=acct%3Ajohndoe%40example.com", + { body: expected }, + ); + const [_unused, recorder] = createTestMeterProvider(); + await lookupWebFinger("acct:johndoe@example.com"); + deepStrictEqual( + recorder.getMeasurements("webfinger.lookup").length, + 0, + ); + deepStrictEqual( + recorder.getMeasurements("webfinger.lookup.duration").length, + 0, + ); + }, + ); + } finally { + fetchMock.removeRoutes(); fetchMock.hardReset(); - }, + } }); // cSpell: ignore johndoe diff --git a/packages/webfinger/src/lookup.ts b/packages/webfinger/src/lookup.ts index 276d43937..ac5238c50 100644 --- a/packages/webfinger/src/lookup.ts +++ b/packages/webfinger/src/lookup.ts @@ -6,6 +6,10 @@ import { } from "@fedify/vocab-runtime"; import { getLogger } from "@logtape/logtape"; import { + type Attributes, + type Counter, + type Histogram, + type MeterProvider, SpanKind, SpanStatusCode, trace, @@ -18,6 +22,114 @@ const logger = getLogger(["fedify", "webfinger", "lookup"]); const DEFAULT_MAX_REDIRECTION = 5; +/** + * The terminal classification of an outgoing {@link lookupWebFinger} call, + * recorded as the `webfinger.lookup.result` attribute on the + * `webfinger.lookup` counter and `webfinger.lookup.duration` histogram. + * + * - `found`: the lookup returned a {@link ResourceDescriptor}. + * - `not_found`: the remote responded with HTTP `404 Not Found` or + * `410 Gone`. Recorded together with `http.response.status_code`. + * - `invalid`: the remote responded with content Fedify could not parse + * into a {@link ResourceDescriptor} (JSON parse failure), the + * redirect chain exceeded `maxRedirection`, the remote redirected to + * a different protocol, or the queried `acct:` resource itself was + * malformed. + * - `network_error`: no HTTP response was received (the URL was + * rejected as a private address, `fetch()` threw, or an `AbortError` + * cancelled the request). + * - `error`: the remote responded with a non-2xx HTTP response that is + * neither `404` nor `410`, or any other unexpected failure bubbled up + * from the lookup. + * @since 2.3.0 + */ +export type WebFingerLookupResult = + | "found" + | "not_found" + | "invalid" + | "network_error" + | "error"; + +interface WebFingerInstruments { + lookup: Counter; + lookupDuration: Histogram; +} + +const WEBFINGER_HISTOGRAM_BUCKETS: ReadonlyArray = [ + 5, + 10, + 25, + 50, + 75, + 100, + 250, + 500, + 750, + 1000, + 2500, + 5000, + 7500, + 10000, +]; + +const webFingerInstruments = new WeakMap(); + +function getWebFingerInstruments( + meterProvider: MeterProvider, +): WebFingerInstruments { + let instruments = webFingerInstruments.get(meterProvider); + if (instruments == null) { + const meter = meterProvider.getMeter(metadata.name, metadata.version); + instruments = { + lookup: meter.createCounter("webfinger.lookup", { + description: "Outgoing WebFinger lookup attempts.", + unit: "{lookup}", + }), + lookupDuration: meter.createHistogram("webfinger.lookup.duration", { + description: "Duration of outgoing WebFinger lookups.", + unit: "ms", + advice: { explicitBucketBoundaries: [...WEBFINGER_HISTOGRAM_BUCKETS] }, + }), + }; + webFingerInstruments.set(meterProvider, instruments); + } + return instruments; +} + +function getResourceScheme(resource: URL | string): string { + if (typeof resource === "string") { + const colon = resource.indexOf(":"); + return colon > 0 ? resource.substring(0, colon).toLowerCase() : ""; + } + return resource.protocol.replace(/:$/, "").toLowerCase(); +} + +// The scheme attribute is recorded on the `webfinger.lookup` metric. Even +// though most call sites pass scheme-controlled resources (Fedify code and +// library users), `lookupObject()` accepts user-supplied identifiers that +// flow into here, so the metric attribute is bucketed to the schemes +// WebFinger / fediverse clients legitimately use (RFC 7565 + +// ActivityPub). Anything else is bucketed as `other`, keeping metric +// cardinality bounded even when a remote returns redirects whose target +// scheme is unusual. +const WEBFINGER_LOOKUP_SCHEME_WHITELIST: ReadonlySet = new Set([ + "acct", + "http", + "https", + "mailto", +]); + +function getMetricResourceScheme(scheme: string): string { + return WEBFINGER_LOOKUP_SCHEME_WHITELIST.has(scheme) ? scheme : "other"; +} + +interface WebFingerLookupOutcome { + resource: ResourceDescriptor | null; + result: WebFingerLookupResult; + statusCode?: number; + remoteHost?: string; +} + /** * Options for {@link lookupWebFinger}. * @since 1.3.0 @@ -54,6 +166,16 @@ export interface LookupWebFingerOptions { */ tracerProvider?: TracerProvider; + /** + * The OpenTelemetry meter provider used to record the `webfinger.lookup` + * counter and `webfinger.lookup.duration` histogram. If omitted, no + * metric measurements are emitted (the helper is opt-in to avoid + * touching the global meter provider for callers that do not use + * OpenTelemetry). + * @since 2.3.0 + */ + meterProvider?: MeterProvider; + /** * AbortSignal for cancelling the request. * @since 1.8.0 @@ -77,24 +199,35 @@ export async function lookupWebFinger( metadata.name, metadata.version, ); + const scheme = getResourceScheme(resource); return await tracer.startActiveSpan( "webfinger.lookup", { kind: SpanKind.CLIENT, attributes: { "webfinger.resource": resource.toString(), - "webfinger.resource.scheme": typeof resource === "string" - ? resource.replace(/:.*$/, "") - : resource.protocol.replace(/:$/, ""), + "webfinger.resource.scheme": scheme, }, }, async (span) => { + const meterProvider = options.meterProvider; + const start = meterProvider == null ? 0 : performance.now(); + // Initialise the outcome with the `error` shape that the `finally` + // block records when `lookupWebFingerInternal()` itself rejects; + // the `try` body reassigns this to the actual outcome before any + // other statement runs, so the `catch` does not need to reassign. + let outcome: WebFingerLookupOutcome = { + resource: null, + result: "error", + }; try { - const result = await lookupWebFingerInternal(resource, options); + outcome = await lookupWebFingerInternal(resource, options); span.setStatus({ - code: result === null ? SpanStatusCode.ERROR : SpanStatusCode.OK, + code: outcome.resource === null + ? SpanStatusCode.ERROR + : SpanStatusCode.OK, }); - return result; + return outcome.resource; } catch (error) { span.setStatus({ code: SpanStatusCode.ERROR, @@ -102,24 +235,73 @@ export async function lookupWebFinger( }); throw error; } finally { + if (meterProvider != null) { + const durationMs = Math.max(0, performance.now() - start); + recordWebFingerLookup(meterProvider, durationMs, scheme, outcome); + } span.end(); } }, ); } +function recordWebFingerLookup( + meterProvider: MeterProvider, + durationMs: number, + scheme: string, + outcome: WebFingerLookupOutcome, +): void { + const attributes: Attributes = { + "webfinger.lookup.result": outcome.result, + "webfinger.resource.scheme": getMetricResourceScheme(scheme), + }; + if (outcome.remoteHost != null) { + attributes["activitypub.remote.host"] = outcome.remoteHost; + } + if (outcome.statusCode != null) { + attributes["http.response.status_code"] = outcome.statusCode; + } + const instruments = getWebFingerInstruments(meterProvider); + instruments.lookup.add(1, attributes); + instruments.lookupDuration.record(durationMs, attributes); +} + async function lookupWebFingerInternal( resource: URL | string, options: LookupWebFingerOptions = {}, -): Promise { +): Promise { if (typeof resource === "string") resource = new URL(resource); let protocol = "https:"; let server: string; - if (resource.protocol === "acct:") { + if (resource.protocol === "acct:" || resource.protocol === "mailto:") { + // `acct:` (RFC 7565) and `mailto:` (RFC 6068, used as a WebFinger + // resource per RFC 7033 §4.5) are opaque-path schemes: their + // `user@host` authority lives in `pathname`, not in `host`. The + // WebFinger host is extracted from the substring after the last + // `@`, and the lookup always goes to https on that host. const atPos = resource.pathname.lastIndexOf("@"); - if (atPos < 0) return null; + if (atPos < 0) return { resource: null, result: "invalid" }; server = resource.pathname.substring(atPos + 1); - if (server === "") return null; + // The authority part of both schemes must be a bare host: no + // path, query, or fragment characters embedded in it. The + // WHATWG URL parser routes anything after the first `?` or `#` + // into `search` / `hash`, so by the time we read `pathname` the + // only stray characters that can land in `server` are slashes. + // Reject those (along with an empty authority) for both schemes. + if (server === "" || /[/?#]/.test(server)) { + return { resource: null, result: "invalid" }; + } + // `acct:` (RFC 7565 §3) is bare authority only: no `search` or + // `hash` allowed. `mailto:` (RFC 6068 §2) explicitly permits + // `?hfields=…` header fields and fragment identifiers, so we + // forward those to the remote WebFinger lookup unchanged and + // only enforce the stricter shape for `acct:`. + if ( + resource.protocol === "acct:" && + (resource.search !== "" || resource.hash !== "") + ) { + return { resource: null, result: "invalid" }; + } } else { protocol = resource.protocol; server = resource.host; @@ -128,6 +310,7 @@ async function lookupWebFingerInternal( url.searchParams.set("resource", resource.href); let redirected = 0; while (true) { + const remoteHost = url.hostname; logger.debug( "Fetching WebFinger resource descriptor from {url}...", { url: url.href }, @@ -142,7 +325,7 @@ async function lookupWebFingerInternal( "Invalid URL for WebFinger resource descriptor: {error}", { error: e }, ); - return null; + return { resource: null, result: "network_error", remoteHost }; } throw e; } @@ -163,7 +346,7 @@ async function lookupWebFingerInternal( "Failed to fetch WebFinger resource descriptor: {error}", { url: url.href, error }, ); - return null; + return { resource: null, result: "network_error", remoteHost }; } if ( response.status >= 300 && response.status < 400 && @@ -171,18 +354,44 @@ async function lookupWebFingerInternal( ) { redirected++; const maxRedirection = options.maxRedirection ?? DEFAULT_MAX_REDIRECTION; - if (redirected >= maxRedirection) { + // `maxRedirection: N` is documented as "the maximum number of + // redirections to follow", so the Nth redirect must still be + // followed and the (N+1)th rejected. An earlier version used + // `>=` here, which drifted by one from the documented semantics + // and from the sibling code in @fedify/vocab-runtime's document + // loader. + if (redirected > maxRedirection) { logger.error( "Too many redirections ({redirections}) while fetching WebFinger " + "resource descriptor.", { redirections: redirected }, ); - return null; + return { + resource: null, + result: "invalid", + statusCode: response.status, + remoteHost, + }; + } + let redirectedUrl: URL; + try { + redirectedUrl = new URL( + response.headers.get("Location")!, + response.url == null || response.url === "" ? url : response.url, + ); + } catch (e) { + logger.error( + "Invalid Location header while following WebFinger redirect: " + + "{error}", + { url: url.href, error: e }, + ); + return { + resource: null, + result: "invalid", + statusCode: response.status, + remoteHost, + }; } - const redirectedUrl = new URL( - response.headers.get("Location")!, - response.url == null || response.url === "" ? url : response.url, - ); if (redirectedUrl.protocol !== url.protocol) { logger.error( "Redirected to a different protocol ({protocol} to " + @@ -193,7 +402,12 @@ async function lookupWebFingerInternal( redirectedProtocol: redirectedUrl.protocol, }, ); - return null; + return { + resource: null, + result: "invalid", + statusCode: response.status, + remoteHost, + }; } url = redirectedUrl; continue; @@ -207,17 +421,34 @@ async function lookupWebFingerInternal( statusText: response.statusText, }, ); - return null; + const isNotFound = response.status === 404 || response.status === 410; + return { + resource: null, + result: isNotFound ? "not_found" : "error", + statusCode: response.status, + remoteHost, + }; } try { - return await response.json() as ResourceDescriptor; + const parsed = await response.json() as ResourceDescriptor; + return { + resource: parsed, + result: "found", + statusCode: response.status, + remoteHost, + }; } catch (e) { if (e instanceof SyntaxError) { logger.debug( "Failed to parse WebFinger resource descriptor as JSON: {error}", { error: e }, ); - return null; + return { + resource: null, + result: "invalid", + statusCode: response.status, + remoteHost, + }; } throw e; }