Skip to content

feat: filters are not search aware by default; accelerated by MV#2272

Merged
kodiakhq[bot] merged 16 commits into
mainfrom
aaron/filters-mv-usage
May 20, 2026
Merged

feat: filters are not search aware by default; accelerated by MV#2272
kodiakhq[bot] merged 16 commits into
mainfrom
aaron/filters-mv-usage

Conversation

@knudtty
Copy link
Copy Markdown
Contributor

@knudtty knudtty commented May 13, 2026

Summary

Filters now are not search aware by default. They can/will use metadata MVs if available.

How to test on local

Steps:

  1. Start up a fresh HyperDX Instance via yarn dev
  2. View the filters loading faster
  3. Adjust the filters value in the team settings

References

  • Linear Issue: Closes HDX-4227

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 13, 2026

🦋 Changeset detected

Latest commit: eb53d1e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@hyperdx/common-utils Patch
@hyperdx/api Patch
@hyperdx/app Patch
@hyperdx/otel-collector Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel
Copy link
Copy Markdown

vercel Bot commented May 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 20, 2026 7:51pm

Request Review

@github-actions github-actions Bot added the review/tier-3 Standard — full human review required label May 13, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (1):
    • packages/api/src/routers/api/team.ts
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 17
  • Production lines changed: 760 (+ 501 in test files, excluded from tier calculation)
  • Branch: aaron/filters-mv-usage
  • Author: knudtty

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

E2E Test Results

All tests passed • 176 passed • 3 skipped • 1257s

Status Count
✅ Passed 176
❌ Failed 0
⚠️ Flaky 6
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

PR Review

Reviewed the filters MV usage PR (1150 additions across 22 files). The design is solid — combined key+value rollup query, dual-pipeline UX toggle, settings reset via null$unset, and date-range threading through getMapKeys/getMapValues/getAllKeyValues are all well thought out. A few items worth a look before merge:

  • ⚠️ maxKeys mismatch between filters UI and useMultipleGetKeyValues → The component computes maxKeys = me?.team?.filterKeysFetchLimit ?? (hasMVs ? 100 : 20) (DBSearchPageFilters.tsx:1170-1174), but useMultipleGetKeyValues internally re-slices via keys.slice(0, maxKeys) using only DEFAULT_FILTER_KEYS_FETCH_LIMIT (20) as its fallback (useMetadata.tsx:254-255). In exact-mode with MVs enabled and no explicit team setting, the UI thinks it's fetching up to 100 keys but the hook silently caps at 20. Either share the default constant or pass maxKeys through.

  • ⚠️ Silent rollup batch failure in getAllKeyValues → When the batched rollup query throws, batchResults stays [] and every key cascades to per-key getMapValues calls — only a console.warn signals the failure (metadata.ts:1393-1397). On a degraded ClickHouse, this turns one failed query into N main-table scans with no user-visible indication. Consider either surfacing the error or guarding the fallback behind a clearer condition (e.g. only fall back per-key when the row simply isn't in the rollup, not on a thrown error).

  • ⚠️ useAllFieldsAndValues typed as any at the React Query level → The hook signature uses UseQueryOptions<any, Error> (useMetadata.tsx:876) but the actual useQuery generic is { key: string; value: string[] }[]. Tighten the options type to match (UseQueryOptions<{ key: string; value: string[] }[], Error>) so callers get correct inference on data.

  • ℹ️ e2e test removed openFilterGroup calls (saved-search.spec.ts:639) without an obvious matching default-state change in DBSearchPageFilters. If filter groups still default to collapsed in some code paths, this test will be flaky — worth a quick local run to confirm restoration now leaves groups open.

  • ℹ️ TeamClickHouseSettingsUpdateSchema accepts null on every field via .nullish() and the controller maps null$unset. Zod's default strip behavior on z.object prevents arbitrary keys, so the $set/$unset build is safe — just worth noting that any future migration adding non-resetable fields needs to opt out of .nullish() explicitly.

No blocking security issues. The two ⚠️ items (maxKeys default divergence and silent batch fallback) are the ones I'd address before merging.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

Deep Review

This PR makes filters non-search-aware by default (localStorage hdx-show-all-filter-values defaults true), batches getAllKeyValues across keys, adds getAllFieldsAndValues rollup query, threads timestampValueExpression/dateRange through filter and autocomplete pipelines, and lets the team-settings PATCH endpoint use null to reset a value. The MV-accelerated path is well-shaped; the main risks cluster around what happens when MVs are absent, partial, or failing.

🔴 P0/P1 -- must fix

  • packages/app/src/components/DBSearchPageFilters.tsx:1271-1297 -- With showAllValues=true (the new default) and no MVs configured, facetsChartConfig strips where: '' and filters: [], so useGetKeyValues now issues per-key DISTINCT scans over the full date range stripped of the user's search, instead of the previously WHERE-scoped query.
    • Fix: Gate the WHERE/filters strip behind hasMVs so accounts without MVs keep the prior search-aware facet query, or rate-limit / shrink keysToFetch when stripping and there are no MVs.
    • adversarial, performance, maintainability
  • packages/common-utils/src/core/metadata.ts:1397-1437 -- When the batched rollup query throws (catch swallows the error) or returns rows for only some keys, the fallback path runs Promise.all(parsed.map(p => getMapValues(...))) against the main table with no concurrency cap, fanning out to up to 100 parallel DISTINCT scans (the new MV default) on every render.
    • Fix: Bound the fallback with a concurrency limiter (e.g. p-limit ≈ 8) and short-circuit the per-key fallback when the batched query succeeded but simply returned no rows for a key.
    • correctness, performance, reliability, adversarial

🟡 P2 -- recommended

  • packages/common-utils/src/core/metadata.ts:1521-1530 -- getAllFieldsAndValues runs with timeout_overflow_mode: 'break' and a 30s execution cap, so a timed-out query returns truncated rows that then sit in the in-memory cache and React Query's 5-minute staleTime with no error surfaced.
    • Fix: Switch to timeout_overflow_mode: 'throw' (or inspect the response statistics and skip caching truncated results) so a partial response degrades to an error the UI can surface instead of being cached as the source of truth.
    • performance, reliability
  • packages/common-utils/src/core/metadata.ts:1513-1525 -- getAllFieldsAndValues applies LIMIT ${maxKeys} after ORDER BY ColumnIdentifier = 'NativeColumn' DESC, ColumnIdentifier, Key, so when the rollup has more native columns than maxKeys, every map sub-field (ResourceAttributes['*'], LogAttributes['*']) is dropped from the filter UI.
    • Fix: Either interleave native and map sources within the maxKeys budget (e.g. allocate per-column quotas) or order so native and map keys are mixed and document the budget split in the team-settings tooltip.
    • adversarial
  • packages/common-utils/src/core/metadata.ts:1462-1535 -- getAllFieldsAndValues has no try/catch and no exact-pipeline fallback, so a rollup error (schema drift, transient ClickHouse failure) surfaces directly to React Query and DBSearchPageFilters shows zero filters until the user manually toggles Show All Values off.
    • Fix: On error in useAllFieldsAndValues, fall back to the exact pipeline automatically, or wrap the call so the MV failure degrades silently to useAllFields + useGetKeyValues.
    • reliability
  • packages/app/src/components/DBSearchPageFilters.tsx:1149-1265 -- Local dateRange state is now threaded into both useAllFields and useAllFieldsAndValues, but the sync effect at line 1261 only fires when !isLive, so during live tail the filter pipeline stays pinned to the dateRange captured when live tail began while the search query advances with NOW.
    • Fix: Refresh the filter pipeline's dateRange on a heartbeat during live tail, or pass chartConfig.dateRange directly to the field/value hooks and keep the local frozen dateRange only for the exact-pipeline useGetKeyValues call where it was used pre-PR.
    • adversarial
  • packages/app/src/components/DBSearchPageFilters.tsx:1300 -- mvFieldsAndValues flows straight into facets without applying the ['body', 'timestamp', '_hdx_body'] exclusion that keysToFetch enforces for the exact pipeline, so the MV pipeline can render Body and Timestamp as filterable high-cardinality keys.
    • Fix: Apply the same lowercase-name exclusion filter to mvFieldsAndValues before assigning to facets.
    • adversarial
  • packages/api/src/controllers/team.ts:95-115 -- The new $set/$unset partition in updateTeamClickhouseSettings is the load-bearing behavior of this PR's null-to-unset feature, but neither packages/api/src/controllers/__tests__/team.test.ts nor the router tests exercise it for null-only, mixed, or all-undefined bodies.
    • Fix: Add controller-level tests that mock Team.findByIdAndUpdate and assert the exact update argument for { k: null }, { k: v }, { a: 1, b: null }, and { k: undefined } inputs.
    • testing, api-contract, project-standards
  • packages/api/src/routers/api/team.ts:127 -- When the client sends { field: null } the controller $unsets the field, then the router runs pick(team, Object.keys(req.body)) which yields {} because the field is no longer on the document, so the response cannot distinguish "reset acknowledged" from "no-op".
    • Fix: Return the resolved value (the new default) on a null update rather than the absent key, or echo { field: null } so the client can detect the reset.
    • api-contract
🔵 P3 nitpicks (13)
  • packages/common-utils/src/core/metadata.ts:544-631 -- getJSONKeys returns [] on its first executable line per HDX-2480 but the PR threads dateRange/timestampValueExpression into the unreachable body and adds an AND-joined WHERE clause that no test exercises.
    • Fix: Revert the body changes until JSON keys are re-enabled, or replace the early return with a feature flag so the new code at least sometimes runs.
    • correctness, testing, maintainability, api-contract
  • packages/common-utils/src/core/metadata.ts:367-380 -- dateRangeCacheSuffix interpolates as the empty string when omitted, yielding cache keys like ${conn}.${db}.${table}.${column}..keys with a literal double dot.
    • Fix: Build the cache key by joining only the non-empty segments, e.g. [connectionId, databaseName, tableName, column, dateRangeCacheSuffix, 'keys'].filter(Boolean).join('.').
  • packages/common-utils/src/core/metadata.ts:374-512 -- The single shared cacheKey in getMapKeys is used for both the rollup getOrFetch and the main-table fallback getOrFetch, so an empty rollup result is cached as [] and the main-table fallback never runs on subsequent calls within the cache lifetime.
    • Fix: Differentiate the cache keys for rollup vs main-table branches, or refrain from caching [] from the rollup branch when a fallback would otherwise have produced data.
  • packages/app/src/components/DBSearchPageFilters.tsx:1118-1305, 1332-1394 -- The useExactPipeline boolean is consulted at six branch points (hook enables, data, isLoading, error, isFacetsLoading, isFacetsFetching, and the loadMoreFilterValuesForKey branch) which duplicates pipeline-selection logic and re-derives source/MV resolution inside the load-more callback.
    • Fix: Extract a single useFilterFacets({ pipeline }) hook that returns a unified { facets, isLoading, isFetching, error, loadMore } shape so the component has no useExactPipeline ternaries.
  • packages/app/src/components/SearchInput/SearchWhereInput.tsx:121-240 -- sourceId is now drilled through SearchWhereInputSearchInputV2 / SQLInlineEditoruseAutoCompleteOptions / useMultipleAllFields, and three components call useSource({ id: sourceId }) independently to extract a single string.
    • Fix: Pass timestampValueExpression?: string directly since it is the only field consumed, or introduce a lightweight SourceContext.Provider so descendants pull it from context without re-fetching.
  • packages/app/src/defaults.ts:7-8 -- DEFAULT_FILTER_KEYS_FETCH_LIMIT_WITH_MVS = 100 is applied as the fallback only in DBSearchPageFilters, while useGetKeyValues still falls back to DEFAULT_FILTER_KEYS_FETCH_LIMIT = 20, so an account with no team override gets different effective limits at different call sites despite the TeamSettings description promising one default-with-MVs value.
    • Fix: Add getDefaultFilterKeysLimit(hasMVs: boolean): number to defaults.ts and call it consistently in both DBSearchPageFilters and useGetKeyValues.
  • packages/app/src/hooks/useMetadata.tsx:401-419 -- useAllFieldsAndValues types its options as Omit<UseQueryOptions<any, Error>, 'queryKey'> while the body annotates the data shape as { key: string; value: string[] }[], so callbacks passed via options (select, placeholderData, etc.) lose type checking.
    • Fix: Type options as Omit<UseQueryOptions<{ key: string; value: string[] }[], Error>, 'queryKey'>.
  • packages/api/src/controllers/team.ts:95-115 -- updateTeamClickhouseSettings declares $set, $unset, and update as Record<string, any>, losing the schema correlation between keys and values that the new partition is built to enforce.
    • Fix: Type the partitions as Partial<TeamClickHouseSettings> for $set and Partial<Record<keyof TeamClickHouseSettings, ''>> for $unset, with update as UpdateQuery<TeamDocument>.
  • packages/app/src/hooks/useMetadata.tsx:127-158 -- The useMultipleAllFields queryKey changed from two flat getTime() numbers to a nested [start, end] tuple plus timestampValueExpression, so every previously cached entry becomes a permanent miss after deploy.
    • Fix: Either accept the one-time cache-miss avalanche and note it in the PR description, or keep the flat shape (...dateRange ? [start.getTime(), end.getTime()] : [undefined, undefined], timestampValueExpression).
  • packages/common-utils/src/types.ts:1161, 1171 -- TeamClickHouseSettingsSchema and TeamClickHouseSettingsUpdateSchema now coexist with near-identical shapes and overlapping names, making it easy for a future contributor to validate a new mutation route with the wrong schema.
    • Fix: Document with comments that the original schema is for response/projection and the update schema is for mutation input, or replace the original with a .partial() derivation of the update schema.
  • packages/app/src/components/DBSearchPageFilters.tsx:1126-1131 -- useLocalStorage('hdx-show-all-filter-values', true) flips the filter pane to "show all values" for every existing user on first render after deploy with no UX onboarding, which on accounts without MVs converts every filter render into a stripped-WHERE facet scan.
    • Fix: Default the localStorage value to false, or ship a one-time announcement so users understand why the filter behavior changed.
  • packages/api/src/controllers/team.ts:110-115 -- When settings = {} both partitions are empty, update = {}, and Team.findByIdAndUpdate(teamId, {}, ...) is a silent no-op that still returns the document, so the route currently relies on a router-level early-return for empty-body safety.
    • Fix: Short-circuit at the controller level too, e.g. if (!Object.keys($set).length && !Object.keys($unset).length) return Team.findById(teamId);.
  • packages/app/src/components/DBSearchPageFilters.tsx:1332-1387 -- loadMoreFilterValuesForKey closes over useExactPipeline at call time, so a user who toggles Show All Values while a load-more is in flight can have the MV-derived values merged into extraFacets while the rest of the UI is rendering the exact pipeline.
    • Fix: Abort the in-flight request when useExactPipeline flips, or tag merged values with the pipeline that produced them and ignore mismatches.

Reviewers (9): correctness, testing, maintainability, project-standards, performance, api-contract, reliability, adversarial, kieran-typescript.

Testing gaps:

  • No backend test exercises the $set/$unset partition for null-only, mixed, undefined-only, and empty bodies.
  • No unit test covers getAllKeyValues fallback when the batched rollup throws or returns rows for only a subset of keys.
  • No test asserts useAllFieldsAndValues enabled-gating on fieldMetadataDisabled or cache invalidation when dateRange/maxKeys change.
  • No component test exercises the showAllValues toggle, useExactPipeline branch selection, or the where/filters stripping in all mode.
  • getMapKeys/getMapValues time-filter tests assert presence of a placeholder __TIME_FILTER__ token but do not verify the AND placement between value != ''/MetricName= and the time filter.
  • The e2e saved-search test dropped both openFilterGroup calls without asserting the new auto-expand behavior that replaced them.

@github-actions github-actions Bot added review/tier-4 Critical — deep review + domain expert sign-off and removed review/tier-3 Standard — full human review required labels May 14, 2026
@knudtty knudtty requested review from a team and teeohhem and removed request for a team May 15, 2026 15:39
@teeohhem
Copy link
Copy Markdown
Contributor

@knudtty looks to be a real E2E failure here

Comment thread packages/common-utils/src/types.ts Outdated
Comment thread packages/app/src/components/DBSearchPageFilters.tsx Outdated
Comment thread packages/common-utils/src/core/metadata.ts Outdated
Comment thread packages/common-utils/src/core/metadata.ts
Comment thread packages/common-utils/src/core/metadata.ts
Comment thread packages/common-utils/src/core/metadata.ts
Comment thread packages/app/src/api.ts Outdated
Comment thread packages/app/src/components/DBSearchPageFilters.tsx
@pulpdrew
Copy link
Copy Markdown
Contributor

packages/common-utils/src/core/metadata.ts:1507 -- getAllFieldsAndValues uses groupUniqArray(N)(Value), which returns N arbitrary distinct values rather than the top-N by frequency, so the default "Show All Values" filter panel surfaces random low-signal values instead of the most-common ones (the sibling getAllKeyValues path correctly uses ORDER BY sum(count) DESC LIMIT N BY ColumnIdentifier, Key).

I wonder if we could use topKWeighted to fix this

@knudtty
Copy link
Copy Markdown
Contributor Author

knudtty commented May 19, 2026

I wonder if we could use topKWeighted to fix this

I wasn't sure if we should surface most common values or not, I figured that can be an improvement later if we desire

@kodiakhq kodiakhq Bot merged commit a8eb27d into main May 20, 2026
18 checks passed
@kodiakhq kodiakhq Bot deleted the aaron/filters-mv-usage branch May 20, 2026 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants