feat: Add setting to force enable/disable text index support for a source#2277
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🦋 Changeset detectedLatest commit: 4d1ece1 The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
🔴 Tier 4 — CriticalTouches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD. Why this tier:
Review process: Deep review from a domain expert. Synchronous walkthrough may be required. Stats
|
PR Review
|
Deep Review✅ No critical issues found. The diff is additive (a new optional enum field on 🟡 P2 -- recommended
🔵 P3 nitpicks (7)
Reviewers (10): correctness, testing, maintainability, project-standards, performance, api-contract, adversarial, kieran-typescript, agent-native, learnings-researcher. Testing gaps:
|
E2E Test Results✅ All tests passed • 179 passed • 3 skipped • 1279s
Tests ran across 4 shards in parallel. |
583106b to
4d1ece1
Compare
4d1ece1 to
08d968c
Compare
| useTextIndexForImplicitColumn: | ||
| isLogSource(source) || isTraceSource(source) | ||
| ? source.useTextIndexForImplicitColumn | ||
| : undefined, |
There was a problem hiding this comment.
I understand why we have to pass around useTextIndexForImplicitColumn everywhere, but we really ought to come up with a more scalable solution.
Potential idea:
- Rendering functions have sources passed down to them
- For charts querying a raw table, we define rules for each of these settings and when to apply them. ex: apply
useTextIndexForImplicitColumnif FROM matchessource.from.tableNameand current WHERE node is - For MV optimizations we apply settings on a case by case basis, depending on the MV definition and if it makes sense.
There was a problem hiding this comment.
I agree! The same applies to timestampValueExpression and implicitColumnExpression, and probably some others like source filters in the EE repo. I think passing source down to the rendering functions is probably the way to go - I will ticket that up as a tech debt item if that's alright with you (since this change in particular is following the pattern established by implicitColumnExpression)
Summary
This PR adds a new source setting which allows for force-enabling or force-disabling text index support (generation of hasAllToken conditions for lucene implicit column search) for the source.
This is useful when
The default and recommended behaviour is still
auto- where the index is detected at query time.Screenshots or video
Merge engine based schema
With this schema, using the default (auto) setting will result in
hasToken(lower(Body), token)conditions which don't use the text index effectively due to thelower()function:When force-enabled, the same query will now use the correct hasAllToken syntax:
How to test on Vercel preview
This can be tested locally by creating a schema similar to the one provided above.
References