Open
Labels
A-sql-optimizer: SQL logical planning and optimizations.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-support: Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs
T-sql-queries: SQL Queries Team
Description
When a filter on a non-numeric type (string, bytes, uuid, inet) only partially overlaps a histogram bucket, the row-count estimate should be lower than the total number of rows in the bucket. Currently, the logic that handles this assumes that data values are uniformly distributed across the first 8 bytes (ignoring any common prefix):
cockroach/pkg/sql/sem/tree/datumrange/range.go
Lines 180 to 182 in 3209e33
    case types.StringFamily, types.BytesFamily, types.UuidFamily, types.INetFamily:
        // For non-numeric types, convert the datums to encoded keys to
        // approximate the range. We utilize an array to reduce repetitive code.
This likely works well for UUID columns, but can cause catastrophic underestimates for STRING columns, which are often clustered around certain values. A common example is when the STRING column represents a path. We should consider relaxing the uniformity assumption for non-UUID types.
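To make the failure mode concrete, here is a minimal sketch of the kind of estimate described above. It is not the code in range.go: the function names (`commonPrefixLen`, `first8`, `uniformFraction`) are hypothetical, and it works on raw strings rather than encoded keys. It strips the common prefix, reads the next 8 bytes of each bound as a big-endian integer, and takes the ratio of the filter width to the bucket width.

```go
package main

import "encoding/binary"

// commonPrefixLen returns the length of the longest prefix shared by all vals.
func commonPrefixLen(vals ...string) int {
	if len(vals) == 0 {
		return 0
	}
	n := len(vals[0])
	for _, v := range vals[1:] {
		if len(v) < n {
			n = len(v)
		}
		for i := 0; i < n; i++ {
			if v[i] != vals[0][i] {
				n = i
				break
			}
		}
	}
	return n
}

// first8 skips the first `skip` bytes of s and interprets the next 8 bytes
// as a big-endian uint64, zero-padding short inputs.
func first8(s string, skip int) uint64 {
	var buf [8]byte
	copy(buf[:], s[skip:])
	return binary.BigEndian.Uint64(buf[:])
}

// uniformFraction estimates which fraction of the bucket [lo, hi] is covered
// by the filter range [fLo, fHi], ASSUMING values are uniformly distributed
// over the 8 bytes that follow the common prefix. Hypothetical helper.
func uniformFraction(lo, hi, fLo, fHi string) float64 {
	skip := commonPrefixLen(lo, hi, fLo, fHi)
	l, h := first8(lo, skip), first8(hi, skip)
	a, b := first8(fLo, skip), first8(fHi, skip)
	if h <= l || b <= a {
		return 0
	}
	return float64(b-a) / float64(h-l)
}
```

With a bucket spanning "/a" to "/z" and a filter spanning "/home/a" to "/home/b", `uniformFraction` returns roughly 3.6e-14, so a bucket of one million rows would be estimated at far less than one row even if nearly all of its rows live under "/home/". A filter whose bounds first differ after the eighth post-prefix byte (for example "/home/user1/a" to "/home/user1/z") collapses to a fraction of 0 in this sketch.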
Jira issue: CRDB-57841
Status: Triage