fix(udf): l2_distance returns L2sq to match USearch and DuckDB#8
fix(udf): l2_distance returns L2sq to match USearch and DuckDB#8anoop-narang merged 1 commit intomainfrom
Conversation
1105494 to
8c30d27
Compare
8c30d27 to
b78963f
Compare
| .map(|(x, y)| (x - y) * (x - y)) | ||
| .sum::<f32>() | ||
| .sqrt() | ||
| } |
There was a problem hiding this comment.
P2 (non-blocking): No comment here explaining that the sqrt omission is intentional. A future maintainer may add .sqrt() back thinking it's a missing step, silently reintroducing the inconsistency. Consider:
| } | |
| // Returns L2sq (no sqrt) — matches USearch MetricKind::L2sq and keeps numeric | |
| // values consistent between the UDF path and the optimizer-rewritten index path. | |
| } |
There was a problem hiding this comment.
Review
Issues
P1 — No numeric regression test (tests/execution.rs)
The entire motivation for this PR is fixing a numeric inconsistency (UDF returned L2, index returned L2sq). However, the test suite only asserts on ordering, not on actual distance values. This means the fix has zero regression coverage: if someone adds .sqrt() back to l2_kernel, every test will still pass.
A minimal test would cover the gap — e.g., using a non-trivial query where L2 ≠ L2sq:
// [3,4,0,0] vs [0,0,0,0] → L2sq = 25.0, L2 = 5.0
// Assert the returned value is 25.0, not 5.0
SELECT l2_distance(vector, ARRAY[3.0::float, 4.0::float, 0.0::float, 0.0::float]) AS d
FROM items WHERE id = 3Without this, the consistency guarantee introduced by the fix is not tested.
Action Required
Add at least one test that asserts a specific numeric value for l2_distance so the squared-L2 semantics are regression-protected.
l2_distance UDF was computing actual L2 (with sqrt) while USearch and the rewritten execution paths all use L2sq (no sqrt). This caused the same query to return different numeric distance values depending on whether the optimizer rewrote it. Remove sqrt to match USearch's MetricKind::L2sq and DuckDB VSS's array_distance behavior. All paths now return consistent L2sq values.
b78963f to
04cfd4d
Compare
Summary
.sqrt()froml2_kernelsol2_distanceUDF returns squared L2, matching USearch'sMetricKind::L2sqand DuckDB VSS'sarray_distanceTest plan
cargo fmt --checkcargo clippy -- -D warningscargo test— all 32 tests pass (no tests assert on specific distance values, only ordering)