Do not define total ordering for Datum by sgrif · Pull Request #1041 · pgdogdev/pgdog

sgrif · 2026-06-08T15:23:18Z

Datum has had total ordering defined since it was originally introduced in 000b4cd. There's no context given for why, but I'm willing to assume the answer is "because it could be derived". At one point the implementation moved from derived to manually defined with identical semantics (in a commit written by claude that has no reasoning in the commit message, so I assume wasn't thought out).

This is possible because PG does define total ordering for each of its data types. The only data type for which this isn't obviously the case is floats, and PG defines all NaNs as equal to each other and greater than non-NaNs. Rust then also defines what ordering for differing enum variants means in its PartialOrd derive, which primarily orders on discriminant.
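To make the float rule concrete, here is a minimal sketch of a comparison that treats all NaNs as equal and greater than every non-NaN value. The function name `pg_float_cmp` is hypothetical and is not taken from the pgdog codebase.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: compare two floats with PG's ordering rule,
/// where NaN equals NaN and sorts above every non-NaN value.
fn pg_float_cmp(a: f64, b: f64) -> Ordering {
    match (a.is_nan(), b.is_nan()) {
        (true, true) => Ordering::Equal,
        (true, false) => Ordering::Greater,
        (false, true) => Ordering::Less,
        // Neither side is NaN here, so partial_cmp always returns Some.
        (false, false) => a
            .partial_cmp(&b)
            .expect("non-NaN floats are always comparable"),
    }
}
```

With this rule every pair of floats has an answer, which is what makes a total order over float datums possible in the first place.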

In both PartialEq and PartialOrd, we diverge from PG's behavior in that we never consider cross-type comparisons, while PG does have several opclasses that do not require the lhs and rhs to be the same type. This is probably fine, as I believe the only ways we could ever end up comparing Datums of differing types expecting a meaningful answer are:

- The datum variant of a column changes across rows (impossible)
- The datum variant of a column changes across shards (should be impossible)
- We are blindly comparing datums from two columns that may have differing types or against a hard coded value and didn't consider this (unlikely, but possible)

The last case is a bit of a footgun, but one I'm comfortable enough with that I don't think we need to immediately try to match PG's semantics perfectly. 1::int4 != 1.0::real is not necessarily ideal, but is at least a logical answer and one that is very hard for us to reach in our code.

Leaving PartialOrd alone, however, makes me much less comfortable. The semantics of the derived implementation would mean 1::int4 > 2.0::real, which is much more obviously wrong. And were that footgun to ever go off, it's feasible that it wouldn't be caught by a test, unless the code comparing datums considers every possible way lhs and rhs could differ in type.
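A small sketch of why the derived impl gives that answer. The `Value` enum below is a hypothetical stand-in for Datum, and the variant order (float declared before integer) is an assumption chosen to reproduce the result described above; the real variant order is not shown in this thread.

```rust
// Hypothetical stand-in for Datum. Variant names and their declaration
// order are assumptions for illustration only.
#[derive(Debug, PartialEq, PartialOrd)]
enum Value {
    Real(f32),
    Int4(i32),
}

/// Derived PartialOrd compares the variant position first and only looks
/// at the payload when both sides are the same variant.
fn int_one_beats_real_two() -> bool {
    Value::Int4(1) > Value::Real(2.0)
}
```

Under these assumptions `int_one_beats_real_two()` is true purely because `Int4` is declared after `Real`; the numeric payloads are never compared.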

Because of that, I have manually implemented PartialOrd to explicitly return None when the types differ, along with any comparison with Null returning None to reflect that PG returns NULL in that case. I opted not to have this reflect the behavior of PG's ORDER BY, which by default sorts NULLs as larger than any non-null value (NULLS LAST for ASC, NULLS FIRST for DESC), as that code handles NULLs explicitly, and I would expect the behavior of PartialOrd to match the < operator in SQL.

The PartialOrd impl was written in this more verbose way, rather than with a single _ if discriminant(self) != discriminant(other), so that any variants added to Datum in the future will fail to compile if they are not handled in the impl. (Hilariously, this meant I couldn't write (Null, _) | (_, Null) in the last arm to match the shape of the others, as the compiler correctly points out that (_, Null) is redundant as we've already handled every other possible type on the left explicitly.)
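The shape described in the two paragraphs above can be sketched as follows. This is not the impl from the PR: `Value` is a hypothetical, cut-down stand-in for Datum with only three variants, used to show the pattern of one same-type arm plus one catch-all arm per left-hand variant.

```rust
use std::cmp::Ordering;

// Hypothetical, cut-down stand-in for Datum.
#[derive(Debug, PartialEq)]
enum Value {
    Int4(i32),
    Text(String),
    Null,
}

impl PartialOrd for Value {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        use Value::*;
        match (self, other) {
            // Same type on both sides: defer to the payload's ordering.
            (Int4(a), Int4(b)) => a.partial_cmp(b),
            // Differing types: no meaningful answer.
            (Int4(_), _) => None,
            (Text(a), Text(b)) => a.partial_cmp(b),
            (Text(_), _) => None,
            // Any comparison with Null on the left is unknown. Null on the
            // right is already covered by the catch-all arms above.
            (Null, _) => None,
        }
    }
}
```

Because there is no bare wildcard arm, adding a fourth variant to `Value` makes this match non-exhaustive and the build fails until the new variant is handled.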

I have left the Eq impl alone since returning false is a reasonable answer for differing types, and the only requirement that Rust adds for Eq on top of PartialEq is that a == a, which is the case.


levkk · 2026-06-08T15:31:06Z

It was probably to support sorting (ORDER BY) in cross-shard SELECT queries.

codecov · 2026-06-08T15:32:00Z

Codecov Report

❌ Patch coverage is 34.84848% with 43 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
pgdog-postgres-types/src/datum.rs	19.44%	29 Missing ⚠️
pgdog/src/backend/pool/connection/buffer.rs	53.33%	14 Missing ⚠️

📢 Thoughts on this report? Let us know!

sgrif · 2026-06-08T15:36:08Z

> It was probably to support sorting (ORDER BY) in cross-shard SELECT queries.

Yup, I mention that in the commit message. (The test failure shows I do need to add some more explicit handling to the ordering code before this is ready to merge though)

sgrif · 2026-06-08T15:57:34Z

Well this is fun, I cannot reproduce the python failure outside of the python test. And I can see that the full matrix of ASC, ASC NULLS FIRST, DESC, and DESC NULLS LAST work in the other test cases 🤪

levkk · 2026-06-08T16:00:53Z

Might be due to encoding? Python tests might be using asyncpg which uses binary. Most of our other tests use text I think

I reversed the order in that test because it seems like it cares about how we compare NULL with certain timestamps, and not that we support NULLS FIRST|LAST specifically

sgrif · 2026-06-08T16:57:27Z

For posterity: the Python test was correct. The other test I was messing with was trying to decode timestamptz into NaiveDateTime, which sqlx doesn't support, and it did so in a way that converted all decoding errors into None, so it would pass regardless of the ordering.

sgrif · 2026-06-08T18:06:38Z

+                                // FIXME(sage): We don't handle ASC NULLS FIRST or
+                                // DESC NULLS LAST we should either error or add
+                                // support rather than silently do the wrong sorting
+                                match (&left.value, &right.value, asc) {


Indentation change made this look like more churn than I would have hoped. Only meaningful change in behavior is this match compared to line 109 in the old code
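For context on what explicit NULL handling in the sort path involves, here is a minimal sketch of a single-column comparator that follows PG's default null placement (NULLs sort as larger than any non-null value, so last for ASC and first for DESC). The function `order_by_cmp` and the use of `Option<i64>` for a nullable column are assumptions for illustration, not the code in buffer.rs.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator for one ORDER BY column, where `None` stands
/// for SQL NULL. It covers only the default placements, ASC NULLS LAST
/// and DESC NULLS FIRST, like the match flagged by the FIXME.
fn order_by_cmp(left: Option<i64>, right: Option<i64>, asc: bool) -> Ordering {
    let ord = match (left, right) {
        (None, None) => Ordering::Equal,
        // NULL is treated as larger than any non-null value.
        (None, Some(_)) => Ordering::Greater,
        (Some(_), None) => Ordering::Less,
        (Some(a), Some(b)) => a.cmp(&b),
    };
    // DESC is the exact reverse, which moves NULLs to the front.
    if asc { ord } else { ord.reverse() }
}
```

Passing this to `sort_by` on `vec![Some(2), None, Some(1)]` puts the NULL last for `asc = true` and first for `asc = false`. Supporting ASC NULLS FIRST or DESC NULLS LAST would need the null placement to be decided separately from the direction.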

meskill · 2026-06-08T20:48:16Z

I'd like to bring up the same point as on the Add impl PR: the usage of default traits like Add, PartialOrd, PartialEq is confusing for Datum. The default traits should specify generic behavior that's, well, applicable in most cases. But as @sgrif mentioned, the current implementation is quite specialized and could go off if used somewhere else (and because the default traits are convenient, it's easy to run into it).
I'd go the following way: define specialized traits for the required operations inside the aggregate.rs module instead of std, implement them the way we need for aggregation, and drop the std impls entirely until we need them. Or, if we stay with std, please comment the reasoning and the use cases for the defined impls.

meskill · 2026-06-08T20:52:03Z


 /// GROUP BY <columns>
-#[derive(Hash, PartialEq, Eq, PartialOrd, Ord, Debug)]
+#[derive(Hash, PartialEq, Eq, PartialOrd, Debug)]


I don't think we need the PartialOrd here. And it's not clear how it should work actually considering that we store an index and the value here

Good catch, thought I deleted that

meskill · 2026-06-08T20:57:59Z

+#[derive(Debug, Clone, PartialOrd, PartialEq, Eq, Hash)]
 pub struct Array {
    elements: Vec<Option<Datum>>,
    element_oid: i32,
    dim: Dimension,
 }


The automatic derive for PartialOrd is questionable here, since it will compare all the fields of the structure in order, and Option has this None < Some semantic.

oh, wait, what does Option mean here? is it another representation for Datum::Null?

You're right that option is sus. I will look into it
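A short sketch of the concern raised in this thread. The struct below mirrors the shape of Array from the diff but drops `dim` and uses `i32` elements as a hypothetical stand-in for Datum, so it only demonstrates how the derive behaves, not what the real type does.

```rust
// Simplified, hypothetical version of the Array struct from the diff.
#[derive(Debug, Clone, PartialEq, PartialOrd)]
struct Array {
    elements: Vec<Option<i32>>,
    element_oid: i32,
}

/// Convenience constructor used only for this illustration.
fn array(elements: Vec<Option<i32>>, element_oid: i32) -> Array {
    Array { elements, element_oid }
}
```

With the derive, `array(vec![None], 1) < array(vec![Some(i32::MIN)], 1)` holds because `None` always sorts below `Some`, and `element_oid` is only consulted when the element vectors compare equal. If `None` is meant to represent a NULL element, that is a definite ordering where the manual Datum impl would return no answer.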

sgrif · 2026-06-08T21:35:31Z

> Or, if we stay with std, please, comment the reasoning and the use cases for the defined impls.

Is there an impl that I didn't lay out the reasoning for in this commit? Or do you mean literally a code comment? I prefer to leave "why" out of code comments as that gets out of sync with the code very quickly, while commit messages are tied to the code at a specific point in time and are a git blame away

> I'd go the following way: specialized traits for the required operations inside aggregate.rs module

I strongly disagree in this case. Add is much more clearly only something aggregation cares about. Equality and comparison are much more generally meaningful and losing access to sorting functions and data structures from the broader ecosystem is a huge loss. I feel like I laid out my reasoning about why this impl makes sense at the datum level pretty thoroughly here. Is there something more specific you disagree with?

sgrif requested a review from meskill June 8, 2026 15:23

sgrif mentioned this pull request Jun 8, 2026

Change Datum to only perform checked addition #1039

Open

Handle NULLS in ORDER BY, fix a test that was testing nothing

5b61a26

