Support chained comparisons with evaluate-once semantics#928
Support chained comparisons with evaluate-once semantics#928keyboardDrummer-bot wants to merge 8 commits intomainfrom
Conversation
Desugar Python chained comparisons (e.g., `1 < x < 10`) into pairwise comparisons chained with PAnd. For intermediate operands that are not simple names or literals, introduce temporary variables to preserve Python's evaluate-once semantics (e.g., `a < f() < b` evaluates `f()` only once). Closes #892
|
@keyboardDrummer-bot Try to get rid of the |
- Use proven [i] for comparators.val (size known via hComp) - Use getD for mutable arrays (compExprs, operandRefs, pairs) where Lean cannot track sizes through for-loop mutations - Merge the n>1 / n==1 branches into a single loop - All 155 tests pass, zero regressions
|
@olivier-aws Pushed a commit to remove all
|
| test_chained_compare.py(3, 4): ✅ pass - assert_assert(47)_calls_Any_to_bool_1 | ||
| test_chained_compare.py(3, 4): ✅ pass - chained compare | ||
| DETAIL: 5 passed, 0 failed, 0 inconclusive | ||
| RESULT: Analysis success |
There was a problem hiding this comment.
The test covers the basic 1 < x < 10 case (2-operator chain with simple operands). This exercises the PAnd chaining but not the temp variable path.
Consider adding tests for:
- Side-effecting intermediate operand — e.g.,
a < f() < bto verify the temp variable is generated and the function is called once - Triple chain — e.g.,
1 < x < y < 10to verify longer chains work - Mixed operators — e.g.,
a <= b < cto verify different operators in the same chain
The first case is the most important since it's the core motivation for the evaluate-once semantics. Without it, the temp variable codepath is untested.
There was a problem hiding this comment.
Added tests for cases 2 and 3:
test_chained_compare_triple.py—1 < x < y < 10(3-operator chain, 7 VCs pass)test_chained_compare_mixed.py—a <= b < c(mixed operators, 5 VCs pass)
For case 1 (side-effecting intermediate operand): I tried a < f() < b and a < x + 0 < b, but both hit the downstream "local variables in functions are not YET supported" limitation in the Laurel-to-Core translator — the LocalVariable temp binding that the desugaring emits isn't supported there yet. So the temp variable codepath can't be end-to-end tested until that limitation is lifted. I think that's acceptable for now since the desugaring logic itself is straightforward, but happy to add a pending test for it if you'd prefer.
|
Addressed @tautschnig's review in 37e737c:
Build passes, all 157 tests green (including the 2 new ones). |
|
Thanks for the review and approval @tautschnig! All feedback has been addressed. Ready to merge whenever @olivier-aws is happy with it. |
robin-aws
left a comment
There was a problem hiding this comment.
A few thoughts for later (especially since this is my first Strata review) but I'm just fine with this being merged as is.
| /-- Runtime assertion that a decidable proposition holds; throws an internal | ||
| error when it does not. Used to obtain array-size proofs that Lean cannot | ||
| infer through mutable `for`-loop state. -/ | ||
| private def guardProp {p : Prop} [Decidable p] (msg : String) |
There was a problem hiding this comment.
(Non blocking) This seems generically useful enough to move to Util.lean later instead
| have : i < compExprs.size := by omega | ||
| let comp := compExprs[i] | ||
| if n > 1 && i < n - 1 && !isSimple (comparators.val[i]) then | ||
| let freshVar := s!"cmp_tmp_{e.toAst.ann.start.byteIdx}_{i}" |
There was a problem hiding this comment.
For later, it would be incredibly valuable to actually prove that this pattern of fresh variable creation is actually sound and can't conflict with existing variables (shadowing may be fine)
There was a problem hiding this comment.
Two points where I think we can easily improve the testing coverage here. Not blocking but following up afterwards (and improving this more broadly) would be great.
- There's no actual test of the critical point of evaluating each operand only once. That wouldn't be complicated to do at the Python level by including
x < f() < yand definingf()to return different values if invoked multiple times. If that turns out to be impossible because the operands HAVE to be pure, then we should find another way to assertfis only evaluated once. Even if we only want to evaluating operands only once for performance reasons, we still have to test for that somehow. - In general, it looks like all of our test cases are only including
assertstatements that are expected to pass. IME this doesn't guard against several ways to accidentally break analysis such that it always succeeds. I would argue every test case should have at least one assert expected to fail as well, or else we have a pretty big blind spot in testing. (This is especially on my mind because with the test cases we have so far, I believe I could write a core interpreter that pass all of them by doing absolutely no actual execution :)
PR requested and agreed on by @olivier-aws
Summary
Implements Option B from the design discussion in #892: desugar Python chained comparisons with temporary variables for intermediate operands to preserve evaluate-once semantics.
What changed
Strata/Languages/Python/PythonToLaurel.lean— Rewrote theComparearm oftranslateExprto handle chains of any length:Compare(left, [op₁, ..., opₙ], [c₁, ..., cₙ])intoPAnd(left op₁ c₁, PAnd(c₁ op₂ c₂, ...)).LocalVariabletemp binding so the operand is evaluated exactly once. For example,a < f() < bbecomes:Blockwrapper, so there is no overhead for existing tests.Test changes:
test_chained_compare.pyfromtests/pending/totests/(now active).expected_laurel/test_chained_compare.expectedwith the verification output (all 5 checks pass).Verification
lake buildsucceeds.test_chained_comparetest (assert 1 < x < 10) passes with all verification conditions satisfied.Closes #892