Draft
Commits
39 commits
c854b5d
docs: add arithmetic-convention goals
FBumann May 21, 2026
1e336a4
docs: add convention.md placeholder
FBumann May 21, 2026
93193d7
docs: write the v1 arithmetic convention spec
FBumann May 21, 2026
5e40c56
docs: drop the narrow "arithmetic" framing from the spec
FBumann May 21, 2026
edb8edd
feat: add semantics option and convention test harness
FBumann May 22, 2026
1a3f2be
feat: v1 §5 and §8 on expression-OP-constant path
FBumann May 23, 2026
9b52ebd
feat: v1 §8 on the expr+expr / var+var merge path
FBumann May 23, 2026
e1a2390
feat: v1 §6 absence propagation through every operator
FBumann May 23, 2026
15b87e9
feat: Variable.reindex / .reindex_like (§4 absence creation)
FBumann May 23, 2026
4d87a05
fix: enforce v1 dead-term invariant in merge
FBumann May 23, 2026
602dbdd
feat: v1 §10 named-method join + §12 constraint RHS
FBumann May 23, 2026
317e57f
feat: make piecewise and SOS2 reformulation v1-aware
FBumann May 23, 2026
786d126
test: pin v1 §13 reductions skip absent
FBumann May 23, 2026
0ebccc1
feat: v1 §11 raises on auxiliary-coordinate conflicts
FBumann May 23, 2026
c7e9099
test: pin aux-coord propagation guarantees (§11)
FBumann May 23, 2026
e2cb24f
test: pin v1 dead-term invariant, == constraints, §11 ops, end-to-end…
FBumann May 23, 2026
e8d7b6b
refactor: extract v1 semantics helpers into linopy/semantics.py
FBumann May 23, 2026
6093f66
test: parameterize the three operator-uniform v1 test groups
FBumann May 23, 2026
f0bced4
refactor: split _add_constant / _apply_constant_op for clean 1.0 removal
FBumann May 23, 2026
4a6e8e4
fix(ci): defer linopy import in conftest + add missing type annotations
FBumann May 23, 2026
4dc67bf
feat: self-describing v1 error messages
FBumann May 23, 2026
56ad5bf
test: round out v1 coverage gaps + fix Variable.unstack absence sentinel
FBumann May 23, 2026
e21b144
test: pin upstream catches for objective and constraint-LHS NaN
FBumann May 23, 2026
9455683
feat: site-specific, actionable legacy warnings (goal #2)
FBumann May 24, 2026
7a44e62
feat: full-text warning assertions + stdlib stacklevel + docs-plan
FBumann May 24, 2026
72ddd99
fix: §11 aux-coord check fires on every join, not just join=None
FBumann May 24, 2026
a38a50d
fix: §6 absence propagates through quadratic factor product
FBumann May 24, 2026
71e0fd2
test: parametrize §6 quadratic propagation across entry points
FBumann May 24, 2026
df34c72
fix: §5 user-NaN check on Variable.to_linexpr(coefficient)
FBumann May 24, 2026
1c77a74
fix: §10 join='override' rejects shared-dim size mismatch
FBumann May 24, 2026
a4c7f06
fix: split legacy RHS warnings — coord-mismatch vs user-NaN
FBumann May 24, 2026
578c6f4
fix: structural §8 pre-check, drop brittle xarray exception parse
FBumann May 24, 2026
7498cc3
docs: lock §5 — user NaN raises, close #627 alternative
FBumann May 24, 2026
1517a11
perf: hoist semantics imports out of Variable.to_linexpr hot path
FBumann May 24, 2026
79b89b1
refactor: thread op_kind explicitly through _apply_constant_op
FBumann May 24, 2026
99c87cf
fix: §11 aux-coord — document asymmetric presence, split shape vs val…
FBumann May 24, 2026
32edfaf
fix(types): sort with key=str so override gate is Hashable-safe
FBumann May 24, 2026
d32be12
docs: trim docs-plan to an early-stage outline
FBumann May 24, 2026
b6d38bf
refactor: dedupe v1-semantics helpers
FBumann May 24, 2026
197 changes: 197 additions & 0 deletions arithmetics-design/convention.md
@@ -0,0 +1,197 @@
# The v1 convention

The strict ("v1") convention for linopy. Goals and rollout plan:
[`goals.md`](goals.md). The bugs it fixes are catalogued in [#714].

Thirteen sections in three groups: absence (§1–§7), coordinate alignment
(§8–§11), then constraints and reductions (§12–§13).

## Absence

Absence — a labelled slot the model does not cover — is the richer half of the
convention. The sections below say what it is (§1–§3), how it arises (§4–§5),
and how it flows through arithmetic and is resolved (§6–§7).

### §1. Absence is a first-class state

A *slot* — one labelled position — is either present or *absent*. An absent
slot is one the model does not cover. Absence is a state in its own right,
never a stand-in for a number: an absent variable is not a variable fixed to
zero ([#712]).

### §2. Encoding absence

The *marker* is how an absent slot is stored: `NaN` in floating-point fields
(`coeffs`, `const`, numeric constants), and `-1` in integer label fields (a
variable's `labels`, an expression's `vars`, which cannot hold a NaN). The two
encodings are one concept — an absent slot, whatever the dtype.

Within a single slot, the markers move together: `const.isnull()` at a slot
implies *every* term at that slot has `coeffs = NaN` and `vars = -1`. Operators
that introduce absence at a slot also absorb any live terms there, so the
storage never carries a half-absent row. A term at a *present* slot may still
carry `vars = -1` after `fillna(value)` revives the slot — that's a *dead
term*, inert at the solver layer, and only meaningful as storage book-keeping.

### §3. Testing absence

`isnull()` is the one predicate for absence. It reads the marker — `NaN` or
`-1`, whichever the field uses — and reports absence slot by slot. Every rule
that speaks of an "absent slot" means exactly what `isnull()` reports; the
caller never inspects the raw marker.
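
As a rough illustration of §2 and §3 together, the sketch below models the two markers with plain Python values. The names `ABSENT_LABEL` and `is_absent` are hypothetical stand-ins, not linopy's API; the point is only that one predicate reads whichever marker the field's dtype uses.

```python
import math

# Hypothetical stand-in for the integer marker; not linopy's real data model.
ABSENT_LABEL = -1

def is_absent(value):
    """Report absence for either encoding: NaN in floats, -1 in integer labels."""
    if isinstance(value, float):
        return math.isnan(value)
    return value == ABSENT_LABEL

coeffs = [2.0, math.nan, 1.5]      # floating-point field: NaN marks absence
labels = [7, ABSENT_LABEL, 9]      # integer label field: -1 marks absence

coeff_mask = [is_absent(c) for c in coeffs]
label_mask = [is_absent(v) for v in labels]
```

Both masks flag the same middle slot, which is the "markers move together" invariant in miniature.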

### §4. Creating absence

Absence enters a model only through named operations: `mask=` at construction
marks slots absent up front; `.where(cond)` masks slots in place, keeping
shape; `.reindex()`, `.reindex_like()`, `.shift()`, and `.unstack()`
restructure a coordinate and leave the new positions absent. Operations that
merely move or select existing data — `.roll()`, `.sel()`, `.isel()` — never
introduce it.
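
The distinction between operations that create absence and operations that only move data can be sketched with label-keyed dictionaries. The helpers `where`, `reindex`, and `roll` below are toy re-implementations written for this illustration (NaN stands in for the absence marker); they are not linopy's methods.

```python
import math

def where(values, cond):
    """Mask slots in place: keep the shape, mark slots absent where cond is False."""
    return {k: (v if cond[k] else math.nan) for k, v in values.items()}

def reindex(values, new_labels):
    """Conform to new labels; positions with no source data become absent."""
    return {k: values.get(k, math.nan) for k in new_labels}

def roll(values, labels, periods=1):
    """Rotate data along the ordered labels; nothing falls off, so nothing is absent."""
    n = len(labels)
    return {labels[i]: values[labels[(i - periods) % n]] for i in range(n)}

data = {"a": 1.0, "b": 2.0, "c": 3.0}

masked = where(data, {"a": False, "b": True, "c": True})
extended = reindex(data, ["b", "c", "d"])
rolled = roll(data, ["a", "b", "c"], periods=1)
```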

### §5. User-supplied NaN raises

A NaN in a user-supplied constant raises `ValueError`. linopy trusts NaN only
from its own structural operations (§4), which genuinely mark absence. A NaN in
user data is ambiguous — a deliberate "absent", or a data error — so linopy
refuses to guess and asks the caller to resolve it with `fillna()`. This
replaces today's silent per-operator fills, which guessed a different value for
every operator ([#713]). To mark slots absent, use the mechanisms of §4 — a
bare NaN in a constant is not one of them.

The alternative, reading user NaN as "absent" instead of raising, was
discussed in [#627] and closed: overloading a numeric value this way
defeats goal #1, since a data-error NaN would be silently re-labelled as
intentional absence.
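
A minimal sketch of the rule, assuming a constant is modelled as a label-keyed dictionary. `check_user_constant` and this `fillna` are hypothetical helpers for illustration, and the error text is invented, not linopy's actual message.

```python
import math

def check_user_constant(values):
    """Raise ValueError if a user-supplied constant holds NaN (toy version of §5)."""
    bad = [label for label, v in values.items() if math.isnan(v)]
    if bad:
        raise ValueError(f"NaN in user-supplied constant at {bad}; resolve it with fillna()")
    return values

def fillna(values, fill):
    """Replace NaN with an explicit value chosen by the caller."""
    return {label: (fill if math.isnan(v) else v) for label, v in values.items()}

demand = {"t0": 4.0, "t1": math.nan}

try:
    check_user_constant(demand)
    raised = False
except ValueError:
    raised = True

# The caller states the intent explicitly, and the constant is accepted.
cleaned = check_user_constant(fillna(demand, 0.0))
```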

### §6. Absence propagates through every operator

Every operator carries absence through unchanged: a slot absent in any operand
is absent in the result. `shifted * 3` is absent; `shifted + 5` is absent;
`x + shifted` is absent wherever `shifted` is — even though `x` itself is fine
there.

linopy never fills an absent slot on the user's behalf, because the right fill
depends on intent it cannot see: 0 for a sum, 1 for a product, or "leave this
out" entirely. Because every operator propagates the same way, the algebraic
laws of §10 carry over to absent slots untouched — absence absorbs, so every
grouping of an expression agrees. And `shifted * 3` staying absent, rather than
collapsing to `0`, is what preserves the absent-vs-zero distinction of §1.
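
The absorbing behaviour can be sketched on plain numbers. The helper `combine` below is a hypothetical toy that treats NaN as the absence marker and assumes both operands already share the same labels (§8); it is meant only to show that both groupings of a sum agree once absence absorbs.

```python
import math
import operator

def combine(op, left, right):
    """Apply op slot by slot; a slot absent (NaN) in either operand stays absent."""
    return {
        k: math.nan if math.isnan(left[k]) or math.isnan(right[k]) else op(left[k], right[k])
        for k in left
    }

x = {"t0": 1.0, "t1": 2.0}
shifted = {"t0": math.nan, "t1": 10.0}  # t0 is absent, e.g. after a shift
y = {"t0": 5.0, "t1": 6.0}

left_grouping = combine(operator.add, combine(operator.add, x, shifted), y)
right_grouping = combine(operator.add, x, combine(operator.add, shifted, y))
scaled = combine(operator.mul, shifted, {"t0": 3.0, "t1": 3.0})
```

At `t0` every result is absent regardless of grouping, and `scaled` stays absent there rather than becoming `0`.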

### §7. Resolving absence

Because §6 never fills, turning an absent slot into a value is the caller's
explicit act, never linopy's. Calling `.fillna(value)` on an expression fills its
absent slots; calling `.fillna(...)` on a constant fills it before it enters the
arithmetic; `fill_value=` on a named method fills as part of the call. Filling at the call
site documents the intent: `x + y.shift(time=1).fillna(0)` says "treat the
missing earlier step as zero" exactly where it matters.
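
The `x + y.shift(time=1).fillna(0)` pattern can be traced on plain numbers. The `shift` and `fillna` functions below are toy stand-ins written for this sketch, with NaN as the absence marker; they mimic the shape of the calls but are not linopy's implementation.

```python
import math

def shift(values, labels, periods=1):
    """Shift values along ordered labels; vacated positions become absent (NaN)."""
    shifted = {}
    for i, label in enumerate(labels):
        src = i - periods
        shifted[label] = values[labels[src]] if 0 <= src < len(labels) else math.nan
    return shifted

def fillna(values, fill):
    """Resolve absence explicitly with a caller-chosen value."""
    return {k: (fill if math.isnan(v) else v) for k, v in values.items()}

time = ["t0", "t1", "t2"]
x = {"t0": 1.0, "t1": 1.0, "t2": 1.0}
y = {"t0": 3.0, "t1": 4.0, "t2": 5.0}

# "Treat the missing earlier step as zero", stated at the call site.
y_prev = fillna(shift(y, time, periods=1), 0.0)
total = {t: x[t] + y_prev[t] for t in time}
```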

## Coordinate alignment

linopy's operands are xarray objects, so the convention starts from xarray's
alignment model (goal 4): coordinates align by *label*, never by position;
non-shared dimensions broadcast; a mismatch on a shared dimension is resolved
by an explicit *join*.

**Open question:** how should v1 align *unlabeled* data? A raw numpy array
carries no labels to match on.

### §8. Shared dimensions must match exactly

If two operands share a dimension, their coordinate labels must be identical,
or the operator raises `ValueError`.

This is xarray's model with `arithmetic_join="exact"` — deliberately stricter
than xarray's own default (`inner`). An inner join silently drops the
non-overlapping labels, and in an optimization model a dropped coordinate is a
dropped term or constraint: a silent wrong answer. An exact match surfaces the
mismatch where it happens. (The [pyoframe] library uses the same model.)

Because the rule is identical for every operator, the operator-alignment split
([#708]) — `*` aligning by label while `+`, `-`, `/` go by position —
disappears.
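
To make the "dropped term" risk concrete, the sketch below contrasts an inner join with an exact match on label-keyed dictionaries. `add_inner` and `add_exact` are hypothetical toys, the data is invented, and label order is deliberately not modelled.

```python
def add_inner(left, right):
    """Inner join: keep only labels present on both sides, silently dropping the rest."""
    return {k: left[k] + right[k] for k in left if k in right}

def add_exact(left, right):
    """Exact match: refuse to combine operands whose labels differ."""
    if set(left) != set(right):
        differing = sorted(set(left) ^ set(right))
        raise ValueError(f"coordinate labels differ on shared dimension: {differing}")
    return {k: left[k] + right[k] for k in left}

fuel_cost = {"gas": 1.0, "coal": 2.0, "wind": 0.0}
carbon_cost = {"gas": 0.5, "coal": 1.5}  # no entry for "wind"

silently_short = add_inner(fuel_cost, carbon_cost)

try:
    add_exact(fuel_cost, carbon_cost)
    raised = False
except ValueError:
    raised = True
```

The inner-join result looks plausible but has lost `"wind"` without any signal; the exact version surfaces the mismatch at the operator.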

### §9. Non-shared dimensions broadcast freely

A dimension present in only one operand broadcasts over the other, with no
restriction — for both expressions and constants. Only *shared* dimensions are
subject to §8.
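
A small sketch of broadcasting over non-shared dimensions, using dictionaries keyed by label. The dimension names `time` and `region` and the data are assumptions made for the example.

```python
from itertools import product

x = {"t0": 1.0, "t1": 2.0}               # lives on dimension "time" only
price = {"north": 10.0, "south": 20.0}   # lives on dimension "region" only

# No dimension is shared, so §8 has nothing to check: the result spans both.
result = {(t, r): x[t] * price[r] for t, r in product(x, price)}
```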

### §10. Mismatches resolve via an explicit join

When coordinates genuinely differ, §8 raises — and the caller says how to
resolve it. Several primitives bring operands into agreement:

- `.sel()` / `.isel()` cut operands down to a shared subset — often the
clearest fix.
- The named methods (`.add`, `.sub`, `.mul`, `.div`, `.le`, `.ge`, `.eq`) take a
`join=` argument: `exact`, `inner`, `outer`, `left`, `right`, or `override`.
`override` is the old positional behavior — still available, but now opt-in
and named rather than triggered by a size coincidence.
- `.reindex()` / `.reindex_like()` conform an operand to a target index
(extending past the original creates absent positions — §4).
- `.assign_coords()` relabels an operand outright (positional alignment, made
explicit).
- `linopy.align()` pre-aligns several operands at once.

Because no operator silently drops coordinates, the associativity break
([#711]) cannot occur: the operation that used to drop coordinates now raises.
Every standard algebraic law — commutativity, associativity, distributivity,
the identities — holds for same-coordinate operands.
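
The label-based join modes listed above can be sketched as a function that decides which labels survive. `join_labels` is a hypothetical helper for illustration; it leaves out `override` (which ignores labels and goes by position) and keeps labels in order of appearance, which may differ from how a real index union is ordered.

```python
def join_labels(left, right, how="exact"):
    """Return the labels of the result for a given join mode (toy model)."""
    if how == "exact":
        if list(left) != list(right):
            raise ValueError("labels differ; choose an explicit join")
        return list(left)
    if how == "inner":
        return [k for k in left if k in right]
    if how == "outer":
        return list(left) + [k for k in right if k not in left]
    if how == "left":
        return list(left)
    if how == "right":
        return list(right)
    raise ValueError(f"unknown join mode: {how!r}")

a = ["t0", "t1", "t2"]
b = ["t1", "t2", "t3"]

inner = join_labels(a, b, how="inner")
outer = join_labels(a, b, how="outer")
left = join_labels(a, b, how="left")
```

Under `outer` and `left`, labels missing from one operand would be absent slots in that operand (§4), so the result follows §6 there unless a `fill_value=` is given.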

### §11. Auxiliary-coordinate conflicts raise

Auxiliary (non-dimension) coordinates are user-attached metadata: a coord
defined on some dimension but not itself a dimension, like a `B(A)` group
label on dimension `A`. linopy *validates* them (the conflict-raise rule
below) and *propagates* them through arithmetic unchanged, but never
*computes* with them — they describe the data, they don't enter the math.

When two operands carry an aux coord with the same name and values agree,
the coord propagates to the result. When only one operand carries the
coord, it propagates from that operand unchanged — asymmetric presence is
not a conflict. When the values *do* disagree (same name on both sides,
different values), the operator raises, whereas plain `xarray` silently drops the
conflicting coordinate, which is the [#295] bug. The caller resolves it explicitly with
`.drop_vars(name)` (remove the coord) or `.assign_coords(name=...)`
(relabel one side).
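
The three cases (agreement, one-sided presence, conflict) can be sketched as a small merge function. `merge_aux_coord` is a hypothetical helper, an aux coord is modelled as a label-to-value dictionary, and `None` means the operand does not carry the coord.

```python
def merge_aux_coord(name, left, right):
    """Decide what happens to one auxiliary coordinate when two operands combine."""
    if left is None:
        return right            # only the right operand carries it: propagate
    if right is None:
        return left             # only the left operand carries it: propagate
    if left != right:
        raise ValueError(
            f"auxiliary coordinate {name!r} conflicts; drop it or relabel one side"
        )
    return left                 # both carry it and the values agree

group_a = {"a0": "north", "a1": "south"}
group_b = {"a0": "north", "a1": "west"}

agreed = merge_aux_coord("B", group_a, dict(group_a))
one_sided = merge_aux_coord("B", group_a, None)

try:
    merge_aux_coord("B", group_a, group_b)
    conflict_raised = False
except ValueError:
    conflict_raised = True
```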

## Constraints and reductions

Two kinds of operation build on the rules above without being binary operators:
the comparisons that form constraints, and the reductions that collapse a
dimension.

### §12. Constraints follow the same rules

A constraint is built by comparing two sides with `<=`, `>=`, or `==` — and a
comparison is an operator like any other. It aligns its sides by §8 and carries
absence by §6, exactly as `+`, `-`, `*`, and `/` do. So algebraically equal
forms build the same constraint: `x - a <= 0` and `x <= a` agree, where today
they do not ([#707]).

Each slot becomes one constraint row. An absent slot yields no row — absence
propagated into a comparison drops the constraint there, the same outcome as
masking it.
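
As a toy model of "one row per present slot", the sketch below represents each side of a `<=` comparison by a number per slot, with NaN as the absence marker. `build_rows` is a hypothetical helper and assumes both sides already share the same labels (§8).

```python
import math

def build_rows(lhs, rhs):
    """Emit one (label, lhs, '<=', rhs) row per slot present on both sides."""
    rows = []
    for label in lhs:
        if math.isnan(lhs[label]) or math.isnan(rhs[label]):
            continue  # absent slot: no constraint row is created
        rows.append((label, lhs[label], "<=", rhs[label]))
    return rows

lhs = {"t0": math.nan, "t1": 2.0}  # t0 is absent on the left-hand side
rhs = {"t0": 5.0, "t1": 8.0}

rows = build_rows(lhs, rhs)
```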

### §13. Reductions skip absent slots

Reductions — `sum`, `mean`, and the `groupby` / `resample` / `coarsen`
aggregations — collapse a dimension rather than combining two operands, so the
propagation of §6 does not apply: they *skip* absent slots instead. `sum` adds
the present terms, and the sum of none is the zero expression. `mean` divides
by the count of *present* slots, not all of them — dividing by all would treat
an absent slot as a zero term, which §1 forbids. The objective totals its
terms the way `sum` does.
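
The difference between skipping an absent slot and treating it as zero shows up most clearly in `mean`. The helpers below are a toy sketch on plain numbers with NaN as the absence marker; returning NaN for the mean of no present slots is an assumption of this sketch, since the text above does not specify that case.

```python
import math

def skip_sum(values):
    """Sum the present values; the sum of none is zero."""
    return sum(v for v in values if not math.isnan(v))

def skip_mean(values):
    """Divide by the count of present slots, not by the total number of slots."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return math.nan  # assumption: an all-absent mean stays absent
    return sum(present) / len(present)

slots = [2.0, math.nan, 4.0]

total = skip_sum(slots)
average = skip_mean(slots)
zero_filled_average = total / len(slots)  # what treating absence as zero would give
```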

<!-- references -->
[pyoframe]: https://github.com/Bravos-Power/pyoframe
[#714]: https://github.com/PyPSA/linopy/issues/714
[#713]: https://github.com/PyPSA/linopy/issues/713
[#712]: https://github.com/PyPSA/linopy/issues/712
[#711]: https://github.com/PyPSA/linopy/issues/711
[#708]: https://github.com/PyPSA/linopy/issues/708
[#707]: https://github.com/PyPSA/linopy/issues/707
[#627]: https://github.com/PyPSA/linopy/issues/627
[#295]: https://github.com/PyPSA/linopy/issues/295
32 changes: 32 additions & 0 deletions arithmetics-design/docs-plan.md
@@ -0,0 +1,32 @@
# Docs plan — user-facing migration guide

Early-stage outline for the v1 migration docs. This is not the guide itself, only
the three pieces it needs to cover; the guide is to be written when someone picks it up.

## Three audiences, one migration

- **Downstream library maintainers** (PyPSA, pypsa-eur, calliope, …) —
carry the bulk of the migration work: opt their codebases into v1,
fix the raises, ship a release that no longer warns under legacy.
- **Direct users of linopy** — write linopy code themselves and need
to know what changes for their own call sites.
- **End users of downstream libraries** — never touch linopy directly,
but may see a `LinopySemanticsWarning` in CI logs and need a
pointer to "this is upstream; your maintainer will handle it".

## Three things to cover

1. **Why v1 exists.** One paragraph: legacy silently mishandled NaN,
coord mismatches, and absent variables. The bug catalogue in #714
has the case-by-case detail.

2. **What's changing and when.** The rollout timeline:
- v1 ships opt-in via `linopy.options['semantics'] = 'v1'`.
- v1 becomes the default in a later minor release (date TBD).
- Legacy removed at 1.0.

3. **How to migrate.** What downstream maintainers do to flip their
codebase: opt in on a branch, run tests, fix the raises. The
legacy warning text already names the rule and the fix per site,
so the guide is mostly the high-level recipe plus a pointer to
the spec (`arithmetics-design/convention.md`) for the rule list.
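
One generic way a maintainer might locate warning sites before opting in is to escalate the warning category to an error in the test run, using only the standard `warnings` module. The sketch below uses a locally defined stand-in class, because the import path of `LinopySemanticsWarning` is not stated here; `legacy_call` is likewise a hypothetical placeholder for a call site that warns under legacy.

```python
import warnings

class StandInSemanticsWarning(UserWarning):
    """Hypothetical stand-in for the library's semantics warning category."""

def legacy_call():
    """Placeholder for a call site that warns under the legacy convention."""
    warnings.warn("behaviour differs under v1", StandInSemanticsWarning, stacklevel=2)
    return "ok"

with warnings.catch_warnings():
    # Turn the category into an exception so each warning site fails loudly.
    warnings.simplefilter("error", StandInSemanticsWarning)
    try:
        legacy_call()
        escalated = False
    except StandInSemanticsWarning:
        escalated = True
```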
47 changes: 47 additions & 0 deletions arithmetics-design/goals.md
@@ -0,0 +1,47 @@
# The v1 convention — design & transitioning goals

Goals for linopy's strict ("v1") convention. The bugs that motivate
it are catalogued in [#714]; the convention itself is in
[`convention.md`](convention.md).

## Design goals

The convention serves four goals, in priority order:

1. **No silent wrong answers.** Every bug in the catalogue ([#714]) returns a
plausible result with no error. The overriding goal: a mismatch linopy
cannot resolve unambiguously must raise rather than be guessed at. Where the library
cannot decide, the caller does — with an explicit join, `.sel()`, or
`fill_value=`.
2. **Preserve the algebraic laws.** Commutativity, associativity,
distributivity, the identities. Optimization code builds expressions by
rearranging terms, and the convention must keep that safe.
3. **Absence is first-class.** A variable can be genuinely absent at a slot —
masked out, or shifted past the edge. The data model needs an explicit
marker for that absence, kept distinct from a zero term, so absent-vs-zero
is never a silent guess.
4. **Least surprise.** linopy is built on xarray and its users know xarray. The
convention should behave the way xarray already taught them (align by
label, broadcast non-shared dimensions, resolve mismatches with a named
join) rather than invent linopy-specific rules. Auxiliary coordinates the user
attached are the user's: linopy validates them and carries them through,
and never silently drops or rewrites them.

## Transitioning goals

1. **Non-breaking.** Existing code keeps working — legacy stays available and
unchanged until it is removed at linopy 1.0.
2. **Actionable warnings.** Warn every legacy user about behaviour changes —
what changes under v1, and how to fix it — aiming for 100% coverage.
3. **No silent change.** Opting into v1 never silently changes a model — every
difference is either raised, or was warned about in legacy mode.

**Schedule:**

1. Introduce v1 as opt-in — warn about behaviour changes on legacy, raise if
opted into v1.
2. Make v1 the default, allow opt-out.
3. linopy 1.0 — drop the legacy convention entirely.

<!-- references -->
[#714]: https://github.com/PyPSA/linopy/issues/714