Commit db3931e

Add D10: Universal Set dj.U semantics
- dj.U attributes are homologous to any namesake (bypass lineage check)
- Valid: dj.U('a','b') & A (PK promotion), dj.U('a','b').aggr(A, ...)
- Invalid: dj.U('a','b') - A (infinite set)
- Deprecated: dj.U('a','b') * A (use & instead for PK manipulation)
1 parent eeb31f0 commit db3931e

1 file changed (+53, -0 lines changed)

docs/SPEC-semantic-matching.md

Lines changed: 53 additions & 0 deletions
@@ -110,6 +110,42 @@ Note: `A - B` is the negated form of restriction (equivalent to `A & ~B`), not a

**Note**: A warning may be raised for joins on unindexed attributes (performance consideration).

### Universal Set `dj.U`

`dj.U(attr1, ..., attrn)` represents the universal set of all possible values for the specified attributes. It has special semantics:

**Homology**: Attributes of `dj.U` are considered **homologous to any namesake attribute**. This is a special case where lineage matching is bypassed.

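The homology rule can be pictured with a small stand-alone sketch. This is not DataJoint code: the `Attribute` class, the `universal` flag, and the lineage strings are hypothetical, and the sketch assumes that ordinary attributes match only when both name and lineage agree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """Toy attribute: a name plus the lineage it was declared with."""
    name: str
    lineage: Optional[str]    # hypothetical format, e.g. "lab.subject.subject_id"
    universal: bool = False   # True for attributes of a dj.U-like operand


def homologous(x: Attribute, y: Attribute) -> bool:
    """Return True when x and y may be matched in a binary operation."""
    if x.name != y.name:
        return False          # only namesakes are ever candidates
    if x.universal or y.universal:
        return True           # universal-set attributes bypass the lineage check
    return x.lineage == y.lineage  # assumed rule for ordinary attributes
```

Under these assumptions, two `subject_id` attributes with different lineages do not match each other, but a universal `subject_id` matches either of them.
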
**Valid operations**:

| Expression | Meaning | Result PK |
|------------|---------|-----------|
| `dj.U('a', 'b') & A` | Promote a, b to PK | {a, b} |
| `dj.U('a', 'b').aggr(A, ...)` | Aggregate A grouped by a, b | {a, b} |

**Constraint**: The attributes (a, b) must exist in the operand (A). Their lineage is transferred to the result.

**Invalid operations**:

| Expression | Reason |
|------------|--------|
| `dj.U('a', 'b') - A` | Would produce an infinite set (all values NOT in A) |
| `dj.U('a', 'b') * A` | Deprecated; use `dj.U('a', 'b') & A` instead |

**Deprecation**: Join on `dj.U` (using `*`) is deprecated. It was previously used only for PK manipulation, which is now done via restriction (`&`).

**Example**:
138+
```python
139+
# Group sessions by subject, counting sessions per subject
140+
dj.U('subject_id').aggr(Session, n="count(*)")
141+
# Result: (subject_id) -> n
142+
# subject_id lineage comes from Session.subject_id
143+
144+
# Promote subject_id to PK (removing session_id from PK)
145+
dj.U('subject_id') & Session
146+
# Result PK: {subject_id}
147+
```
148+
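To make the two results concrete without a database, the following sketch mimics them on plain Python rows. It assumes that `dj.U(...) & A` yields the distinct combinations of the listed attributes found in A and that `aggr` with `count(*)` counts rows per combination. The `sessions` data and the helper names `promote` and `count_by` are made up for illustration.

```python
from collections import Counter

# Hypothetical Session rows; the primary key is (subject_id, session_id).
sessions = [
    {"subject_id": 1, "session_id": 1},
    {"subject_id": 1, "session_id": 2},
    {"subject_id": 2, "session_id": 1},
]


def promote(rows, *attrs):
    """Model dj.U(*attrs) & rows: distinct combinations of attrs in rows."""
    keys = dict.fromkeys(tuple(r[a] for a in attrs) for r in rows)  # ordered de-dup
    return [dict(zip(attrs, key)) for key in keys]


def count_by(rows, *attrs):
    """Model dj.U(*attrs).aggr(rows, n="count(*)"): row count per combination."""
    counts = Counter(tuple(r[a] for a in attrs) for r in rows)
    return [dict(zip(attrs, key), n=n) for key, n in counts.items()]


print(promote(sessions, "subject_id"))
# [{'subject_id': 1}, {'subject_id': 2}]
print(count_by(sessions, "subject_id"))
# [{'subject_id': 1, 'n': 2}, {'subject_id': 2, 'n': 1}]
```

In both results `subject_id` alone identifies a row, which is what a result PK of `{subject_id}` means.
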
### Primary Key Formation in Joins

The primary key of `A * B` is determined by functional dependency analysis, not simple union.
@@ -710,6 +746,22 @@ WHERE c.contype = 'f'

**Keep all rows**: The same constraint applies for `A.aggr(B, ..., keep_all_rows=True)`. A tuples with no matching B tuples appear with NULL aggregates, but the grouping constraint remains.

### D10: Universal Set `dj.U` Semantics

**Decision**: `dj.U` attributes are homologous to any namesake. Deprecate join (`*`) on `dj.U`.

**Homology rule**: Attributes of `dj.U` bypass lineage checking: they match any namesake attribute.

**Valid operations**:
- `dj.U('a', 'b') & A`: promotes a, b to PK; lineage transferred from A
- `dj.U('a', 'b').aggr(A, ...)`: aggregates A grouped by a, b

**Invalid operations**:
- `dj.U('a', 'b') - A`: produces an infinite set (error)
- `dj.U('a', 'b') * A`: deprecated, use `&` instead

**Rationale**: Join on `dj.U` was only used for PK manipulation. Restriction (`&`) is clearer for this purpose.

## Testing Strategy

1. **Unit tests** for lineage propagation through all query operations
@@ -756,6 +808,7 @@ Semantic matching is a significant change to DataJoint's join semantics that imp
| **D7**: Migration | Utility function + automatic fallback computation |
| **D8**: PK formation | Functional dependency analysis; left operand wins ties; non-commutative |
| **D9**: Aggregation | B must contain A's entire PK; result PK = PK(A); applies to `keep_all_rows=True` too |
| **D10**: `dj.U` semantics | Homologous to any namesake; deprecate `*`, use `&` for PK promotion |

### Compatibility
