Commit 9a5442d

Clarify terminology: homologous, namesake, join rules
Definitions:
- Homologous: same lineage (regardless of name)
- Namesake: same name (regardless of lineage)
- Homologous namesake: same name AND lineage → join attribute
- Non-homologous namesake: same name, different lineage → error

Join rules:
- `*` operator always enforces semantic matching
- `.join()` method provides kwargs for control, defaults to semantic match
- `semantic_check=False` bypasses error (equivalent to `@`)
1 parent 5a177bb commit 9a5442d

1 file changed: +18 -17 lines changed

docs/SPEC-semantic-matching.md

Lines changed: 18 additions & 17 deletions
@@ -37,13 +37,12 @@ With semantic matching: The `name` attributes have different lineages (one origi
 
 ## Key Concepts
 
-### Homologous Attributes
+### Terminology
 
-Two attributes are **homologous** if they:
-1. Have the same name
-2. Trace back to the same original attribute definition through foreign key chains
-
-Homologous attributes are also called **semantically matched** attributes.
+- **Homologous attributes**: attributes with the same lineage (whether or not they have the same name)
+- **Namesake attributes**: attributes with the same name (whether or not they have the same lineage)
+- **Homologous namesakes**: attributes with the same name AND the same lineage — used for join matching
+- **Non-homologous namesakes**: attributes with the same name BUT different lineage — cause join errors
 
 ### Attribute Lineage

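The four terms introduced in this hunk can be illustrated with a small, self-contained sketch. It is not code from the repository: the `Attribute` class, the `classify` helper, and the `"schema.table.attribute"` lineage strings are hypothetical, and treating two attributes without lineage as non-homologous is an assumption taken from the join rules later in this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """Hypothetical model of an attribute: a name plus an optional lineage."""
    name: str
    lineage: Optional[str]  # assumed origin label; None means "no lineage"


def is_homologous(a: Attribute, b: Attribute) -> bool:
    # Same lineage, regardless of name. Assumption: attributes without
    # lineage are never homologous, even with each other.
    return a.lineage is not None and a.lineage == b.lineage


def is_namesake(a: Attribute, b: Attribute) -> bool:
    # Same name, regardless of lineage.
    return a.name == b.name


def classify(a: Attribute, b: Attribute) -> str:
    """Return the term from the Terminology list that describes the pair."""
    if is_namesake(a, b):
        if is_homologous(a, b):
            return "homologous namesake"
        return "non-homologous namesake"
    return "homologous" if is_homologous(a, b) else "unrelated"


# Illustrative attributes (all names and lineages are made up).
subject_id = Attribute("subject_id", "lab.subject.subject_id")
inherited_id = Attribute("subject_id", "lab.subject.subject_id")
renamed_id = Attribute("animal_id", "lab.subject.subject_id")
session_name = Attribute("name", "lab.session.name")
subject_name = Attribute("name", "lab.subject.name")

print(classify(subject_id, inherited_id))
print(classify(subject_id, renamed_id))
print(classify(session_name, subject_name))
```

Under this model only the first pair would be used as a join attribute; the renamed pair is homologous but not a namesake, and the two `name` attributes are the case that triggers an error.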
@@ -65,21 +64,23 @@ Lineage propagates through:
 - **Foreign key references**: inherited attributes retain their origin lineage regardless of PK/secondary status
 - **Query expressions**: projections preserve lineage for renamed attributes; computed attributes have no lineage
 
-### Join Compatibility Rules
+### Join Rules
 
-For a join `A * B` to be valid, all namesake attributes must be homologous (same lineage).
+**The `*` operator** performs a semantic join:
+1. Joins on **homologous namesakes** (same name AND same lineage)
+2. Raises an error on **non-homologous namesakes** (same name, different lineage)
+3. Attributes with no namesake in the other operand pass through unchanged
 
-**Cases**:
-1. **Both have lineage** → lineages must match (same origin)
-2. **Both have no lineage** → collision (both are native secondary attrs)
-3. **One has lineage, one doesn't** → collision (cannot be the same entity)
+**The `.join()` method** provides additional control via kwargs:
+- Defaults to semantic matching (same as `*`)
+- `semantic_check=False` bypasses the non-homologous namesake error (equivalent to `@` operator)
 
-If namesake attributes are **not** homologous, an error should be raised.
+**Non-homologous namesake cases**:
+- Both have lineage but different origins → error
+- Both have no lineage (native secondary attrs) → error
+- One has lineage, other doesn't → error
 
-**Implications**:
-- FK-inherited attributes (PK or secondary) can match if they share lineage
-- Native secondary attributes with the same name always collide - one must be renamed via `.proj()`
-- This replaces the old heuristic with a principled rule: lineage must match
+**Resolution**: Use `.proj()` to rename one of the colliding attributes.
 
 **Note**: A warning may be raised for joins on unindexed attributes (performance consideration).

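The join rules in this hunk can be sketched as a toy model in plain Python. This is not the library's implementation: `Heading`, `join_attributes`, `rename`, and `NonHomologousNamesakeError` are hypothetical names, headings are reduced to a mapping from attribute name to a lineage label, and the behavior with `semantic_check=False` (joining on every namesake) is an assumption about what bypassing the check means.

```python
from typing import Dict, List, Optional

# Hypothetical heading model: attribute name -> lineage label (None = no lineage).
Heading = Dict[str, Optional[str]]


class NonHomologousNamesakeError(ValueError):
    """Hypothetical error for namesakes whose lineages do not match."""


def join_attributes(left: Heading, right: Heading,
                    semantic_check: bool = True) -> List[str]:
    """Return the attribute names a join of `left` and `right` would match on."""
    namesakes = [name for name in left if name in right]
    if semantic_check:
        for name in namesakes:
            a, b = left[name], right[name]
            # All three error cases reduce to one test: a missing lineage
            # on either side, or two lineages with different origins.
            if a is None or b is None or a != b:
                raise NonHomologousNamesakeError(
                    f"attribute {name!r} has lineage {a!r} on the left "
                    f"but {b!r} on the right"
                )
    # Assumption: with the check disabled, every namesake is a join attribute.
    return namesakes


def rename(heading: Heading, old: str, new: str) -> Heading:
    """Mimic a renaming projection: the name changes, the lineage is kept."""
    return {(new if name == old else name): lineage
            for name, lineage in heading.items()}


# Illustrative headings (made-up names and lineages).
session = {
    "subject_id": "lab.subject.subject_id",   # inherited through a foreign key
    "session_id": "lab.session.session_id",
    "name": None,                              # native secondary attribute
}
subject = {
    "subject_id": "lab.subject.subject_id",
    "name": None,                              # native secondary attribute
}

try:
    join_attributes(session, subject)
except NonHomologousNamesakeError as err:
    print("semantic join rejected:", err)

# Resolution: rename one of the colliding attributes before joining.
print(join_attributes(rename(session, "name", "session_name"), subject))
```

Renaming keeps the lineage attached to the attribute, so the rename removes the name collision without affecting how `subject_id` is matched.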
0 commit comments