Implement primary key rules for joins #1302
Merged
dimitri-yatsenko merged 2 commits into pre/v2.0 on Jan 7, 2026
Add functional dependency-based PK determination for joins:
- A → B: PK = PK(A), A's attributes first
- B → A (not A → B): PK = PK(B), B's attributes first
- Neither: PK = union of both PKs

Key changes:
- Add Heading.determines() method to check A → B relationship
- Update Heading.join() to apply PK rules based on functional dependencies
- Add left join constraint requiring A → B (with allow_nullable_pk bypass)
- Update Aggregation.create() to validate group → groupby requirement
- Remove U.join() and rewrite U.aggr() to work without join
- Add pk-rules-spec.md with semantic matching integration

Tests: 509 passed (Python 3.12), 506 passed (Python 3.10)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
A.extend(B) is equivalent to A.join(B, left=True) but expresses clearer intent: extending an entity set with additional attributes rather than combining two entity sets.
- Add extend() method to QueryExpression
- Add 'extend' to supported_class_attrs for class-level access
- Update pk-rules-spec.md to document extend as actual API
- Add integration tests for valid and invalid extend cases

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR implements intelligent primary key determination for join operations based on functional dependencies between operands, building on the semantic matching foundation.
The Core Concept: Functional Dependencies
When joining two expressions A and B, the result's primary key depends on whether one operand determines the other:
A → B (A determines B): Every attribute in B's primary key exists in A.
This relationship tells us that knowing A's primary key is sufficient to identify B's entities through A's structure.
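The definition above can be sketched in a few lines of plain Python. This is an illustration only, not the code of Heading.determines(): headings are modeled as lists of attribute names, and the function name, argument names, and the Session/Trial attribute lists are assumptions made for the example.

```python
def determines(a_attributes, b_primary_key):
    """Return True if A → B: every attribute of B's primary key exists in A.

    Both arguments are plain lists of attribute names (an assumption of this
    sketch; the real method operates on heading objects).
    """
    return set(b_primary_key) <= set(a_attributes)


# Hypothetical headings: Trial carries session_id, so Trial → Session holds.
session_attrs = ["session_id", "session_date"]
session_pk = ["session_id"]
trial_attrs = ["session_id", "trial_num", "outcome"]
trial_pk = ["session_id", "trial_num"]

trial_determines_session = determines(trial_attrs, session_pk)
session_determines_trial = determines(session_attrs, trial_pk)
```

Here Trial determines Session because session_id appears in Trial, while Session does not determine Trial because trial_num is absent from Session.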
Primary Key Rules
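The three rules from the commit message (A → B gives PK(A); B → A but not A → B gives PK(B); otherwise the union of both PKs) can be sketched as a small standalone function. It is a simplified model, not the body of Heading.join(): headings are plain lists, the function names are hypothetical, and the ordering of the union (A's key attributes first) is an assumption, since the source only says "union of both PKs".

```python
def determines(a_attrs, b_pk):
    """A → B: every attribute of B's primary key exists in A."""
    return all(name in a_attrs for name in b_pk)


def join_primary_key(a_attrs, a_pk, b_attrs, b_pk):
    """Pick the primary key of A * B from the functional dependencies."""
    if determines(a_attrs, b_pk):
        return list(a_pk)  # A → B: PK = PK(A)
    if determines(b_attrs, a_pk):
        return list(b_pk)  # B → A (not A → B): PK = PK(B)
    # Neither: union of both PKs (ordering is an assumption of this sketch).
    return list(a_pk) + [name for name in b_pk if name not in a_pk]


# Hypothetical headings used only for illustration.
session = (["session_id", "session_date"], ["session_id"])
trial = (["session_id", "trial_num", "outcome"], ["session_id", "trial_num"])
subject = (["subject_id", "species"], ["subject_id"])

pk_trial_session = join_primary_key(*trial, *session)      # A → B
pk_session_trial = join_primary_key(*session, *trial)      # B → A only
pk_subject_session = join_primary_key(*subject, *session)  # neither
```

The sketch models only the key; the "attributes first" ordering of the full heading is left out to keep it short.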
Example: Session/Trial Pattern
Session * Trial has PK = {session_id, trial_num} with Trial's attributes first
Integration with Semantic Matching
PK determination is applied after semantic compatibility is verified:
- assert_join_compatibility() ensures all namesakes are homologous
See the full specification: docs/src/design/pk-rules-spec.md
Left Join Constraint
Left joins now require A → B to ensure the result's PK can't have NULL values:
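The reasoning is that when A → B does not hold, the result's PK contains attributes of B's PK that are missing from A, and those are NULL for rows of A with no match in B. A minimal sketch of such a guard follows; the function name and the exception class are hypothetical stand-ins, and only the allow_nullable_pk flag name comes from this PR.

```python
class LeftJoinError(Exception):
    """Hypothetical stand-in for the error the library raises."""


def check_left_join(a_attrs, b_pk, allow_nullable_pk=False):
    """Raise unless A → B, or unless the caller explicitly bypasses the check."""
    if allow_nullable_pk:
        return  # bypass: the caller accepts NULLs in the result's PK
    missing = [name for name in b_pk if name not in a_attrs]
    if missing:
        raise LeftJoinError(
            f"left join requires A → B; attributes missing from A: {missing}"
        )


# Trial → Session holds, so this left join passes the check.
check_left_join(["session_id", "trial_num"], ["session_id"])

# Session → Trial does not hold; the check fails unless bypassed.
try:
    check_left_join(["session_id"], ["session_id", "trial_num"])
    rejected = False
except LeftJoinError:
    rejected = True
```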
The extend() Operation
When A → B, a left join is conceptually not a join at all; it is closer to projection:
- … A.proj(..., new_attr=...))
DataJoint provides an explicit extend() method for this pattern:
Example:
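A minimal sketch of how such an alias could behave, using a toy stand-in class rather than DataJoint's actual QueryExpression. The class name, its constructor, and the use of ValueError in place of DataJointError are assumptions of the example; only the relationship extend(B) = join(B, left=True) and the A → B requirement come from this PR.

```python
class ToyExpression:
    """Toy model of a query expression: attribute names plus a primary key."""

    def __init__(self, attributes, primary_key):
        self.attributes = list(attributes)
        self.primary_key = list(primary_key)

    def determines(self, other):
        """self → other: other's primary key is contained in self's attributes."""
        return all(name in self.attributes for name in other.primary_key)

    def join(self, other, left=False):
        if left and not self.determines(other):
            # The PR raises DataJointError here; ValueError is a stand-in.
            raise ValueError("left join requires A → B")
        if self.determines(other):
            first, second, pk = self, other, self.primary_key
        elif other.determines(self):
            first, second, pk = other, self, other.primary_key
        else:
            first, second = self, other
            pk = self.primary_key + [
                n for n in other.primary_key if n not in self.primary_key
            ]
        attrs = first.attributes + [
            n for n in second.attributes if n not in first.attributes
        ]
        return ToyExpression(attrs, pk)

    def extend(self, other):
        """Alias for join(other, left=True): add other's attributes to self."""
        return self.join(other, left=True)


session = ToyExpression(["session_id", "session_date"], ["session_id"])
trial = ToyExpression(["session_id", "trial_num", "outcome"],
                      ["session_id", "trial_num"])

extended = trial.extend(session)  # valid: Trial → Session
```

Calling session.extend(trial) in this toy model raises, because Session does not determine Trial.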
The extend() method:
- … DataJointError otherwise)
- … allow_nullable_pk (that's an internal mechanism)
Changes
- Heading.determines() - Check if A → B
- Heading.join() - Apply PK rules based on functional dependencies
- QueryExpression.join() - Add left join constraint with allow_nullable_pk bypass
- QueryExpression.extend() - Semantic alias for join(left=True)
- Aggregation.create() - Validate group → groupby requirement
- U.aggr() - Rewritten to work without join (U.join removed)
Test plan
- Heading.determines() (7 tests)
- extend() (valid and invalid cases)
🤖 Generated with Claude Code