-
Notifications
You must be signed in to change notification settings - Fork 17
Add test and analysis for issue #396: Duplicate Python Values #399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Created test_issue_396_duplicate_pyobjects.py with test cases demonstrating expected behavior for PyObject deduplication - Added comprehensive analysis document (ISSUE_396_ANALYSIS.md) with: - Problem root cause analysis - 6 different solution approaches evaluated - Recommendation: Object Identity Cache (minimal changes, good performance) - Implementation plan and next steps Tests currently pass as cloudpickle is deterministic in simple cases. The issue may manifest in more complex scenarios with classes in __main__.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds comprehensive test coverage and analysis documentation for issue #396, which concerns duplicate Python object nodes in the e-graph when cloudpickle produces non-deterministic byte sequences for the same object.
Changes:
- Added test file with three test cases demonstrating the duplicate PyObject issue
- Created detailed analysis document evaluating 6 potential solutions with recommendation
- Tests are designed to detect cloudpickle determinism issues with classes, nested classes, and closures
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| python/tests/test_issue_396_duplicate_pyobjects.py | Adds three test cases that verify PyObject deduplication behavior for same-instance objects, nested classes, and closures |
| ISSUE_396_ANALYSIS.md | Comprehensive analysis document detailing the root cause, 6 potential solutions, recommendation for Object Identity Cache approach, and implementation plan |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| # Define a simple class in the test module (acts like __main__) | ||
| class MyClass: | ||
| def __init__(self, value) -> None: |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __init__ method should have a return type annotation of -> None for consistency with other test files in this codebase. This is a standard Python typing convention for constructors.
|
|
||
| class OuterClass: | ||
| class InnerClass: | ||
| def __init__(self, x) -> None: |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The __init__ method should have a return type annotation of -> None for consistency with other test files in this codebase. This is a standard Python typing convention for constructors.
| pickled2 = cloudpickle.dumps(obj) | ||
|
|
||
| # Report on pickle determinism | ||
| print("\nCloudpickle determinism check:") |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string: This string literal does not contain any interpolation expressions and does not need to be an f-string.
| print(" ⚠ Cloudpickle produced DIFFERENT bytes for the same object!") | ||
| else: | ||
| print(" ✓ Cloudpickle is deterministic in this case") |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| print(" ✗ E-graph treats objects as DIFFERENT - duplicates detected!") | ||
| print(f" Error: {e}") |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| print("✗ Nested objects are NOT equal - duplicates detected!") | ||
| pytest.fail(f"Duplicate PyObject nodes for nested class: {e}") |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| except Exception as e: | ||
| print("✗ Closures are NOT equal - duplicates detected!") |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary f-string: These string literals do not contain any interpolation expressions and do not need to be f-strings.
| import cloudpickle | ||
| import pytest | ||
|
|
||
| from egglog import * |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import pollutes the enclosing namespace, as the imported module egglog does not define 'all'.
| from egglog import * | |
| from egglog import EGraph, PyObject, eq |
expected behavior for PyObject deduplication
Tests currently pass as cloudpickle is deterministic in simple cases.
The issue may manifest in more complex scenarios with classes in main.