Skip to content

Support for enum mapping for slots with multiple enum ranges#31

Closed
martinwellman wants to merge 9 commits intomainfrom
multi-enum
Closed

Support for enum mapping for slots with multiple enum ranges#31
martinwellman wants to merge 9 commits intomainfrom
multi-enum

Conversation

@martinwellman
Copy link
Copy Markdown
Collaborator

This PR adds support for mapping slots that have a range consisting of multiple enumerations.

Some important things to review:

  1. In object_transformer.py line 186, source_type (of type ElementName) is converted from a string with yaml.safe_load. The expected loaded value is a string (eg. 'myenum') or list of strings (eg. ['myenum1', 'myenum2', ...]). Is this the preferred way to do this?
  2. In object_transformer.py in function transform_enum: while iterating over the enum derivations, the first time we encounter a derivation with mirror_source=True we return str(source_value) (if no mapping has been successful yet). Should we instead iterate over all enum derivations, try to map the source_value (ignoring mirror_source), and only after failing all enum derivations should we return str(source_value) if any of the enum derivations have mirror_source=True?

Closes linkml/linkml#2128

@martinwellman
Copy link
Copy Markdown
Collaborator Author

Should add unit tests.

@martinwellman martinwellman marked this pull request as ready for review July 9, 2024 17:38
@amc-corey-cox
Copy link
Copy Markdown
Contributor

@martinwellman Thanks for working on this feature - supporting multiple enum ranges is a real need (linkml/linkml#2128).

I'm sorry to have taken so long to get back to you on this. After reviewing, I have some concerns about the approach:

Syntax Issue

The PR uses range: [enum1, enum2] which isn't standard LinkML syntax. LinkML coerces this to a string "['enum1',
'enum2']", which is why yaml.safe_load is needed to parse it back.

The proper LinkML way to specify multiple enum ranges is any_of:

slots:
my_slot:
range: Any
any_of:
- range: Enum1
- range: Enum2

With this syntax, you'd access the enums via slot.any_of[*].range directly - no string parsing needed.

Answers to your questions:

  1. Is yaml.safe_load the right way? - It works around the non-standard syntax, but with proper any_of support,
    you'd iterate slot.any_of directly.
  2. Should mirror_source=True return immediately? - Best practice would be to try all enum derivations first,
    then fall back to mirroring only if none match. The current implementation could skip valid mappings in later
    enums.

Recommendation

I'd like to see a reworking of this to support the standard any_of syntax instead. The feature is definitely needed, but
building on non-standard syntax creates a shaky foundation.

Would you be interested in updating the PR to use any_of, or would you prefer we take it from here?

@martinwellman
Copy link
Copy Markdown
Collaborator Author

@amc-corey-cox Thanks for the response. I realized eventually that I should be using "any_of" instead of having "range" equal to an array. I have a newer working version locally that uses "any_of" that I've been using for our project. The problem is that there are other parts of LinkML-Map that would need to be updated as well to handle multiple enumerations for ranges. For example, the code for determining the reverse transformation as well as for making a graph of the transformations (there might have been others as well, I can't remember). Would you want these other parts to be updated to support "any_of" with this PR as well?

There was also another problem I had to deal with related to "any_of", but I'll have to look back at my notes to recall what it was, since it's been a while.

@amc-corey-cox
Copy link
Copy Markdown
Contributor

@martinwellman Thanks for your response. We'd really appreciate if you share the work that you have. Ideally, we'd update the other parts as well but if this implementation doesn't break things maybe we can get by without that; or maybe we can add that if you don't have time. Since you have a working use-case and understand this better, I'd prefer to start with what you've got then to come up with it on my own. If you don't have time or interest for that, I can take it from here. Just let me know.

@amc-corey-cox
Copy link
Copy Markdown
Contributor

Hey @martinwellman — thanks for the work on this and for the discussion. We're going to reimplement this feature cleanly using standard any_of syntax (as discussed in December) and tracked in #146.

A few notes on what's changed since this PR was opened:

  • eval_utils.py changes are superseded — the hand-rolled expression evaluator was replaced with simpleeval in 179015c. The Mapping support, recurse=False fix, and ast.Slice handling from this PR are all covered natively by the new evaluator, so the new implementation won't need any eval changes.
  • any_of syntax — we'll use slot.any_of[*].range to extract enum names directly, avoiding the yaml.safe_load round-trip on the stringified list.
  • transform_enum loop logic — your approach of iterating enum derivations in order is sound and we'll reuse that pattern.

We'll tag you for review on the new PR once it's up. Closing this one when that lands.

@amc-corey-cox
Copy link
Copy Markdown
Contributor

Closing in favor of #147, which reimplements this using standard any_of syntax. Thanks again @martinwellman for the groundwork — your review on the new PR would be much appreciated!

@amc-corey-cox amc-corey-cox deleted the multi-enum branch March 11, 2026 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mapping slots with multiple enumerations as a range

2 participants