Add a 'map<K,V>' type #554
forcibly prevent duplicate keys from appearing in the list. In the case of
duplicate keys, the expectation for bindings generators is that for any given
key, the *last* (key, value) pair in the list defines the value of the key in
the map. To simplify bindings generation, `<keytype>`s is a conservative subset
the expectation for bindings generators
Can we find a way to make this language stronger? Imagine some "authorization middleware" component sits on an interface and inspects a call with a map param. If the value {"action": "view", "action": "delete"} comes through, it is going to be very important that the middleware treats the map exactly the same as the next component.
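To make the concern concrete, here is a minimal sketch in Python, assuming the map arrives as a list of (key, value) pairs and that two components resolve duplicates differently. The function names `first_wins` and `last_wins` are hypothetical and are not part of any bindings generator.

```python
# Hypothetical illustration: a map<string, string> value carried as a list of pairs.
pairs = [("action", "view"), ("action", "delete")]

def first_wins(entries):
    """Keep the first value seen for each key (NOT what the proposal text describes)."""
    result = {}
    for key, value in entries:
        result.setdefault(key, value)  # ignores later duplicates
    return result

def last_wins(entries):
    """Keep the last value seen for each key, as the proposal text describes."""
    return dict(entries)  # later pairs overwrite earlier ones

middleware_view = first_wins(pairs)  # what a first-wins middleware would authorize
handler_view = last_wins(pairs)      # what a last-wins callee would execute
```

If the middleware used the first-wins rule, it would check "view" while the next component acted on "delete", which is the mismatch described above.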
Yeah, I suppose "expectation" is too weak; would it make sense to say "Although the Component Model cannot enforce this property, bindings generators MUST ..."?
I see two things about the current map specification that can lead to problems.
First, it allows duplicate keys and expects bindings to just ignore all but the last occurrence. I think uniqueness of keys should be mandated by the spec and enforced at the boundary.
The second is that the order of entries is propagated, but some languages’ (standard) Map type(s) will not preserve it. In my opinion, it should be expected that bindings will map `map` to a type that retains order. It would then also be a good idea to rename the type to something else (e.g. dict, ordered-map) so there is less chance of confusion with a conventional (non-order-preserving) Map type.
Yeah, it's a tradeoff to be sure. However, if map forces the runtime to build a temporary hash set (to enforce uniqueness) that is pure overhead (and potentially a pretty non-trivial runtime-internal memory allocation, which we otherwise avoid in the CABI), interface designers will have to ask whether they can "afford" to use map or whether they should use list<tuple<K,V>> instead for performance reasons, which seems net worse.
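As a rough sketch of where that overhead comes from, assuming the boundary sees the map as a list of pairs: rejecting duplicates in a single pass needs an auxiliary set that grows with the number of entries. The helper name `check_unique_keys` is hypothetical, and this is plain Python, not actual runtime or Canonical ABI code.

```python
def check_unique_keys(entries):
    """Return the entries unchanged, or raise ValueError if any key repeats."""
    seen = set()  # the temporary allocation: one slot per distinct key
    for key, _value in entries:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen.add(key)
    return entries
```

A plain `list<tuple<K,V>>` needs no such pass, which is why the check is a pure cost for interfaces that never send duplicates.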
- I have no issue with mandating uniqueness in the spec. It's always easier to relax constraints than to tighten them, anyway, so might as well be more constrained up front. However, when that constraint is violated by a misbehaving lifting component, I think the behavior should be the same for the lowering component; just ignore all but the final value. I don't think we should do the C thing and call it undefined behavior (unless that's normal for the component model?) and I don't think the lowering should trap. So this is just a question of semantics rather than behavior.
- I don't think the basic map type should be ordered. Most languages do not use an ordered basic map type because 9 times out of 10 you don't need an ordering for your maps. In the cases where you do need an ordered map, you could always fall back on `list<tuple<key, value>>`. I don't think it's super important to create a specialization for an ordered map as well, but that could be revisited later.
Does that sound right?
Pretty much, yeah. 👍
@lann Since the deterministic profile can't randomly permute, if we don't normalize order in the deterministic profile, then that effectively makes order an observable part of the semantics of map values. But yeah, I suppose in some cases the performance might be a problem, even for the deterministic profile, so it's a tradeoff worth discussing.
Ah I see, you're talking about baking unorderedness (and presumably dedupedness?) into the map lowering semantics while I was only thinking of making the bindings generation guidelines language stronger.
I see the benefit of formalizing it but yeah, requiring sorting on every map lowering seems quite a different order of tradeoff than NaN canonicalization. 🙂
Do you have any references on high-level motivations / use-cases for deterministic profiles? It seems difficult to evaluate this kind of tradeoff without that context.
The use case for the deterministic profile (introduced in the 3.0 draft here as part of the relaxed SIMD instructions) is just to define, if you want to run wasm deterministically, here's how to do it. If we don't specify sort+dedupe, that means that, even when running deterministically, a component can produce different outputs for the input {a:1, b:2} vs {b:2, a:1} which means that these two values must be considered unequal if you're, e.g., caching outputs keyed by inputs. That's a corollary, but I don't know how much of a problem it is.
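A minimal sketch of what sort+dedupe would buy for caching, assuming string keys, last-wins deduplication, and ordinary ascending key order. The thread does not specify a sort order, and `normalize_map` is a hypothetical helper rather than part of the proposal.

```python
def normalize_map(entries):
    """Dedupe with last-wins, then sort by key, giving one canonical form per map."""
    deduped = {}
    for key, value in entries:
        deduped[key] = value  # a later pair overwrites an earlier one
    # Keys are unique after deduping, so sorting the items orders them by key.
    return tuple(sorted(deduped.items()))

key_ab = normalize_map([("a", 1), ("b", 2)])
key_ba = normalize_map([("b", 2), ("a", 1)])
```

With normalization, both input orders yield the same cache key. Without it, the two lists stay distinguishable and have to be cached separately.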
Finally getting back to this: this commit tries to clarify the wording about what bindings generators are allowed/required to do, speaking to Lann's original concern.
I also played with the idea above of nondeterministically reordering/deduping the list-of-key-value-pairs in the lifting/lowering rules. However, it seemed like more trouble than it was worth and forcing the deterministic profile to always sort/dedupe seemed like overkill, so for now I just left it as is.
Resolves #125
A bunch of great implementation and bindings generator work has happened in the meantime (nice work @yordis!), and since this feature is properly gated, it seems reasonable to merge this PR soon (if there's no further feedback on the open thread above). As always, we can iterate in new issues/PRs as
This PR adds a `map<K,V>` to the set of Component Model and WIT value types as discussed in #125, specifically based on this design by @yordis. With the approach of treating `map<K,V>` as just a specialization of `list<tuple<K,V>>`, the addition is pretty small and most of the work will be in the bindings generators. I'm not in a particular rush to add this, but given the interest in #125, it seems useful to have a detailed proposal that can be prototyped and then potentially merged.