Suggested kinds of labels (to clarify the discussion)
Shawn Steele
Shawn.Steele at microsoft.com
Wed Mar 25 01:39:24 CET 2009
I hope I’m not jumping the gun, but I thought I’d suggest this picture to clarify the definitions of the ideas people are talking about.
Having thought over the discussion and the definitions, I think this might work as a starting point. (I added A-Label to a picture from Mark Davis):
Unicode String (Any old Unicode string, maybe valid or invalid; won't know until tested)
- subset: M-Label (maps to a valid U-Label, perhaps with the identity mapping, many-to-one)
- subset: U-Label (canonical)
- implies: A-Label (derived from U-Label, one-to-one)
I think the group pretty much agrees about U-Labels and A-Labels in this context, although there may be some disagreement about which is canonical.
M-Label is the idea that there are mapped strings (side-of-the-bus strings) that may not be valid U-Labels themselves, but which will map to a valid U-Label. In other words, this is mapping. I don’t think it depends on whether mapping ends up in a BCP, an appendix, or a different RFC; it’s just the idea of a label that can be mapped. M-Labels map to U-Labels in a many-to-one relationship.
I stated that A-Labels are derived from the canonical U-Labels, but, of course, they must also round trip.
A Unicode String is any arbitrary string; some will be valid IDN labels and others won’t.
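
To make the relationships concrete, here is a rough Python sketch of the four kinds of labels. It is only an illustration under loud assumptions: map_label (NFKC plus lowercasing) and the validity test in is_u_label are hypothetical stand-ins I made up for this example, not the mapping or validity rules the group would actually adopt. Only the A-Label step uses a real mechanism (Python's standard "punycode" codec plus the "xn--" prefix).

```python
import unicodedata

ACE_PREFIX = "xn--"

def map_label(s: str) -> str:
    # Hypothetical mapping (an assumption): NFKC normalization, then lowercase.
    return unicodedata.normalize("NFKC", s).lower()

def is_u_label(s: str) -> bool:
    # Toy validity test (an assumption): non-empty, only letters, digits or
    # hyphens, and already a fixed point of the mapping (canonical form).
    return (
        bool(s)
        and all(c.isalnum() or c == "-" for c in s)
        and map_label(s) == s
    )

def is_m_label(s: str) -> bool:
    # An M-Label is any string whose mapped form is a valid U-Label.
    # Every U-Label is also an M-Label via the identity mapping.
    return is_u_label(map_label(s))

def to_a_label(u: str) -> str:
    # Derive the A-Label from a U-Label (one-to-one).
    if not is_u_label(u):
        raise ValueError("not a U-Label under the toy rules")
    return ACE_PREFIX + u.encode("punycode").decode("ascii")

def to_u_label(a: str) -> str:
    # Reverse direction, so that U-Label -> A-Label -> U-Label round trips.
    if not a.startswith(ACE_PREFIX):
        raise ValueError("missing ACE prefix")
    return a[len(ACE_PREFIX):].encode("ascii").decode("punycode")

if __name__ == "__main__":
    for s in ["bücher", "Bücher", "ＢÜＣＨＥＲ", "b ücher"]:
        print(repr(s), "U-Label:", is_u_label(s), "M-Label:", is_m_label(s))
    a = to_a_label("bücher")
    print(a, to_u_label(a) == "bücher")
```

Under these toy rules, "Bücher" and the fullwidth "ＢÜＣＨＥＲ" are both M-Labels that map to the single U-Label "bücher" (many-to-one), "b ücher" is just a Unicode String, and the A-Label converts back to the same U-Label.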