Suggested kinds of labels (to clarify the discussion)

Shawn Steele Shawn.Steele at microsoft.com
Wed Mar 25 01:39:24 CET 2009


I hope I’m not jumping the gun, but I thought I’d suggest this picture to clarify the definitions of the ideas people are talking about.



Thinking over the discussion and the definitions, I think this might work as a starting point (I added A-Label to a picture from Mark Davis):



Unicode String (Any old Unicode string, maybe valid or invalid; won't know until tested)

  - subset: M-Label (maps to a valid U-Label, perhaps with the identity mapping; many-to-one)

    - subset: U-Label (canonical)

      - implies: A-Label (derived from U-Label, one-to-one)



I think the group pretty much agrees about U-Labels and A-Labels in this context, although there may be some disagreement about which is canonical.



M-Label is the idea that there are mapped strings (side-of-the-bus strings) that may not be valid U-Labels themselves, but which will map to a valid U-Label.  I.e., this is mapping.  I don’t think it depends on whether mapping is specified in a BCP, an appendix, or a different RFC; it’s just the idea of a label that can be mapped.  M-Labels map to U-Labels in a many-to-one relationship.
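
To make the many-to-one relationship concrete, here is a minimal Python sketch. It assumes, purely for illustration, that "mapping" means NFKC normalization followed by lowercasing; the actual mapping is still under discussion. The names toy_map, is_u_label and is_m_label are hypothetical, and the real validity rules for U-Labels are not modeled at all.

```python
import unicodedata


def toy_map(label: str) -> str:
    """Hypothetical mapping step (an assumption, not the real IDNA mapping):
    NFKC normalization followed by lowercasing."""
    return unicodedata.normalize("NFKC", label).lower()


def is_u_label(label: str) -> bool:
    """Toy stand-in for U-Label validity: non-empty and unchanged by the
    mapping. Real U-Label rules (code point checks, etc.) are omitted."""
    return bool(label) and toy_map(label) == label


def is_m_label(label: str) -> bool:
    """An M-Label is any string whose mapped form is a (toy) U-Label.
    Every U-Label is also an M-Label via the identity mapping."""
    return bool(label) and is_u_label(toy_map(label))


# Several distinct M-Labels collapse onto one U-Label (many-to-one).
variants = ["B\u00dcCHER", "B\u00fccher", "bu\u0308cher", "b\u00fccher"]
mapped = {toy_map(v) for v in variants}
```

With these assumptions, all four variants are M-Labels, only the last one is already a U-Label, and the set mapped contains a single element.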



I stated that A-Labels are derived from the canonical U-Labels but, of course, they must also round-trip back to the same U-Label.
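
The round-trip requirement can be sketched as follows. This is only an illustration: it assumes the input has already been checked as a valid U-Label (no validation happens here), it uses Python's built-in punycode codec with the "xn--" prefix, and the helper names to_a_label and to_u_label are hypothetical. Pure ASCII labels are passed through unchanged in this sketch.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Derive the A-Label from a U-Label (assumed valid; not checked here)."""
    if u_label.isascii():
        return u_label  # nothing to encode in this sketch
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Recover the U-Label from an A-Label produced by to_a_label."""
    if a_label.startswith(ACE_PREFIX):
        return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return a_label


u = "b\u00fccher"
a = to_a_label(u)                # "xn--bcher-kva"
round_trip_ok = to_u_label(a) == u
```

The one-to-one relationship is what makes the round trip work: each U-Label yields exactly one A-Label, and decoding that A-Label gives back the same U-Label.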



A Unicode String is any arbitrary string; some will be valid IDN labels and others won’t.



- Shawn






More information about the Idna-update mailing list