Suggested kinds of labels (to clarify the discussion)

Shawn Steele Shawn.Steele at
Wed Mar 25 01:39:24 CET 2009

I hope I’m not jumping the gun, but I thought I’d suggest this picture to clarify the definitions for ideas people are talking about.

I was thinking about the discussion and the definitions, and I think this might work as a starting point. (I added A-Label to a picture from Mark Davis):

Unicode String (Any old Unicode string, maybe valid or invalid; won't know until tested)

  - subset: M-Label (maps to a valid U-Label, perhaps with the identity mapping, many to one)

    - subset: U-Label (canonical)

      - implies: A-Label (derived from U-Label, one-to-one)

I think the group pretty much agrees about U-Labels and A-Labels in this context, although there may be some disagreement about which is canonical.

M-Label is the idea that there are mappable strings ("side of the bus" strings) that may not be valid U-Labels themselves, but which will map to a valid U-Label.  In other words, this is where mapping happens.  I don’t think it depends on whether mapping ends up in a BCP, an appendix, or a different RFC; it’s just the idea of a label that can be mapped.  M-Labels map to U-Labels in a many-to-one relationship.
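
To make the many-to-one idea concrete, here is a rough Python sketch. The mapping (`toy_map`) and the validity test (`is_toy_u_label`) are hypothetical stand-ins invented for illustration: case folding plus NFKC, and a simple letters/digits/hyphen check. They are not the real IDNA rules, which this sketch makes no attempt to reproduce.

```python
import unicodedata


def toy_map(label: str) -> str:
    # Hypothetical mapping step (an assumption, not the IDNA mapping):
    # case-fold, then apply NFKC normalization.
    return unicodedata.normalize("NFKC", label.casefold())


def is_toy_u_label(label: str) -> bool:
    # Hypothetical validity test: non-empty, unchanged by the mapping,
    # only letters/digits/hyphens, and no leading or trailing hyphen.
    if not label or label != toy_map(label):
        return False
    if label.startswith("-") or label.endswith("-"):
        return False
    return all(ch.isalnum() or ch == "-" for ch in label)


def classify(label: str) -> str:
    # Returns the most specific kind. In the picture above every U-Label
    # is also an M-Label (via the identity mapping).
    if is_toy_u_label(label):
        return "U-Label"
    if is_toy_u_label(toy_map(label)):
        return "M-Label"
    return "Unicode String"


# Several M-Labels collapse onto one U-Label (many-to-one).
# "\u00fc" is the precomposed letter u-with-diaeresis.
variants = ["B\u00fccher", "B\u00dcCHER", "b\u00fccher"]
mapped = {toy_map(v) for v in variants}
kinds = [classify(v) for v in variants]
```

With these toy rules, all three variants map to the single string "bücher"; the first two classify as M-Labels and the last as a U-Label, while something like "bü cher" (with a space) stays a plain Unicode String.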

I stated that A-Labels are derived from the canonical U-Labels, but, of course, they must also round trip.
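
The round-trip requirement can be sketched with Python's built-in `punycode` codec. The helper names `to_a_label` and `to_u_label` are made up for this example, and the sketch assumes its input is already a valid U-Label; it does no validation or mapping of its own.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    # Assumes u_label is already a valid U-Label (no checks are done here).
    # Pure-ASCII labels are passed through unchanged in this sketch.
    if u_label.isascii():
        return u_label
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    # Inverse direction: strip the ACE prefix and decode the Punycode part.
    if a_label.startswith(ACE_PREFIX):
        return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return a_label


u = "b\u00fccher"          # "bücher" with a precomposed u-with-diaeresis
a = to_a_label(u)          # "xn--bcher-kva"
round_trips = to_u_label(a) == u
```

The one-to-one relationship shows up as `to_u_label(to_a_label(u)) == u`. An M-Label such as "Bücher" would have to be mapped to its U-Label first, which is why the A-Label is described as derived from the U-Label rather than from the M-Label.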

A Unicode String is any arbitrary string; some of these will be valid IDN labels and others won’t.

- Shawn

More information about the Idna-update mailing list