Hyphen Restrictions
Andrew Sullivan
ajs at shinkuro.com
Wed Jan 5 07:28:34 CET 2011
On Wed, Jan 05, 2011 at 03:18:20PM +0900, Yoshiro YONEYA wrote:
> Or does it mean third and fourth character from the beginning of the string?
> For example:
>   beginning of the string
>   |
>   v 1   2   3   4   5    <-- position of character
>   +---+---+---+---+---+
>   |<A>|<B>| - | - |<C>|  here <A>, <B> and <C> stands for non-ASCII (multi-
>   +---+---+---+---+---+  octets) character
>             ^   ^
>             |   |
>    two consecutive hyphens
I believe the intention is this one: the third and fourth characters
counted from the beginning of the string. The target is "the Unicode
string". If I recall correctly, at one point in the development of
IDNA2008 this was called a "putative U-label". The idea was that you
had an inbound Unicode string that was supposed to be a U-label, but
you didn't yet know whether it actually was one.
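Read that way, the check works on the characters (code points) of the
Unicode string rather than on octets or on the A-label form. A minimal
sketch in Python, assuming that reading; the function name is
hypothetical and only the third-and-fourth-position rule discussed here
is covered:

```python
def has_hyphens_in_positions_3_and_4(putative_u_label: str) -> bool:
    """Hypothetical helper: True if characters 3 and 4 are both hyphens.

    Python str indexing counts Unicode code points, so a multi-octet
    character such as U+00E4 occupies exactly one position.
    """
    # Slice [2:4] covers the third and fourth characters (1-based).
    return putative_u_label[2:4] == "--"


# <A><B>--<C> from the diagram, with arbitrary non-ASCII stand-ins.
print(has_hyphens_in_positions_3_and_4("\u00e4\u00f6--\u00fc"))  # True
# In UTF-8 the hyphens sit at octets 5 and 6, but at characters 3 and 4.
print(len("\u00e4\u00f6--\u00fc".encode("utf-8")))  # 8
```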
That this is the correct interpretation is suggested by section 4.4,
which talks about converting the whole thing to an A-label by doing
the Punycode conversion. That suggests that previous "labels" in 4.x
were only ever putative U-labels or else they were A-labels.
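As an illustration of that conversion step, here is a sketch using
Python's built-in "punycode" codec, which implements the RFC 3492
encoding. The function name is hypothetical, and the sketch assumes the
string has already passed every other check; it performs none of them
itself.

```python
ACE_PREFIX = "xn--"  # prefix that marks an A-label


def to_a_label(putative_u_label: str) -> str:
    """Hypothetical helper: Punycode-encode the whole Unicode string
    and prepend the ACE prefix. No validity checks are done here."""
    encoded = putative_u_label.encode("punycode").decode("ascii")
    return ACE_PREFIX + encoded


print(to_a_label("b\u00fccher"))  # xn--bcher-kva
```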
The above is merely my interpretation; I hold no special authority.
A
--
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.