Hyphen Restrictions
Yoshiro YONEYA
yoshiro.yoneya at jprs.co.jp
Wed Jan 5 07:18:20 CET 2011
Hi, all,
I need clarification of RFC 5891, section 4.2.3.1, which says:
4.2.3.1. Hyphen Restrictions
The Unicode string MUST NOT contain "--" (two consecutive hyphens) in
the third and fourth character positions and MUST NOT start or end
with a "-" (hyphen).
My question is what "the third and fourth character positions" means.
Does it mean third and fourth octet from the beginning of the string?
For example:
beginning of the string
|
v 1   2   3   4   5    <-- position of octet
+---+---+---+---+---+
| a | b | - | - | c |
+---+---+---+---+---+
          ^   ^
          |   |
    two consecutive hyphens
Or does it mean third and fourth character from the beginning of the string?
For example:
beginning of the string
|
v 1   2   3   4   5    <-- position of character
+---+---+---+---+---+
|<A>|<B>| - | - |<C>|  here <A>, <B> and <C> stand for non-ASCII
+---+---+---+---+---+  (multi-octet) characters
          ^   ^
          |   |
    two consecutive hyphens
My understanding is that the purpose of this restriction is to preserve
room for a future ACE prefix, so I expect the answer to my question to be
the former one. Is that right?
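To make the two readings concrete, here is a minimal Python sketch. It is
only an illustration of the question, not a statement of what RFC 5891
intends. The function names are hypothetical, and the octet reading assumes
UTF-8 as the encoding, which the RFC text quoted above does not specify.

```python
def hyphens_at_octets_3_4(label: str) -> bool:
    """Reading 1: positions count octets (assumption: UTF-8 encoding)."""
    return label.encode("utf-8")[2:4] == b"--"


def hyphens_at_chars_3_4(label: str) -> bool:
    """Reading 2: positions count Unicode code points."""
    return label[2:4] == "--"


# Sample labels chosen for illustration only.
samples = [
    "ab--c",                  # all ASCII: octets and characters coincide
    "\u3042\u3044--\u3046",   # two 3-octet characters, then "--"
    "\u00e9--c",              # one 2-octet character, then "--"
]

for label in samples:
    print(
        ascii(label),
        "octets:", hyphens_at_octets_3_4(label),
        "chars:", hyphens_at_chars_3_4(label),
    )
```

For an all-ASCII label the two readings agree. They differ as soon as a
non-ASCII character precedes the hyphens: the second sample matches only
under the character reading, and the third only under the octet reading.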
Regards,
--
Yoshiro YONEYA <yoshiro.yoneya at jprs.co.jp>