Hyphen Restrictions

Nicolas Williams Nicolas.Williams at oracle.com
Wed Jan 5 10:25:39 CET 2011


On Wed, Jan 05, 2011 at 08:03:43AM +0000, Adam M. Costello wrote:
> Yoshiro YONEYA <yoshiro.yoneya at jprs.co.jp> wrote:
> 
> > Dear Andrew and John,
> > 
> > Thank you for your quick response.  I'm clear now.
> 
> You are?  But didn't Andrew and John disagree?  Andrew said it means 3rd
> & 4th characters, while John said it means 3rd & 4th octets.

The RFC says "characters", and I think it probably says that for a
reason.  It'd be nice if the RFC stated that reason.
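
To make the characters-versus-octets distinction concrete, here is a
small Python sketch.  It is only an illustration, not anything taken from
the RFC; the two helper names are made up for this example, and the
octet check assumes the label is encoded as UTF-8.

```python
def hyphens_in_chars_3_4(label: str) -> bool:
    # Index by Unicode code point, i.e. by "character".
    return label[2:4] == "--"


def hyphens_in_octets_3_4(label: str) -> bool:
    # Index by octet, assuming a UTF-8 encoding of the label.
    return label.encode("utf-8")[2:4] == b"--"


label = "x\u00f1--bar"  # x, n-with-tilde, hyphen, hyphen, b, a, r
print(hyphens_in_chars_3_4(label))   # True: characters 3 and 4 are hyphens
print(hyphens_in_octets_3_4(label))  # False: the 2-octet n-tilde shifts them
```

For pure-ASCII labels the two readings agree; they diverge only when a
non-ASCII character precedes the hyphens.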

My guess: the way Punycode works, there's no way for a Punycoded
string to start with an ACE prefix if the input doesn't have hyphens in
the 3rd and 4th characters.  However, a quick test seems to indicate that
either that requirement is off by one or my guess is wrong:

% idn --quiet foó--bar
xn--fo--bar-m0a
% idn --quiet xnó--bar
xn--xn--bar-m0a
% idn --quiet xñ--bar 
xn--x--bar-wwa
% 
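
The test above can probably be reproduced without the idn tool, using
the "punycode" codec that ships with Python.  This is a rough sketch
under a couple of assumptions: the to_ace helper is a name invented
here, and unlike idn(1) it does no Nameprep or other normalization, so
the input is assumed to be already normalized and lower-case.

```python
ACE_PREFIX = "xn--"


def punycode(label: str) -> str:
    # Raw Punycode (RFC 3492) of a single label, without any prefix.
    return label.encode("punycode").decode("ascii")


def to_ace(label: str) -> str:
    # Hypothetical helper: prepend the ACE prefix to the Punycode form.
    # No Nameprep/normalization is applied (assumption: already done).
    return ACE_PREFIX + punycode(label)


for label in ("fo\u00f3--bar", "xn\u00f3--bar", "x\u00f1--bar"):
    encoded = punycode(label)
    # Report whether the Punycode part itself begins with the ACE prefix.
    print(to_ace(label), encoded.startswith(ACE_PREFIX))
```

The second label is the interesting one: its hyphens are the 4th and 5th
characters, yet its Punycode part ("xn--bar-m0a") begins with "xn--".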

Nico
-- 

