Comments on draft-ietf-idnabis-defs-10

Sat Aug 29 09:17:42 CEST 2009

--On Monday, August 24, 2009 23:24 -0400 Andrew Sullivan
<ajs at shinkuro.com> wrote:

> Dear colleagues,
> 
> As part of an effort to respond to the current WGLC of multiple
> documents, I have read draft-ietf-idnabis-defs-10.  I am
> grateful to the Chair for extending the deadline, and
> apologetic to the editors that they've been made to wait.  I
> hope to be able to offer any useful comments I might have
> (assuming there are such) before the new deadline.  
> 
> In a previous comment (see
> http://www.alvestrand.no/pipermail/idna-update/2009-July/004970.html),
> I made a vague remark about something I find worrisome in this
> text in §2.3.2.1:
>...

These changes, with Paul's suggested modifications, have been
tentatively accepted and incorporated in the document.  Anyone
who objects should say so quickly.

>...
>   o  A "U-label" is an IDNA-valid string of Unicode characters,
>      in normalization form NFC and including at least one
>      non-ASCII character, expressed in a standard Unicode
>      Encoding Form (in an Internet transmission context this
>      will normally be UTF-8).
> 
> The parenthetical remark, I think, encourages implementers not
> to recognise as U-labels strings that come in as (say) UTF-32,
> but that are otherwise perfectly valid.  Who cares what is
> normal in an Internet transmission context, when we're
> defining terms?  Why does that matter?

The parenthetical was there because nothing in IDNA (either 2003
or 2008) requires that UTF-8 be used; many applications on
particular operating systems actually use something else (UTF-16
being the most common).  But I have dropped the additional text:
it now just says "(such as UTF-8)", as Paul suggested.  Again,
anyone who doesn't like this should speak up.
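
To illustrate why the encoding form should not matter to the
definition, here is a minimal Python sketch.  It is not from the
draft: the function name and the sample label are hypothetical,
and it checks only two of the stated conditions (NFC, and at
least one non-ASCII character).  It does not test IDNA validity.

```python
import unicodedata


def passes_partial_u_label_checks(label: str) -> bool:
    """Hypothetical helper: checks NFC form and presence of a
    non-ASCII character only. IDNA validity is NOT checked."""
    is_nfc = unicodedata.normalize("NFC", label) == label
    has_non_ascii = any(ord(ch) > 0x7F for ch in label)
    return is_nfc and has_non_ascii


# The same label carried in three Unicode encoding forms.
sample = "b\u00fccher"  # illustrative label, precomposed u-umlaut
encodings = ("utf-8", "utf-16", "utf-32")
decoded = [sample.encode(enc).decode(enc) for enc in encodings]

# Once decoded, the code point sequence is identical, so the
# checks cannot tell which encoding form was on the wire.
results = [passes_partial_u_label_checks(s) for s in decoded]
```

Under these assumptions, a string arriving as UTF-16 or UTF-32
passes or fails exactly as its UTF-8 counterpart does, because
the checks are applied to code points rather than to bytes.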

>...

   john