Comments on draft-ietf-idnabis-mappings-00

Wed May 27 16:45:21 CEST 2009

Hi Pete

I would like to propose the following wordsmithing:

1.  Introduction

   This document specifies the operations that applications apply to
   user input in order to get it into a form acceptable by the
   Internationalized Domain Names in Applications (IDNA) protocol
   [I-D.ietf-idnabis-protocol].

Change to:

1.	This document specifies the operations that should be applied to
user input in order to generate a form that is acceptable to the
Internationalized Domain Names in Applications (IDNA) protocol.

   The document describes the architectural principles that underly this
   function in section 2,
   describes a general procedure that an application SHOULD implement in
   section 3, and specifies an algorithm and mapping that an application
   MAY implement in order to remain reasonably backward compatible with
   the original version of the IDNA protocol in appendix A.

Propose change to:

This document describes the underlying architectural principles (Section 2)
and the general implementation procedure (Section 3) as well as an algorithm
with mappings in order to facilitate backwards compatibility (Appendix A).

   It should be noted that this document is NOT specifying the behavior
   of a protocol that appears "on the wire".  It specifies an operation
   that is to be applied to user input in order to prepare that user
   input for use in an "on the network" protocol.  As unusual as this
   may be for an IETF protocol document, it is a necessary operation to
   maintain interoperability.

I don’t like the use of “on the wire” in this paragraph, but I cannot suggest
a sensible alternative because I am not sure what you mean by it.

2.  Architectural Principles

   An application that implements the IDNA protocol
   [I-D.ietf-idnabis-protocol] must take a set of user input and convert
   that input to a set of Unicode code points.  That user input might be
   typed on a keyboard, written by hand onto some sort of digitizer,
   spoken into a microphone and interpreted by a speech-to-text engine,
   or otherwise.

Propose change to:

An application that implements the IDNA protocol [I-D etc] MUST (or did you
really mean must?) take any user input set (use string here instead of set?)
and convert it to a set of Unicode code points.  The user input may be
acquired by any of several input methods, each with its own conversion
process to be taken into consideration, e.g. keyboard input, digitized
handwriting, or speech converted into text by a speech-to-text engine.

2. (cont.)

   The process of taking any particular user input and
   mapping it into a Unicode code point may be a simple one: If a user
   strikes the "A" key on a US English keyboard, without any modifiers
   such as the "Shift" key held down, in order to draw a Latin small
   letter A ("a"), many (perhaps most) modern operating system input
   methods will produce to the calling application the code point
   U+0061, encoded in a single octet.  Sometimes the process is somewhat
   more complicated: A user might strike a particular set of keys to
   represent a combining macron followed by striking the "A" key in
   order to draw a Latin small letter A with a macron above it.
   Depending on the operating system, the input method chosen by the
   user, and even the parameters with which the application communicates
   with the input method, the result might be the code point U+0101
   (encoded as two octets in UTF-8 or UTF-16, four octets in UTF-32,
   etc.), the code point U+0061 followed by the code point U+0304 (again,
   encoded in three or more octets, depending upon the encoding
   used) or even the code point U+FF41 followed by the code point U+0304
   (and encoded in some form).  And these examples leave aside the issue
   of operating systems and input methods that do not use Unicode code
   points for their character set.  In every case, applications (with
   the help of the operating systems on which they run and the input
   methods used) MUST perform a mapping from user input into Unicode
   code points.

I think this is rather long-winded and confusing, and would propose the
following:

Processes need to take into consideration that there may be several ways
in which user input can be converted into corresponding Unicode code points,
and that this can depend on several contributing factors, e.g. operating
system and input method.  For example, a user might strike a particular set
of keys to represent a combining macron followed by striking the "A" key in
order to draw a Latin small letter A with a macron above it.  The result
might be the code point U+0101 (encoded as two octets in UTF-8 or UTF-16,
four octets in UTF-32, etc.), the code point U+0061 followed by the code
point U+0304 (again, encoded in three or more octets, depending upon the
encoding used) or even the code point U+FF41 followed by the code point
U+0304.  In designing implementation processes, consideration should also be
given to systems that do not use Unicode code points for their character
set.
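
To make the example concrete, here is a minimal Python sketch (my own
illustration, not text from the draft).  It encodes the three possible
results named above and counts the octets, then shows what Unicode
normalization does to each.  The use of NFC and NFKC here is an assumption
for illustration only; the helper names octet_counts and code_points are
hypothetical, and ISO 8859-4 simply stands in for "a system that does not
use Unicode".

```python
import unicodedata

# The three input-method results discussed in the paragraph above.
candidates = {
    "U+0101": "\u0101",                  # precomposed a-with-macron
    "U+0061 U+0304": "\u0061\u0304",     # "a" + combining macron
    "U+FF41 U+0304": "\uff41\u0304",     # fullwidth "a" + combining macron
}


def octet_counts(text):
    """Encoded length of text in UTF-8, UTF-16 and UTF-32 (no byte order mark)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}


def code_points(text):
    """Render a string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


for label, text in candidates.items():
    # NFC applies canonical composition only; NFKC also folds
    # compatibility forms such as the fullwidth letter.
    nfc = unicodedata.normalize("NFC", text)
    nfkc = unicodedata.normalize("NFKC", text)
    print(label, octet_counts(text), "NFC:", code_points(nfc), "NFKC:", code_points(nfkc))

# A non-Unicode source (assumption: ISO 8859-4 as the legacy character set).
# The application still has to end up with Unicode code points.
legacy_bytes = "\u0101".encode("iso8859_4")
decoded = legacy_bytes.decode("iso8859_4")
print("legacy input decodes to", code_points(decoded))
```

The point of the sketch is that three different sequences can represent
what the user considers the same letter, which is why the application needs
a defined mapping step before handing the string to the protocol.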

Please correct the following typos:

Last paragraph section 2.

   In the next section, this document specifies a general algorithm that
   applications SHOULD implement in order to produce Unicode code points
   that will be valid under the IDNA protocol.  Then, in appendix A, a
   full mapping is specified that is substantially compatible with the
   original IDNA protocol.  An application MAY implement the full
   mapping or MAY choose a different mapping.

Last paragraph Section 3

   These are the minimal mappings that an application SHOULD do.  Of
   course, there are many others that MAY be done.  In particular, a
   mapping that is (not in) substantially compatible with [RFC3490] appears
   below in appendix A.

I have run out of time today to comment further but if you would like me to
go through the document paragraph by paragraph let me know (it is easy doing
revisions – the difficulty is in writing the first draft ;-)  Well done!)
Feel free to accept or reject proposals.

Best regards

Debbie

Debbie Garside 
Managing Director 

ICT Marketing Ltd.
Corner House
Barn Street
Haverfordwest
Pembrokeshire SA61 2RD 
Tel: +44 (0)1437 766441 Fax: +44 (0)1437 766173
Web: http://www.ictmarketing.co.uk

