Status of IDNABIS Working Group

JFC Morfin jefsey at jefsey.com
Tue Feb 17 15:05:19 CET 2009


At 06:42 17/02/2009, YAO Jiankang wrote:
>
>
>Now, the definitions of A-LABELS, U-LABELS, NR-LDH LABELS,
>LDH-Labels and R-LDH-labels are very clear to me.

Yep. However, for clarity's sake, I would advise against mixing
description and validity considerations.
It would be better to describe the "geography" of the names' syntax first,
and then discuss their usage within the current IDNA context.
jfc


>
>----- Original Message -----
>From: Vint Cerf <vint at google.com>
>To: idna-update at alvestrand.no
>Sent: Tuesday, February 17, 2009 5:00 AM
>Subject: Status of IDNABIS Working Group
>NB: THIS TEXT MUST BE READ WITH A FIXED WIDTH
>COURIER FONT FOR THE ILLUSTRATIONS TO LINE UP PROPERLY:
>A fair amount of work is underway to improve the clarity
>of the Definitions and Rationale documents and to revise
>the others as needed to take into account proposed new
>terminology. The intent is to have as much of this work
>as possible available for WG review in time for the March
>IETF in San Francisco. Two sessions have been reserved
>during the week: one on Monday, March 23 and one on
>Tuesday, March 24.
>At that meeting we will also want to take up a comparison
>of the documents that reflect the work outlined in the
>charter and the recent proposal made by Paul Hoffman for
>an alternative to that approach.
>The revision work takes up the following tersely rendered
>set of definitions (it will be best to read the revised
>Definitions document when released for a more complete
>picture).
>The text below is intended to convey the flavor of the
>attempt to clarify definitions but is not the entire
>text that is in preparation.
>2.3.  Terminology Specific to IDNA
>    This section defines some terminology to reduce
>dependence on terms and definitions that have been
>problematic in the past.
>An LDH-Label is a string consisting solely of ASCII
>upper and/or lower case letters, digits 0-9 and the hyphen
>("-"). These labels are limited to 63 characters and do
>not include a hyphen at either the beginning or end of
>the string. Some people might call this a "traditional
>host name" label.
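
A minimal Python sketch of the LDH-Label rule quoted just above, for
illustration only. The function name is_ldh_label is hypothetical and
does not come from any of the WG documents; the check covers only the
rules stated in the text (ASCII letters, digits, hyphen, at most 63
characters, no hyphen at either end).

```python
import re

# ASCII letters, digits and hyphen only; between 1 and 63 characters.
_LDH_CHARS = re.compile(r"[A-Za-z0-9-]{1,63}")


def is_ldh_label(label: str) -> bool:
    """Return True if `label` fits the LDH-Label description quoted above."""
    if _LDH_CHARS.fullmatch(label) is None:
        return False
    # A hyphen may not appear in the first or last position.
    return not (label.startswith("-") or label.endswith("-"))
```

With this sketch, "example" is accepted, while "-abcd", "xyz-" and
"_tcp" are rejected.
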
>A new subset of LDH-Labels is defined whose members all
>have ASCII hyphens in both the third and fourth character
>positions from the beginning of the label. Roughly, in
>left-to-right form this would read "??--" where "??" is
>drawn from the traditional LDH set of characters, except
>that the first "?" cannot be a hyphen, by definition of
>LDH-label, nor can the last character of the label be a
>hyphen. This subset of LDH-labels is named R-LDH-labels,
>for "reserved LDH-labels".
>Labels that are NOT members of the R-LDH-label category are
>called the Non-Reserved-Labels or NR-LDH-Labels and they
>make up the remainder of the LDH-label universe.
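
The split between R-LDH-labels and NR-LDH-labels described above can be
sketched as follows. This is only an illustration under the definitions
quoted here; the function name ldh_category and the returned strings
are hypothetical.

```python
import re

# LDH-Label: letters, digits, hyphen; 1 to 63 characters;
# no hyphen in the first or last position.
_LDH = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def ldh_category(label: str) -> str:
    """Return 'R-LDH', 'NR-LDH' or 'NON-LDH' for an ASCII label."""
    if _LDH.fullmatch(label) is None:
        return "NON-LDH"
    # Reserved labels have hyphens in the third AND fourth positions ("??--").
    if label[2:4] == "--":
        return "R-LDH"
    return "NR-LDH"
```

Under these assumptions "ab--cd" and "xn--bcher-kva" both fall in the
R-LDH category, "example" is NR-LDH, and "_tcp" is not an LDH-label at
all.
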
>This distinction among possible LDH labels only has
>significance for software that is "IDNA-aware". Otherwise,
>all LDH-labels meeting the definition above are accepted as
>valid by non-IDNA aware software.
>As it happens, only a subset of the R-LDH-labels can
>potentially be used in IDN-aware applications, specifically
>the class of labels that begin with the prefix ("xn--")
>[what about "XN--"?].
>This class we call "XN-labels". Only a subset of this
>class, which we will call "A-labels", is valid for use
>in IDNA-aware applications: namely, those labels whose
>remainder is valid Punycode output, limited to 59 characters
>in addition to the "xn--" prefix, and which can be converted
>into valid Unicode characters by a reverse algorithm
>(cf. RFC 3492). Valid Unicode characters are defined by
>conformance to the Protocol, Tables and BiDi documents
>that identify which Unicode characters can be used in
>IDNA2008-aware applications.
>There is also a class of labels that are prefixed with "xn--"
>but whose remaining characters cannot be converted into
>valid Unicode, or cannot be produced using the Punycode
>encoding algorithm or that otherwise do not meet the A-label
>criteria. These we will refer to as Invalid-A-labels
>[or something like that].
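
A rough sketch of the A-label / Invalid-A-label test described above,
using the "punycode" codec from the Python standard library. The
function name classify_xn_label is hypothetical. Two things are
deliberately NOT covered: the check that the decoded characters are
permitted by the Protocol, Tables and BiDi documents, and the open
"XN--" upper-case question, which the sketch sidesteps by accepting the
lower-case prefix only. The input is assumed to be an LDH-label already.

```python
ACE_PREFIX = "xn--"  # lower case only; the "XN--" question is left open
MAX_PAYLOAD = 59     # limit stated in the quoted text


def classify_xn_label(label: str) -> str:
    """Return 'A-label', 'Invalid-A-label' or 'not-XN' for an LDH-label."""
    if not label.startswith(ACE_PREFIX):
        return "not-XN"
    payload = label[len(ACE_PREFIX):]
    if not 1 <= len(payload) <= MAX_PAYLOAD:
        return "Invalid-A-label"
    try:
        # Reverse algorithm (RFC 3492), then re-encode the result.
        unicode_form = payload.encode("ascii").decode("punycode")
        re_encoded = unicode_form.encode("punycode").decode("ascii")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return "Invalid-A-label"
    # Only strings that the encoder itself would produce are accepted.
    return "A-label" if re_encoded == payload else "Invalid-A-label"
```

A label passing this test is only a candidate A-label: the Unicode
validity rules of IDNA2008 would still have to be applied to the
decoded string.
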
>The R-LDH-labels that are neither A-labels nor
>invalid-A-labels are reserved and not permitted to be
>used in IDNA2008-aware applications.
>Labels that satisfy the LDH-Label criteria but that are
>not Reserved-LDH Labels are called Non-Reserved LDH labels
>or NR-LDH-labels.
>
>FOR IDNA2008-AWARE SYSTEMS, VALID LABELS INCLUDE:
>A-LABELS, U-LABELS AND NR-LDH LABELS.
>IDNA-LABELS COME IN TWO FLAVORS: AN ACE-ENCODED FORM AND A UNICODE FORM.
>THESE ARE REFERRED TO AS A-LABELS AND U-LABELS RESPECTIVELY.
>
>                                ASCII-LABEL
>----------------------------------------------------------------
>|                                                              |
>|                 LDH-LABEL (1) (4)                            |
>|          ___________________________________________________ |
>|         |                                                  | |
>|         |                                                  | |
>|         |  __________________________________              | |
>|         |  |IDN Reserved LDH Labels          |             | |
>|         |  | ("??--")   or R-LDH LABELS      |             | |
>|         |  |                                 | NONRESERVED | |
>|         |  | ------------------------------- |  LDH LABELS | |
>|         |  | |       XN LABELS             | |             | |
>|         |  | | _____________   ___________ | |             | |
>|         |  | | |           |   |          || |NR-LDH LABELS| |
>|         |  | | | A-labels  |   | Invalid  || |             | |
>|         |  | | | "xn--"(2) |   | A-labels || |             | |
>|         |  | | |___________|   |____(3)___|| |             | |
>|         |  | |_____________________________| |             | |
>|         |  |_________________________________|             | |
>|         |__________________________________________________| |
>|                                                              |
>|                                                              |
>|            NON-LDH-LABEL                                     |
>|         _______________________________________________      |
>|         |                                             |      |
>|         |         ________________________            |      |
>|         |         | SRV & SRV-LIKE       |            |      |
>|         |         | e.g. _tcp            |            |      |
>|         |         |______________________|            |      |
>|         |         ________________________            |      |
>|         |         | leading or trailing  |            |      |
>|         |         | hyphens "-abcd"      |            |      |
>|         |         | or "xyz-" or "-uvw-" |            |      |
>|         |         |______________________|            |      |
>|         |         ________________________            |      |
>|         |         | Other non-LDH        |            |      |
>|         |         | ASCII Chars          |            |      |
>|         |         | e.g. #$%&_           |            |      |
>|         |         |______________________|            |      |
>|         |_____________________________________________|      |
>|______________________________________________________________|
>
>           (1) ASCII letters (upper and lower case), digits,
>              hyphen.  Hyphen may not appear in first or last
>              position. Less than 64 characters.
>           (2) Note that the string following "xn--" must
>              be the valid output of the Punycode algorithm
>              and must be convertible into valid U-label form.
>           (3) Note that an Invalid-A-Label has a prefix "xn--"
>              but the remainder of the label is NOT the valid
>              output of the Punycode algorithm.
>           (4) LDH-LABEL subtypes are indistinguishable to IDNA-unaware
>                 applications.
>
>
>                      __________________________
>                      |  Non-ASCII             |
>                      |                        |
>                      |    ___________________ |
>                      |    | U-label (5)     | |
>                      |    |_________________| |
>                      |    |                 | |
>                      |    |  Binary Label   | |
>                      |    | (including      | |
>                      |    |  high bit on)   | |
>                      |    |_________________| |
>                      |    |                 | |
>                      |    | Bit String      | |
>                      |    |   Label         | |
>                      |    |_________________| |
>                      |________________________|
>          (5) To IDNA-unaware applications, U-labels are
>                 indistinguishable from Binary ones.
>              Figure 1: IDNA and Related DNS Terminology Space
>==================
>
>As I have understood the WG charter, the intention has been
>to devise a means to avoid specific dependence of the
>specifications on any particular instance of the Unicode
>character set. The general posture of the IDNA2008 document
>set has also attempted to maintain a one-to-one relationship
>between labels produced by the Punycode encoding algorithm and
>the associated Unicode string. In brief terms, the A-Labels and
>U-Labels of the IDNA2008 can be mapped back and forth without
>any loss or change in the respective A-label or U-label strings.
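
The one-to-one relationship described in the paragraph above can be
illustrated with the Python standard library "punycode" codec. The
helper names to_a_label and to_u_label are hypothetical, and the sketch
performs no mapping and no IDNA2008 validity checks: it only shows the
Punycode step and the "xn--" prefix.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Encode a Unicode label with Punycode and add the ACE prefix."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Strip the ACE prefix and apply the reverse Punycode algorithm."""
    if not a_label.startswith(ACE_PREFIX):
        raise ValueError("label does not start with the ACE prefix")
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


# "b\u00fccher" is "bucher" with u-umlaut as its second letter.
a_form = to_a_label("b\u00fccher")   # "xn--bcher-kva"
u_form = to_u_label(a_form)          # back to the original string
```

Because nothing is mapped or folded on the way, converting in either
direction and back returns the starting string unchanged.
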
>
>Document editors are working to incorporate these new definitions
>and the sense of exchanges on the mailing list.
>As of this writing, it is my understanding that the Eszett and
>Final Sigma characters are to be treated as protocol-valid and
>that registries (in the most general sense of the word) are
>prepared to deal with the side-effects of prior registrations
>following the IDNA2003 guidelines.
>The current version of Tables rules out the use of Hangul Jamos
>per the recommendation of Korean language experts.
>There remains further discussion and resolution of the use
>of Indic digits particularly in connection with the BiDi
>specifications.
>There has also been some discussion about mapping on the list.
>The "going-in" assumption has been that the IDNA2008
>specifications do not consider formalizing mappings. Some
>mappings may occur for local reasons prior to look up or
>registration of labels in domain names. Concern has been
>raised that if mappings are not standardized and uniform
>some surprises may ensue.
>We may need to discuss whether some form of standardized
>mapping is needed, possibly to maintain least surprise
>for users accustomed to the behavior of non-IDNA
>domain names (e.g. upper/lower case equivalence
>for lookup purposes).
>
>However this discussion ends up, there appears to be some
>consensus that the registration process should not, in and
>of itself, involve mapping. That is: only valid U-labels
>or A-labels should be presented to the DNS system for entry
>into the DNS zone files.
>An assumption is made in the present specifications that
>any registered domain label derived from non-ASCII Unicode
>characters will be one-to-one convertible to A-label form
>from the Unicode form (U-label form) and vice-versa.
>
>I believe we will have on the agenda several items:
>1. Review of the then-current status of the WG documents
>    and any then-known unresolved questions or issues
>2. Consideration of Paul Hoffman's alternative proposal
>    to extend IDNA2003
>3. Discussion of the role of mapping from the IDNA2008
>    perspective.
>
>I will prepare a more precise agenda along with issues to
>be discussed and resolved as the time approaches for our
>meeting in March.
>Vint
>Vint Cerf
>Google
>1818 Library Street, Suite 400
>Reston, VA 20190
>202-370-5637
>vint at google.com
>
>----------
>_______________________________________________
>Idna-update mailing list
>Idna-update at alvestrand.no
>http://www.alvestrand.no/mailman/listinfo/idna-update