Status of IDNABIS Working Group
JFC Morfin
jefsey at jefsey.com
Tue Feb 17 15:05:19 CET 2009
At 06:42 17/02/2009, YAO Jiankang wrote:
>
>
>Now, the definitions of A-LABELS, U-LABELS and NR-LDH LABELS,
>LDH-Labels, and R-LDH-labels are very clear to me.
Yes. However, for clarity's sake, I would advise against mixing
description and validity considerations.
Would it not be better to first describe the "geography" of the names'
syntax, and then discuss their usage within the current IDNA context?
jfc
>
>----- Original Message -----
>From: Vint Cerf <vint at google.com>
>To: idna-update at alvestrand.no
>Sent: Tuesday, February 17, 2009 5:00 AM
>Subject: Status of IDNABIS Working Group
>NB: THIS TEXT MUST BE READ WITH A FIXED WIDTH
>COURIER FONT FOR THE ILLUSTRATIONS TO LINE UP PROPERLY:
>A fair amount of work is underway to improve the clarity
>of the Definitions and Rationale documents and to revise
>the others as needed to take into account proposed new
>terminology. The intent is to have as much of this work
>as possible available for WG review in time for the March
>IETF in San Francisco. Two sessions have been reserved
>during the week: one on Monday, March 23 and one on
>Tuesday, March 24.
>At that meeting we will also want to take up a comparison
>of the documents that reflect the work outlined in the
>charter and the recent proposal made by Paul Hoffman for
>an alternative to that approach.
>The revision work takes up the following tersely rendered
>set of definitions (it will be best to read the revised
>Definitions document when released for a more complete
>picture).
>The text below is intended to convey the flavor of the
>attempt to clarify definitions but is not the entire
>text that is in preparation.
>2.3. Terminology Specific to IDNA
> This section defines some terminology to reduce
>dependence on terms and definitions that have been
>problematic in the past.
>An LDH-Label is a string consisting solely of ASCII
>upper and/or lower case letters, digits 0-9 and the hyphen
>("-"). These labels are limited to 63 characters and do
>not include a hyphen at either the beginning or end of
>the string. Some people might call this a "traditional
>host name" label.
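The LDH-Label rule above can be sketched as a small check. This is only
an illustration in Python: the name is_ldh_label and the regular
expression are my own, not taken from any of the WG documents.

```python
import re

# Letters, digits and hyphen only; 1 to 63 characters; no hyphen in the
# first or last position. (Illustrative pattern, not normative text.)
_LDH_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if `label` meets the LDH-Label definition quoted above."""
    return _LDH_RE.fullmatch(label) is not None


if __name__ == "__main__":
    for candidate in ("example", "xyz-", "-abcd", "_tcp"):
        print(candidate, is_ldh_label(candidate))
```

The 63-character limit comes from the pattern itself: one leading
character, up to 61 in the middle, and one trailing character.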
>A new subset of LDH-Labels is defined, all of which have
>ASCII hyphens in both the third and fourth character
>positions from the beginning of the label. Roughly, in
>left-to-right form this would read "??--", where each "?"
>is drawn from the traditional LDH set of characters, except
>that the first "?" cannot be a hyphen (by definition of
>LDH-label), nor can the last character of the label be a
>hyphen. This subset of LDH-labels is named R-LDH-labels,
>for "reserved LDH-Labels".
>LDH-labels that are NOT members of the R-LDH-label category
>are called Non-Reserved LDH-labels or NR-LDH-labels, and they
>make up the remainder of the LDH-label universe.
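A rough sketch of the R-LDH / NR-LDH split described above, again in
Python. The function name ldh_category and the three return strings are
hypothetical labels of mine; only the "??--" test itself comes from the
quoted text.

```python
import re

# Same illustrative LDH pattern as before: letters, digits, hyphen,
# 1 to 63 characters, no leading or trailing hyphen.
_LDH_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def ldh_category(label: str) -> str:
    """Return "R-LDH", "NR-LDH" or "NON-LDH" for an ASCII label."""
    if _LDH_RE.fullmatch(label) is None:
        return "NON-LDH"
    # Reserved labels have hyphens in the third and fourth positions.
    if label[2:4] == "--":
        return "R-LDH"
    return "NR-LDH"


if __name__ == "__main__":
    for candidate in ("xn--bcher-kva", "ab--cd", "example", "_tcp"):
        print(candidate, ldh_category(candidate))
```

Note that a label such as "ab--" never reaches the "??--" test, because
its trailing hyphen already takes it out of the LDH set.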
>This distinction among possible LDH-labels only has
>significance for software that is "IDNA-aware". Otherwise,
>all LDH-labels meeting the definition above are accepted as
>valid by non-IDNA-aware software.
>As it happens, only a subset of the R-LDH-labels can
>potentially be used in IDNA-aware applications, specifically
>the class of labels that begin with the prefix "xn--"
>[what about "XN--"?].
>We call this class "XN-labels". Only a subset of the
>XN-labels, which we will call "A-labels", is valid
>for use in IDNA-aware applications: namely those whose
>remainder after the "xn--" prefix is valid Punycode output
>of at most 59 characters that can be converted
>into valid Unicode characters by the reverse algorithm
>(cf. RFC 3492). Valid Unicode characters are defined by
>conformance to the Protocol, Tables and BiDi documents
>that identify which Unicode characters can be used in
>IDNA2008-aware applications.
>There is also a class of labels that are prefixed with "xn--"
>but whose remaining characters cannot be converted into
>valid Unicode, or cannot be produced using the Punycode
>encoding algorithm or that otherwise do not meet the A-label
>criteria. These we will refer to as Invalid-A-labels
>[or something like that].
>The R-LDH-labels that are neither A-labels nor
>invalid-A-labels are reserved and not permitted to be
>used in IDNA2008-aware applications.
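To make the XN-label / A-label / Invalid-A-label split concrete, here is
a minimal Python sketch using the standard library's "punycode" codec.
It checks only the Punycode round trip; it does NOT apply the Protocol,
Tables or BiDi rules, so a label it accepts is at best an A-label
candidate. The function name, the return strings, and the decision to
treat "XN--" like "xn--" are all assumptions of mine.

```python
import re

_LDH_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify_xn_label(label: str) -> str:
    """Return "not-XN", "A-label-candidate" or "invalid-A-label"."""
    lowered = label.lower()  # assumption: "XN--" is handled like "xn--"
    if _LDH_RE.fullmatch(lowered) is None or not lowered.startswith("xn--"):
        return "not-XN"
    # The 63-character LDH limit leaves at most 59 characters here.
    payload = lowered[4:]
    try:
        decoded = payload.encode("ascii").decode("punycode")
        reencoded = decoded.encode("punycode").decode("ascii")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return "invalid-A-label"
    # Require non-ASCII content and an exact round trip back to the payload.
    if not decoded or decoded.isascii() or reencoded != payload:
        return "invalid-A-label"
    return "A-label-candidate"


if __name__ == "__main__":
    for candidate in ("xn--bcher-kva", "xn--z", "example"):
        print(candidate, classify_xn_label(candidate))
```

Here "xn--bcher-kva" decodes to "bücher" and encodes back unchanged,
while "xn--z" is rejected because "z" is an incomplete Punycode string.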
>Labels that satisfy the LDH-Label criteria but that are
>not Reserved-LDH Labels are called Non-Reserved LDH labels
>or NR-LDH-labels.
>
>FOR IDNA2008-AWARE SYSTEMS, VALID LABELS INCLUDE:
>A-LABELS, U-LABELS AND NR-LDH LABELS.
>IDNA-LABELS COME IN TWO FLAVORS: AN ACE-ENCODED FORM AND A UNICODE FORM.
>THESE ARE REFERRED TO AS A-LABELS AND U-LABELS RESPECTIVELY.
>
>                          ASCII-LABEL
>----------------------------------------------------------------
>|                                                              |
>|  LDH-LABEL (1) (4)                                           |
>|   ____________________________________________________       |
>|  |                                                    |      |
>|  |   _________________________________                |      |
>|  |  | IDN Reserved LDH Labels         |               |      |
>|  |  | ("??--") or R-LDH LABELS        |               |      |
>|  |  |  _____________________________  |  NONRESERVED  |      |
>|  |  | | XN LABELS                   | |  LDH LABELS   |      |
>|  |  | |  ___________    __________  | |               |      |
>|  |  | | | A-labels  |  | Invalid  | | | NR-LDH LABELS |      |
>|  |  | | | "xn--"(2) |  | A-labels | | |               |      |
>|  |  | | |___________|  |____(3)___| | |               |      |
>|  |  | |_____________________________| |               |      |
>|  |  |_________________________________|               |      |
>|  |____________________________________________________|      |
>|                                                              |
>|  NON-LDH-LABEL                                               |
>|   ____________________________________________________       |
>|  |                                                    |      |
>|  |     ________________________                       |      |
>|  |    | SRV & SRV-LIKE         |                      |      |
>|  |    | e.g. _tcp              |                      |      |
>|  |    |________________________|                      |      |
>|  |     ________________________                       |      |
>|  |    | leading or trailing    |                      |      |
>|  |    | hyphens "-abcd"        |                      |      |
>|  |    | or "xyz-" or "-uvw-"   |                      |      |
>|  |    |________________________|                      |      |
>|  |     ________________________                       |      |
>|  |    | Other non-LDH          |                      |      |
>|  |    | ASCII Chars            |                      |      |
>|  |    | e.g. #$%&_             |                      |      |
>|  |    |________________________|                      |      |
>|  |____________________________________________________|      |
>|______________________________________________________________|
>
>   (1) ASCII letters (upper and lower case), digits,
>       hyphen. Hyphen may not appear in first or last
>       position. Less than 64 characters.
>   (2) Note that the string following "xn--" must
>       be the valid output of the Punycode algorithm
>       and must be convertible into valid U-label form.
>   (3) Note that an Invalid-A-Label has a prefix "xn--"
>       but the remainder of the label is NOT the valid
>       output of the Punycode algorithm.
>   (4) LDH-LABEL subtypes are indistinguishable to IDNA-unaware
>       applications.
>
>                    ________________________
>                   |       Non-ASCII        |
>                   |                        |
>                   |   _________________    |
>                   |  |   U-label (5)   |   |
>                   |  |_________________|   |
>                   |  |                 |   |
>                   |  |  Binary Label   |   |
>                   |  |   (including    |   |
>                   |  |  high bit on)   |   |
>                   |  |_________________|   |
>                   |  |                 |   |
>                   |  |   Bit String    |   |
>                   |  |     Label       |   |
>                   |  |_________________|   |
>                   |________________________|
>
>   (5) To IDNA-unaware applications, U-labels are
>       indistinguishable from Binary ones.
>
>        Figure 1: IDNA and Related DNS Terminology Space
>==================
>
>As I have understood the WG charter, the intention has been
>to devise a means to avoid specific dependence of the
>specifications on any particular instance of the Unicode
>character set. The general posture of the IDNA2008 document
>set has also attempted to maintain a one-to-one relationship
>between labels produced by the Punycode encoding algorithm and
>the associated Unicode string. In brief terms, the A-labels and
>U-labels of IDNA2008 can be mapped back and forth without
>any loss or change in the respective A-label or U-label strings.
>
>Document editors are working to incorporate these new definitions
>and the sense of exchanges on the mailing list.
>As of this writing, it is my understanding that the Esszet and
>Final Sigma characters are to be treated as protocol-valid and
>that registries (in the most general sense of the word) are
>prepared to deal with the side-effects of prior registrations
>following the IDNA2003 guidelines.
>The current version of Tables rules out the use of Hangul Jamos
>per the recommendation of Korean language experts.
>There remains further discussion and resolution of the use
>of Indic digits particularly in connection with the BiDi
>specifications.
>There has also been some discussion about mapping on the list.
>The "going-in" assumption has been that the IDNA2008
>specifications do not consider formalizing mappings. Some
>mappings may occur for local reasons prior to look up or
>registration of labels in domain names. Concern has been
>raised that if mappings are not standardized and uniform
>some surprises may ensue.
>We may need to discuss whether some form of standardized
>mapping is needed, possibly to maintain least surprise
>for users accustomed to the behavior of non-IDNA
>domain names (e.g. upper/lower case equivalence
>for lookup purposes).
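One way to see the mapping concern is to encode the same word with and
without a local mapping applied first. The Python sketch below uses
lowercasing as the example mapping; the helper name ace is hypothetical,
and it performs no IDNA validity checks (an upper-case letter would not
be acceptable in a U-label in the first place).

```python
def ace(label: str) -> str:
    """Punycode-encode `label` and prepend "xn--" (no validity checks)."""
    return "xn--" + label.encode("punycode").decode("ascii")


# The same word as a user might type it, without and with a local
# mapping (here simply lowercasing) applied before encoding.
unmapped = ace("Bücher")
mapped = ace("Bücher".lower())

if __name__ == "__main__":
    print(unmapped)
    print(mapped)
    print(unmapped == mapped)
```

The two encoded strings differ, so whether a lookup succeeds depends on
whether, and how, the application mapped the input beforehand.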
>
>However this discussion ends up, there appears to be some
>consensus that the registration process should not, in and
>of itself, involve mapping. That is: only valid U-labels
>or A-labels should be presented to the DNS system for entry
>into the DNS zone files.
>An assumption is made in the present specifications that
>any registered domain label derived from non-ASCII Unicode
>characters will be one-to-one convertible to A-label form
>from the Unicode form (U-label form) and vice-versa.
>
>I believe we will have on the agenda several items:
>1. Review of the then-current status of the WG documents
>   and any then-known unresolved questions or issues
>2. Consideration of Paul Hoffman's alternative proposal
>   to extend IDNA2003
>3. Discussion of the role of mapping from the IDNA2008
>   perspective.
>
>I will prepare a more precise agenda along with issues to
>be discussed and resolved as the time approaches for our
>meeting in March.
>Vint
>Vint Cerf
>Google
>1818 Library Street, Suite 400
>Reston, VA 20190
>202-370-5637
>vint at google.com
>----------
>_______________________________________________
>Idna-update mailing list
>Idna-update at alvestrand.no
>http://www.alvestrand.no/mailman/listinfo/idna-update