Tuning of Defs, Protocol, and Rationale

John C Klensin klensin at jck.com
Fri Nov 14 18:51:47 CET 2008


Hi.

During the two weeks since the current versions of
"Definitions", "Protocol", and "Rationale" were posted, I've
received and integrated a number of suggestions about changes
that seemed either editorial or obvious.

In order that everyone be looking at the same documents at the
meeting, I don't intend to try to post new versions early
Monday; these changes will show up after the WG meetings with
whatever else is incorporated as a result of meeting discussions.

However, if anyone wants to check on whether their particular
suggestions have been incorporated, or wants an advance look at
whether I've messed something up, diffs between the posted
documents and the working versions are attached.  

Again, I do not consider any of these changes to be terribly
significant, even though a few of them resolve problems that
have been pointed out on-list.  The Change Log entries for the
three are as follows (obviously, these are subject to revision
too, possibly even as I catch more problems over the weekend).
I've removed the TOC listings from the diff files -- they didn't
contribute much and take up a lot of space.

    john

-------------

Definitions Version -02
 
    o  All back pointers to section numbers in Rationale have
       been removed.

    o  Some definitions clarified.  Added one about string
       order.


Protocol Version -07

    o  Multiple small textual and editorial changes and
       clarifications.

    o  Requirement for normalization clarified to apply to all
       cases and conditions for preprocessing further clarified.


Rationale Version -05
 
    o  Many small editorial changes, including changes to
       eliminate the last vestiges of what appeared to be 2119
       language (upper-case MUST, SHOULD, or MAY).
 
-------------- next part --------------
*** Rationale -04 to -05 ***

   Internationalized Domain Names for Applications (IDNA): Background,
                        Explanation, and Rationale


Section 1.6., paragraph 5:
OLD:

    Operations for converting between local character sets and normalized
    Unicode are part of this general set of user interface issues.  The
    conversion is obviously not required at all in a Unicode-native
    system that maintains all strings in Normalization Form C (NFC).  It
    may, however, involve some complexity in a system that is not
    Unicode-native, especially if the elements of the local character set
    do not map exactly and unambiguously into Unicode characters or do so
    in a way that is not completely stable over time.  Perhaps more
    important, if a label being converted to a local character set
    contains Unicode characters that have no correspondence in that
    character set, the application may have to apply special, locally-
    appropriate, methods to avoid or reduce loss of information.

NEW:

    Operations for converting between local character sets and normalized
    Unicode are part of this general set of user interface issues.  The
    conversion is obviously not required at all in a Unicode-native
    system that maintains all strings in Normalization Form C (NFC).
    (See [Unicode-UAX15] for precise definitions of NFC and NFKC if
    needed.)  It may, however, involve some complexity in a system that
    is not Unicode-native, especially if the elements of the local
    character set do not map exactly and unambiguously into Unicode
    characters or do so in a way that is not completely stable over time.
    Perhaps more important, if a label being converted to a local
    character set contains Unicode characters that have no correspondence
    in that character set, the application may have to apply special,
    locally-appropriate, methods to avoid or reduce loss of information.


Section 3.1.3., paragraph 1:
OLD:

    For convenience in processing and table-building, code points that do
    not have assigned values in a given version of Unicode are treated as
    belonging to a special UNASSIGNED category.  Such code points MUST
    NOT appear in labels to be registered or looked up.  The category
    differs from DISALLOWED in that code points are moved out of it by
    the simple expedient of being assigned in a later version of Unicode
    (at which point, they are classified into one of the other categories
    as appropriate).

NEW:

    For convenience in processing and table-building, code points that do
    not have assigned values in a given version of Unicode are treated as
    belonging to a special UNASSIGNED category.  Such code points are
    prohibited in labels to be registered or looked up.  The category
    differs from DISALLOWED in that code points are moved out of it by
    the simple expedient of being assigned in a later version of Unicode
    (at which point, they are classified into one of the other categories
    as appropriate).
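
Illustration (not part of the draft text): a minimal Python sketch of the
UNASSIGNED test described above. It assumes that general category "Cn" in
the Unicode tables bundled with the interpreter is an acceptable stand-in
for "no assigned value in a given version of Unicode"; Cn also covers
noncharacters, so this is an approximation. The function name is
hypothetical.

```python
import unicodedata


def has_unassigned(label: str) -> bool:
    """Return True if any code point in the label has general category Cn
    in the Unicode version shipped with this Python build."""
    return any(unicodedata.category(ch) == "Cn" for ch in label)


if __name__ == "__main__":
    print(unicodedata.unidata_version)   # the Unicode version being tested against
    print(has_unassigned("example"))
    print(has_unassigned("a\u0378"))     # U+0378 has no assignment in current tables
```

Because the result depends on the table version, a code point rejected
today can become acceptable once a later Unicode version assigns it, which
is the distinction from DISALLOWED drawn in the paragraph above.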


Section 3.2., paragraph 1:
OLD:

    While these recommendations cannot and should not define registry
    policies, registries SHOULD develop and apply additional restrictions
    to reduce confusion and other problems.  For example, it is generally
    believed that labels containing characters from more than one script
    are a bad practice although there may be some important exceptions to
    that principle.  Some registries may choose to restrict registrations
    to characters drawn from a very small number of scripts.  For many
    scripts, the use of variant techniques such as those as described in
    RFC 3843 [RFC3743] and RFC 4290 [RFC4290], and illustrated for
    Chinese by the tables described in RFC 4713 [RFC4713] may be helpful
    in reducing problems that might be perceived by users.

NEW:

    While these recommendations cannot and should not define registry
    policies, registries should develop and apply additional restrictions
    to reduce confusion and other problems.  For example, it is generally
    believed that labels containing characters from more than one script
    are a bad practice although there may be some important exceptions to
    that principle.  Some registries may choose to restrict registrations
    to characters drawn from a very small number of scripts.  For many
    scripts, the use of variant techniques such as those as described in
    RFC 3843 [RFC3743] and RFC 4290 [RFC4290], and illustrated for
    Chinese by the tables described in RFC 4713 [RFC4713] may be helpful
    in reducing problems that might be perceived by users.


Section 4.2., paragraph 2:
OLD:

    An IDNA-aware application can accept and display internationalized
    domain names in two formats: the internationalized character set(s)
    supported by the application (i.e., an appropriate local
    representation of a U-label), and as an A-label.  Applications MAY
    allow the display of A-labels, but are encouraged to not do so except
    as an interface for special purposes, possibly for debugging, or to
    cope with display limitations.  In general, they SHOULD allow, but
    not encourage, user input of that label form.  A-labels are opaque
    and ugly and malicious variations on them are not easily detected by
    users.  Where possible, they should thus only be exposed to users and
    in contexts in which they are absolutely needed.  Because IDN labels
    can be rendered either as A-labels or U-labels, the application may
    reasonably have an option for the user to select the preferred method
    of display; if it does, rendering the U-label should normally be the
    default.

NEW:

    An IDNA-aware application can accept and display internationalized
    domain names in two formats: the internationalized character set(s)
    supported by the application (i.e., an appropriate local
    representation of a U-label), and as an A-label.  Applications may
    allow the display of A-labels, but are encouraged to not do so except
    as an interface for special purposes, possibly for debugging, or to
    cope with display limitations.  In general, they should allow, but
    not encourage, user input of that label form.  A-labels are opaque
    and ugly and malicious variations on them are not easily detected by
    users.  Where possible, they should thus only be exposed to users and
    in contexts in which they are absolutely needed.  Because IDN labels
    can be rendered either as A-labels or U-labels, the application may
    reasonably have an option for the user to select the preferred method
    of display; if it does, rendering the U-label should normally be the
    default.


Section 4.2., paragraph 4:
OLD:

    In protocols and document formats that define how to handle
    specification or negotiation of charsets, labels can be encoded in
    any charset allowed by the protocol or document format.  If a
    protocol or document format only allows one charset, the labels MUST
    be given in that charset.  Of course, not all charsets can properly
    represent all labels.  If a U-label cannot be displayed in its
    entirety, the only choice (without loss of information) may be to
    display the A-label.

NEW:

    In protocols and document formats that define how to handle
    specification or negotiation of charsets, labels can be encoded in
    any charset allowed by the protocol or document format.  If a
    protocol or document format only allows one charset, the labels must
    be given in that charset.  Of course, not all charsets can properly
    represent all labels.  If a U-label cannot be displayed in its
    entirety, the only choice (without loss of information) may be to
    display the A-label.


Section 4.2., paragraph 5:
OLD:

    In any place where a protocol or document format allows transmission
    of the characters in internationalized labels, labels SHOULD be
    transmitted using whatever character encoding and escape mechanism
    the protocol or document format uses at that place.  This provision
    is intended to prevent situations in which, e.g., UTF-8 domain names
    appear embedded in text that is otherwise in some other character
    coding.

NEW:

    In any place where a protocol or document format allows transmission
    of the characters in internationalized labels, labels should be
    transmitted using whatever character encoding and escape mechanism
    the protocol or document format uses at that place.  This provision
    is intended to prevent situations in which, e.g., UTF-8 domain names
    appear embedded in text that is otherwise in some other character
    coding.


Section 6., paragraph 4:
OLD:

    As discussed elsewhere in this document, the IDNA2008 model removes
    all of these mappings and interpretations, including the equivalence
    of different forms of dots, from the protocol, discouraging such
    mappings and leaving them, when necessary, to local processing.  This
    should not be taken to imply that local processing is optional or can
    be avoided entirely, even if doing so might have been desirable in a
    world without IDNA2003 IDNs in files and archives.  Instead, unless
    the program context is such that it is known that any IDNs that
    appear will be either U-labels or A-labels, or that other forms can
    safely be rejected, some local processing of apparent domain name
    strings will be required, both to maintain compatibility with
    IDNA2003 and to prevent user astonishment.  Such local processing,
    while not specified in this document or the associated ones, will
    generally take one of two forms:

NEW:

    As discussed elsewhere in this document, the IDNA2008 model removes
    all of these mappings and interpretations, including the equivalence
    of different forms of dots, from the protocol, discouraging such
    mappings and leaving them, when necessary, to local processing.  This
    should not be taken to imply that local processing is optional or can
    be avoided entirely, even if doing so might have been desirable in a
    world without IDNA2003 IDNs in files and archives.  Instead, unless
    the program context is such that it is known that any IDNs that
    appear will contain either U-label or A-label forms, or that other
    forms can safely be rejected, some local processing of apparent
    domain name strings will be required, both to maintain compatibility
    with IDNA2003 and to prevent user astonishment.  Such local
    processing, while not specified in this document or the associated
    ones, will generally take one of two forms:


Section 7.1.2., paragraph 2:
OLD:

    o  Any label that appears to be an A-label, i.e., any label that
       starts in "xn--", MUST be IDNA-valid, i.e., they MUST be valid
       A-labels, as discussed in Section 2 above.

NEW:

    o  Any label that appears to be an A-label, i.e., any label that
       starts in "xn--", must be IDNA-valid, i.e., they must be valid
       A-labels, as discussed in Section 2 above.


Section 7.1.2., paragraph 3:
OLD:

    o  The Unicode tables (i.e., tables of code points, character
       classes, and properties) and IDNA tables (i.e., tables of
       contextual rules such as those that appear in the Tables
       document), MUST be consistent on the systems performing or
       validating labels to be registered.  Note that this does not
       require that tables reflect the latest version of Unicode, only
       that all tables used on a given system are consistent with each
       other.

NEW:

    o  The Unicode tables (i.e., tables of code points, character
       classes, and properties) and IDNA tables (i.e., tables of
       contextual rules such as those that appear in the Tables
       document), must be consistent on the systems performing or
       validating labels to be registered.  Note that this does not
       require that tables reflect the latest version of Unicode, only
       that all tables used on a given system are consistent with each
       other.


Section 7.1.2., paragraph 5:
OLD:

    Systems looking up or resolving DNS labels, especially IDN DNS
    labels, MUST be able to assume that applicable registration rules
    were followed for names entered into the DNS.

NEW:

    Systems looking up or resolving DNS labels, especially IDN DNS
    labels, must be able to assume that applicable registration rules
    were followed for names entered into the DNS.


Section 7.4.3., paragraph 1:
OLD:

    While it might be possible to make a prefix change, the costs of such
    a change are considerable.  Even if they wanted to do so, all
    registries could not convert all IDNA2003 ("xn--") registrations to a
    new form at the same time and synchronize that change with
    applications supporting lookup.  Unless all existing registrations
    were simply to be declared invalid (and perhaps even then) systems
    that needed to support both labels with old prefixes and labels with
    new ones would first process a putative label under the IDNA2008
    rules and try to look it up and then, if it were not found, would
    process the label under IDNA2003 rules and look it up again.  That
    process could significantly slow down all processing that involved
    IDNs in the DNS especially since, in principle, a fully-qualified
    name could contain a mixture of labels that were registered with the
    old and new prefixes, a situation that would make the use of DNS
    caching very difficult.  In addition, looking up the same input
    string as two separate A-labels would create some potential for
    confusion and attacks, since they could, in principle, map to
    different targets and then resolve to different DNS label nodes.

NEW:

    While it might be possible to make a prefix change, the costs of such
    a change are considerable.  Even if they wanted to do so, all
    registries could not convert all IDNA2003 ("xn--") registrations to a
    new form at the same time and synchronize that change with
    applications supporting lookup.  Unless all existing registrations
    were simply to be declared invalid (and perhaps even then) systems
    that needed to support both labels with old prefixes and labels with
    new ones would first process a putative label under the IDNA2008
    rules and try to look it up and then, if it were not found, would
    process the label under IDNA2003 rules and look it up again.  That
    process could significantly slow down all processing that involved
    IDNs in the DNS especially since, in principle, a fully-qualified
    name could contain a mixture of labels that were registered with the
    old and new prefixes, a situation that would make the use of DNS
    caching very difficult.  In addition, looking up the same input
    string as two separate A-labels would create some potential for
    confusion and attacks, since they could, in principle, map to
    different targets and then resolve to different entries in the DNS.


Section 7.7., paragraph 2:
OLD:

    In IDNA2008, strings containing unassigned code points MUST NOT be
    either looked up or registered.  There are several reasons for this,
    with the most important ones being:

NEW:

    In IDNA2008, strings containing unassigned code points must not be
    either looked up or registered.  There are several reasons for this,
    with the most important ones being:


Appendix A., paragraph 21:
OLD:

    o  Material on differences between IDNA2003 and IDNA2003 moved to an
       appendix in Protocol.

NEW:

    o  Material on differences between IDNA2003 and IDNA2008 moved to an
       appendix in Protocol.


Appendix A., paragraph 27:
OLD:

 Author's Address

NEW:

 A.5.  Version -05
 
    o  Many small editorial changes, including changes to eliminate the
       last vestiges of what appeared to be 2119 language (upper-case
       MUST, SHOULD, or MAY).
 
 Author's Address

-------------- next part --------------
*** Protocol 06 to 07 ***

     Internationalized Domain Names in Applications (IDNA): Protocol

Section 1., paragraph 0:
OLD:

    1.  Whenever a domain name is put into an IDN-unaware domain name
        slot (see Section 2 and [IDNA2008-Rationale]), it MUST contain
        only ASCII characters (i.e., must be either an A-label or an LDH-
        label), or must be a label associated with a DNS application that
        is not subject to either IDNA or the historical recommendations
        for "hostname"-style names [RFC1034].

NEW:

    1.  Whenever a domain name is put into an IDN-unaware domain name
        slot (see Section 2 and [IDNA2008-Defs]), it MUST contain only
        ASCII characters (i.e., must be either an A-label or an LDH-
        label), or must be a label associated with a DNS application that
        is not subject to either IDNA or the historical recommendations
        for "hostname"-style names [RFC1034].


Section 4.2., paragraph 0:
OLD:

    The registry MAY permit submission of labels in A-label form.  If it
    does so, it SHOULD perform a conversion to a U-label, perform the
    steps and tests described below, and verify that the A-label produced
    by the step in Section 4.5 matches the one provided as input.  If,
    for some reason, it does not, the registration MUST be rejected.  If
    the conversion to a U-label is not performed, the registry MUST
    verify that the A-label is superficially valid, i.e., that it does
    not violate any of the rules of Punycode [RFC3492] encoding such as
    the prohibition on trailing hyphen-minus, appearance of non-basic
    characters before the delimiter, and so on.  Invalid strings that
    appear to be A-labels MUST NOT be placed in DNS zones.
    [[anchor9: Editorial: Should the sentences starting with "The
    registry" be moved to 4.3?  I.e., would they be more in sequence
    there?  Note that A-labels are, by definition, in ASCII, so section
    4.2 does not apply to them.  The tone of this recommendation also
    seems slightly at odds with the statements at the end of 4.2.
    Suggested text for cleaning this up, harmonizing it, and reducing
    redundancy would be appreciated.]]
 
 4.2.  Conversion to Unicode and Normalization

NEW:

 4.2.  Conversion to Unicode and Normalization


Section 4.2., paragraph 1:
OLD:

    Some system routine, or a localized front-end to the IDNA process,
    ensures that the proposed label is a Unicode string or converts it to
    one as appropriate.  That string MUST be in Unicode Normalization
    Form C (NFC [Unicode-UAX15]).

NEW:

    Some system routine, or a localized front-end to the IDNA process,
    ensures that the proposed label is a Unicode string or converts it to
    one as appropriate.  Independent of its source form, the string MUST
    be in Unicode Normalization Form C (NFC [Unicode-UAX15]) before
    further processing in this protocol.
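
Illustration (not part of the draft text): a minimal Python sketch of the
NFC requirement above, using the standard unicodedata module. The helper
names are hypothetical, and whether a front-end should normalize silently
or reject non-NFC input is a local choice that the draft text does not
settle.

```python
import unicodedata


def to_nfc(label: str) -> str:
    """Convert a Unicode string to Normalization Form C."""
    return unicodedata.normalize("NFC", label)


def is_nfc(label: str) -> bool:
    """Return True if the string is already in NFC."""
    return unicodedata.normalize("NFC", label) == label


if __name__ == "__main__":
    decomposed = "e\u0301"          # "e" followed by COMBINING ACUTE ACCENT
    print(is_nfc(decomposed))       # not in NFC as supplied
    print(to_nfc(decomposed) == "\u00e9")
```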


Section 4.2., paragraph 2:
OLD:

    As a local implementation choice, the implementation MAY choose to
    map some forbidden characters to permitted characters (for instance
    mapping uppercase characters to lowercase ones), displaying the
    result to the user, and allowing processing to continue.  However, it
    is strongly recommended that, to avoid any possible ambiguity,
    entities responsible for zone files ("registries") accept
    registrations only for A-labels (to be converted to U-labels by the
    registry as discussed above) or U-labels actually produced from
    A-labels, not forms expected to be converted by some other process.

NEW:

    As a local implementation choice, the implementation MAY choose to
    map some forbidden characters to permitted characters (for instance
    mapping uppercase characters to lowercase ones), displaying the
    result to the user, and allowing processing to continue.  This should
    be done very conservatively to prevent interoperability problems
    lookup applications that do not follow exactly the same rules.  In
    particular, it is strongly recommended that, to avoid any possible
    ambiguity, entities responsible for zone files ("registries") accept
    registrations only for A-labels (to be converted to U-labels by the
    registry as discussed above) or U-labels actually produced from
    A-labels, not forms expected to be converted by some other process.


Section 4.3., paragraph 1:
OLD:

 4.3.1.  Rejection of Characters that are not Permitted

NEW:

 4.3.1.  Input Format
 
    [[anchor10: Note in -07 -- this section was formerly the second
    paragraph of Section 4.1.  It may need additional work; suggestions
    welcome.]]
 
    The registry MAY permit submission of labels in A-label form.  If it
    does so, it SHOULD perform a conversion to a U-label, perform the
    steps and tests described below, and verify that the A-label produced
    by the step in Section 4.5 matches the one provided as input.  If,
    for some reason, it does not, the registration MUST be rejected.  If
    the conversion to a U-label is not performed, the registry MUST
    verify that the A-label is superficially valid, i.e., that it does
    not violate any of the rules of Punycode [RFC3492] encoding such as
    the prohibition on trailing hyphen-minus, appearance of non-basic
    characters before the delimiter, and so on.  Invalid strings that
    appear to be A-labels MUST NOT be placed in DNS zones.
 
 4.3.2.  Rejection of Characters that are not Permitted
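
Illustration (not part of the draft text): a minimal Python sketch of the
comparison described in the Input Format text above, where a submitted
A-label is converted to a U-label, re-encoded, and compared with the input.
It uses Python's built-in "punycode" codec and assumes the input is already
lower case. The function name is hypothetical. The sketch covers only the
re-encoding comparison; it does not perform the other registration tests,
and it does not by itself catch every superficially invalid string.

```python
ACE_PREFIX = "xn--"


def a_label_round_trips(a_label: str) -> bool:
    """Decode the Punycode part of an apparent A-label, re-encode it, and
    check that the result matches the string that was supplied."""
    if not a_label.startswith(ACE_PREFIX):
        return False
    try:
        encoded = a_label[len(ACE_PREFIX):].encode("ascii")
        u_label = encoded.decode("punycode")
        re_encoded = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    except UnicodeError:
        # Non-ASCII input or a string the codec cannot decode.
        return False
    return re_encoded == a_label


if __name__ == "__main__":
    print(a_label_round_trips("xn--bcher-kva"))
    print(a_label_round_trips("example"))
```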


Section 4.3., paragraph 3:
OLD:

 4.3.2.  Label Validation

NEW:

 4.3.3.  Label Validation


Section 4.3., paragraph 5:
OLD:

 4.3.2.1.  Rejection of Confusing or Hostile Sequences in U-labels

NEW:

 4.3.3.1.  Rejection of Confusing or Hostile Sequences in U-labels


Section 4.3., paragraph 6:
OLD:

    The Unicode string MUST NOT contain "--" (two consecutive hyphens) in
    the third and fourth character positions.

NEW:

    The Unicode string MUST NOT contain "--" (two consecutive hyphens) in
    the third and fourth character positions when the label is considered
    in "on the wire" order.


Section 4.3., paragraph 7:
OLD:

 4.3.2.2.  Leading Combining Marks

NEW:

 4.3.3.2.  Leading Combining Marks


Section 4.3., paragraph 8:
OLD:

    The first character of the string is examined to verify that it is
    not a combining mark.  If it is a combining mark, the string MUST NOT
    be registered.

NEW:

    The first character of the string (when the label is considered in
    "on the wire" order) is examined to verify that it is not a combining
    mark.  If it is a combining mark, the string MUST NOT be registered.
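
Illustration (not part of the draft text): a minimal Python sketch of the
two positional tests above (hyphens in the third and fourth positions, and
a leading combining mark). It assumes that the order of code points in a
Python string corresponds to "on the wire" order, and that general
categories Mn, Mc, and Me identify combining marks. The function names are
hypothetical.

```python
import unicodedata


def has_hyphens_in_positions_3_and_4(label: str) -> bool:
    """True if the third and fourth characters are both hyphens."""
    return label[2:4] == "--"


def starts_with_combining_mark(label: str) -> bool:
    """True if the first character has a Mark (M*) general category."""
    return bool(label) and unicodedata.category(label[0]).startswith("M")


def passes_positional_tests(label: str) -> bool:
    """Combine the two checks; a label failing either is not registered."""
    return not (has_hyphens_in_positions_3_and_4(label)
                or starts_with_combining_mark(label))


if __name__ == "__main__":
    for candidate in ("example", "ab--cd", "\u0301abc"):
        print(repr(candidate), passes_positional_tests(candidate))
```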


Section 4.3., paragraph 9:
OLD:

 4.3.2.3.  Contextual Rules

NEW:

 4.3.3.3.  Contextual Rules


Section 4.3., paragraph 10:
OLD:

    Each code point is checked for its identification as a character
    requiring contextual processing for registration (the list of
    characters appears as the combination of CONTEXTJ and CONTEXTO in
    [IDNA2008-Tables] as do the contextual rules themselves).  If that
    indication appears, the table of contextual rules is checked for a
    rule for that character.  If no rule is found, the proposed label is
    rejected and MUST NOT be installed in a zone file.  If one is found,
    it is applied (typically as a test on the entire label or on adjacent
    characters within the label).  If the application of the rule does
    not conclude that the character is valid in context, the proposed
    label MUST BE rejected.  (See the IANA Considerations: IDNA Context
    Registry section of [IDNA2008-Tables].)
 
    These contextual rules are required to permit the use of characters
    that would otherwise risk causing considerable harm.  For example,
    labels containing invisible ("zero-width") characters may be
    permitted in context with characters whose presentation forms are
    significantly changed by the presence or absence of the zero-width
    characters, while other labels in which zero-width characters appear
    may be rejected.
    [[anchor14: Should this paragraph be removed?  Note that I've been
    strongly encouraged to supply specific examples to reduce abstraction
    and questions about the appropriateness of the text. -JcK]]

NEW:

    Each code point is checked for its identification as a character
    requiring contextual processing for registration (the list of
    characters appears as the combination of CONTEXTJ and CONTEXTO in
    [IDNA2008-Tables] as do the contextual rules themselves).  If that
    indication appears, the table of contextual rules is checked for a
    rule for that character.  If no rule is found, the proposed label is
    rejected and MUST NOT be installed in a zone file.  If one is found,
    it is applied (typically as a test on the entire label or on adjacent
    characters within the label).  If the application of the rule does
    not conclude that the character is valid in context, the proposed
    label MUST BE rejected.  (See the IANA Considerations: IDNA Context
    Registry section of [IDNA2008-Tables].)
    These contextual rules are required to permit the use of characters
    that would otherwise risk causing unacceptable ambiguity in label
    matching and interpretation.  For example, labels containing
    invisible ("zero-width") characters may be permitted in context with
    characters whose presentation forms are significantly changed by the
    presence or absence of the zero-width characters, while other labels
    in which zero-width characters appear may be rejected.
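
Illustration (not part of the draft text): a minimal Python sketch of the
registration-time procedure above. Each code point flagged as requiring
context is looked up in a rule table, the label is rejected if no rule
exists, and otherwise the rule is applied. The set of flagged characters,
the rule table, and the "preceded by a virama" rule are all assumptions
made for the example; the real lists and rules are those in
[IDNA2008-Tables].

```python
import unicodedata
from typing import Callable, Dict

ZWNJ, ZWJ, MIDDLE_DOT = "\u200c", "\u200d", "\u00b7"


def after_virama(label: str, i: int) -> bool:
    """Hypothetical rule: the preceding character has canonical combining
    class 9 (virama)."""
    return i > 0 and unicodedata.combining(label[i - 1]) == 9


# Hypothetical stand-ins for the CONTEXTJ/CONTEXTO list and the rule table.
NEEDS_CONTEXT = {ZWNJ, ZWJ, MIDDLE_DOT}
RULES: Dict[str, Callable[[str, int], bool]] = {
    ZWNJ: after_virama,
    ZWJ: after_virama,
    # MIDDLE_DOT deliberately has no rule, to show the "no rule found" case.
}


def contextual_check(label: str) -> bool:
    """Return True only if every flagged character has a rule and the rule
    accepts the character in its context."""
    for i, ch in enumerate(label):
        if ch not in NEEDS_CONTEXT:
            continue
        rule = RULES.get(ch)
        if rule is None or not rule(label, i):
            return False
    return True
```

With these assumed tables, a ZWJ directly after DEVANAGARI SIGN VIRAMA
(U+094D) is accepted, a ZWJ between two ASCII letters is rejected, and any
label containing the rule-less character is rejected.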


Section 4.3., paragraph 11:
OLD:

 4.3.2.4.  Labels Containing Characters Written Right to Left

NEW:

 4.3.3.4.  Labels Containing Characters Written Right to Left


Section 4.3., paragraph 13:
OLD:

 4.3.3.  Registration Validation Summary

NEW:

 4.3.4.  Registration Validation Summary


Section 5.5., paragraph 6:
OLD:

    o  Labels containing other code points that are shown in the
       permitted character table as requiring a contextual rule
       ("CONTEXTO" in the tables), but for which no such rule appears in
       the table of rules.  With the exception in the rule immediately
       above, applications resolving DNS names or carrying out equivalent
       operations are not required to test contextual rules, only to
       verify that a rule exists.

NEW:

    o  Labels containing other code points that are shown in the
       permitted character table as requiring a contextual rule
       ("CONTEXTO" in the tables), but for which no such rule appears in
       the table of rules.  Applications resolving DNS names or carrying
       out equivalent operations are not required to test contextual
       rules for "CONTEXTO" characters, only to verify that a rule exists
       (although they MAY make such tests to give better information to
       the user).
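
Illustration (not part of the draft text): a minimal Python sketch of the
lookup-side requirement above, which only verifies that a rule exists for
each "CONTEXTO" character and does not apply it. The character set and
rule table below are hypothetical placeholders, not the contents of
[IDNA2008-Tables].

```python
from typing import Dict, List, Set

# Hypothetical placeholders for the tables.
CONTEXTO: Set[str] = {"\u00b7", "\u0375"}
RULE_NAMES: Dict[str, str] = {"\u00b7": "example-rule-for-middle-dot"}


def contexto_without_rule(label: str) -> List[str]:
    """List the CONTEXTO characters in the label that have no rule."""
    return [ch for ch in label if ch in CONTEXTO and ch not in RULE_NAMES]


def lookup_may_proceed(label: str) -> bool:
    """The lookup-side test passes when no CONTEXTO character lacks a rule."""
    return not contexto_without_rule(label)
```

An application that chooses to apply the rules as well, as the text
permits, could report the failing character to the user instead of
returning a bare DNS lookup failure.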


Section 5.5., paragraph 8:
OLD:

    In addition, the application SHOULD apply the following test.  The
    test may be omitted in special circumstances, such as when the lookup
    application knows that the conditions are enforced elsewhere, because
    an attempt to look up and resolve such strings will almost certainly
    lead to a DNS lookup failure except when wildcards are present in the
    zone.  However, applying the test is likely to give much better
    information about the reason for a lookup failure -- information that
    may be usefully passed to the user when that is feasible -- then DNS
    resolution failure information alone.  In any event, lookup
    applications should avoid attempting to resolve labels that are
    invalid under that test.

NEW:

    In addition, the application SHOULD apply the following test.  The
    test may be omitted in special circumstances, such as when the lookup
    application knows that the conditions are enforced elsewhere, because
    an attempt to look up and resolve such strings will almost certainly
    lead to a DNS lookup failure except when wildcards are present in the
    zone.  However, applying the test is likely to give much better
    information about the reason for a lookup failure -- information that
    may be usefully passed to the user when that is feasible -- than DNS
    resolution failure information alone.  In any event, lookup
    applications should avoid attempting to resolve labels that are
    invalid under that test.


Section 10., paragraph 2:
OLD:

    Specific textual changes were incorporated into this document after
    suggestions from the other contributors, Stephane Bortzmeyer, Mark
    Davis, Paul Hoffman, Kent Karlsson, Erik van der Poel, Marcos Sanz,
    Andrew Sullivan, Ken Whistler, and other WG participants.  Special
    thanks are due to Paul Hoffman for permission to extract material
    from his Internet-Draft to form the basis for Appendix A

NEW:

    Specific textual changes were incorporated into this document after
    suggestions from the other contributors, Stephane Bortzmeyer, Vint
    Cerf, Mark Davis, Paul Hoffman, Kent Karlsson, Erik van der Poel,
    Marcos Sanz, Andrew Sullivan, Ken Whistler, and other WG
    participants.  Special thanks are due to Paul Hoffman for permission
    to extract material from his Internet-Draft to form the basis for
    Appendix A


Appendix B., paragraph 28:
OLD:

 Author's Address

NEW:

 B.7.  Version -07
 
    o  Multiple small textual and editorial changes and clarifications.
 
    o  Requirement for normalization clarified to apply to all cases and
       conditions for preprocessing further clarified.
 
 Author's Address
-------------- next part --------------
*** Defs 01 to 02 ***

 Internationalized Domain Names for Applications (IDNA): Definitions and
                            Document Framework

Section 1.1.1., paragraph 1:
OLD:

    While many IETF specifications are directed exclusively to protocol
    implementers, the character of IDNA requires that it be understood
    and properly used by those whose responsibilities include making
    decisions about what names are permitted in DNS zone files and about
    policies related to names and naming.  This document and those
    concerned with the protocol definition, rules for rules for handling
    strings that include characters written right-to-left, and the actual
    list of characters and categories will be of primary interest to
    protocol implementers.  This document and the one containing
    explanatory material will be of primary interest to others, although
    they may have to fill in details of interest by reference to other
    documents in the set.

NEW:

    While many IETF specifications are directed exclusively to protocol
    implementers, the character of IDNA requires that it be understood
    and properly used by those whose responsibilities include making
    decisions about what names are permitted in DNS zone files and about
    policies related to names and naming.  This document and those
    concerned with the protocol definition, rules for handling strings
    that include characters written right-to-left, and the actual list of
    characters and categories will be of primary interest to protocol
    implementers.  This document and the one containing explanatory
    material will be of primary interest to others, although they may
    have to fill in details of interest by reference to other documents
    in the set.


Section 2.1., paragraph 1:
OLD:

    [[anchor8: Formerly Section 1.5.2 of Rationale-03]]
 
    A code point is an integer value associated with a character in a
    coded character set.

NEW:

    A code point is an integer value associated with a character in a
    coded character set.


Section 2.1., paragraph 3:
OLD:

    ASCII means US-ASCII [ASCII], a coded character set containing 128
    characters associated with code points in the range 0000..007F.
    Unicode may be thought of as an extension of ASCII; it includes all
    the ASCII characters and associates them with equivalent code points.

NEW:

    ASCII means US-ASCII [ASCII], a coded character set containing 128
    characters associated with code points in the range 0000..007F.
    Unicode may be thought of as a generalization of ASCII; it includes
    all the ASCII characters and associates them with equivalent code
    points.
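
    As a rough illustration of the definition above (a sketch only, not
    part of the specification text), the ASCII range and its
    correspondence with Unicode code points can be checked as follows.
    The function name is_ascii_code_point is hypothetical and chosen
    here for illustration.

```python
def is_ascii_code_point(cp: int) -> bool:
    """Return True if cp lies in the ASCII range 0000..007F (128 values)."""
    return 0x0000 <= cp <= 0x007F


def ascii_matches_unicode(ch: str) -> bool:
    """Check that an ASCII character has the same value as its Unicode code point.

    ord() gives the Unicode code point; encoding to ASCII gives the
    single ASCII byte value. For ASCII characters the two agree.
    """
    return ord(ch) == ch.encode("ascii")[0]


# Every one of the 128 ASCII code points maps to the same Unicode code point.
all_agree = all(ascii_matches_unicode(chr(cp)) for cp in range(0x80))
```

    Code points above 007F, such as U+00E9, fall outside the ASCII
    range but remain valid Unicode code points.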


Section 2.2., paragraph 1:
OLD:

    [[anchor10: Formerly Section 1.5.3 of Rationale-03.]]
 
    When discussing the DNS, this document generally assumes the
    terminology used in the DNS specifications [RFC1034] [RFC1035].  The
    term "lookup" is used to describe the combination of operations
    performed by the IDNA2008 protocol and those actually performed by a
    DNS resolver.  The process of placing an entry into the DNS is
    referred to as "registration", similar to common contemporary usage
    in other contexts.  Consequently, any DNS zone administration is
    described as a "registry", regardless of the actual administrative
    arrangements or level in the DNS tree.  More detail about that
    relationship is included in the "Rationale" document.

NEW:

    When discussing the DNS, this document generally assumes the
    terminology used in the DNS specifications [RFC1034] [RFC1035].  The
    term "lookup" is used to describe the combination of operations
    performed by the IDNA2008 protocol and those actually performed by a
    DNS resolver.  The process of placing an entry into the DNS is
    referred to as "registration", similar to common contemporary usage
    in other contexts.  Consequently, any DNS zone administration is
    described as a "registry", regardless of the actual administrative
    arrangements or level in the DNS tree.  More detail about that
    relationship is included in the "Rationale" document.


Section 2.3., paragraph 1:
OLD:

    [[anchor11: Formerly Section 1.5.4 of Rationale-03 with some material
    removed and left in that document.]]
 
    This section defines some terminology to reduce dependence on terms
    and definitions that have been problematic in the past.

NEW:

    This section defines some terminology to reduce dependence on terms
    and definitions that have been problematic in the past.


Section 2.3.1.6., paragraph 2:
OLD:

    An "IDN-aware domain name slot" is defined in this document to be a
    domain name slot explicitly designated for carrying an
    internationalized domain name as defined in this document.  The
    designation may be static (for example, in the specification of the
    protocol or interface) or dynamic (for example, as a result of
    negotiation in an interactive session).

NEW:

    An "IDN-aware domain name slot" is defined for this set of documents
    to be a domain name slot explicitly designated for carrying an
    internationalized domain name as defined in this document.  The
    designation may be static (for example, in the specification of the
    protocol or interface) or dynamic (for example, as a result of
    negotiation in an interactive session).


Section 2.3.1.6., paragraph 3:
OLD:

    An "IDN-unaware domain name slot" is defined in this document to be
    any domain name slot that is not an IDN-aware domain name slot.
    Obviously, this includes any domain name slot whose specification
    predates IDNA.

NEW:

    An "IDN-unaware domain name slot" is defined for this set of
    documents to be any domain name slot that is not an IDN-aware domain
    name slot.  Obviously, this includes any domain name slot whose
    specification predates IDNA.


Section 2.3.1.6., paragraph 4:
OLD:

 2.3.2.  Punycode is an Algorithm, not a Name or Adjective

NEW:

 2.3.2.  Order of Characters in Labels


Section 2.3.1.6., paragraph 5:
OLD:

    [[anchor17: Formerly Section 1.5.5 of Rationale-03.]]

NEW:

    Because IDN labels may contain characters that are read, and
    preferentially displayed, from right to left, there is a potential
    ambiguity about which character in a label is "first".  For the
    purposes of these specifications, labels are considered, and
    characters numbered, strictly in the order in which they appear "on
    the wire".  That order is equivalent to the leftmost character being
    treated as first in a label that is read left-to-right and to the
    rightmost character being first in a label that is read right-to-left.
    The "Bidi" specification contains additional discussion of the
    conditions that influence reading order.
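
    The numbering rule above can be sketched as follows.  This is an
    illustration only, and it assumes that the string was decoded from
    its "on the wire" form without any reordering, so that string index
    order equals wire order.  The function name numbered_characters and
    the sample labels are hypothetical.

```python
import unicodedata


def numbered_characters(label: str) -> list[tuple[int, str, str]]:
    """Number the characters of a label in wire order, starting at 1.

    Each entry is (position, code point, Bidi class). The position
    depends only on storage order, never on display direction.
    """
    return [
        (i, f"U+{ord(c):04X}", unicodedata.bidirectional(c))
        for i, c in enumerate(label, start=1)
    ]


# A left-to-right label: the first character is displayed leftmost.
ltr_label = "abc"

# A right-to-left label (Hebrew shin, lamed, vav, final mem): the first
# character on the wire is shin, which is displayed rightmost.
rtl_label = "\u05E9\u05DC\u05D5\u05DD"

first_ltr = numbered_characters(ltr_label)[0]
first_rtl = numbered_characters(rtl_label)[0]
```

    In both cases the "first" character is simply the one at position 1
    in wire order; only its displayed position differs.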
 
 2.3.3.  Punycode is an Algorithm, not a Name or Adjective


Section 5., paragraph 2:
OLD:

    Specific textual suggestions after the extraction process came from
    Bill McQuillan.

NEW:

    Specific textual suggestions after the extraction process came from
    Vint Cerf and Bill McQuillan.


Appendix A., paragraph 8:
OLD:

 Author's Address

NEW:

 A.3.  Version -02
 
    o  All back pointers to section numbers in Rationale have been
       removed.
 
    o  Some definitions clarified.  Added one about string order.
 
 Author's Address


