Consensus Call Tranche 8 (Character Adjustments)

Tue Oct 14 12:47:33 CEST 2008

>
> Consensus Call Tranche 8 (character adjustments)
>

NO.

COMMENTS:
>

A YES vote would represent a significant security problem, and slow the
development of IDNA2008 significantly. There are two distinct issues wrapped
up in this tranche.

1. As for the conjoining Hangul characters, these are used in representing
non-modern Hangul characters. The committee has had a long-standing
consensus for *not* going character by character through each script to
determine which are the modern-use characters and which are not. We do not
need to reopen this issue.

If this change is made, then that would force us to rethink that policy,
potentially bogging us down in protracted analyses of the different scripts
to exclude non-modern use characters, such as
U+01BF <http://unicode.org/cldr/utility/character.jsp?a=01BF> ( ƿ ) LATIN LETTER WYNN
U+16B9 <http://unicode.org/cldr/utility/character.jsp?a=16B9> ( ᚹ ) RUNIC LETTER WUNJO WYNN W
and many, many others.

2. While the desire for ß and ς characters is understandable, there are
problems with compatibility. Until they are upgraded, which will require
some period of time, implementations will be supporting IDNA2003 and not
IDNA2008. And for compatibility, for the foreseeable future, even
implementations that support IDNA2008 will need to also support IDNA2003.

In most cases the differences between these are tractable, for companies
like my own. URL X may be valid in IDNA2003 and not IDNA2008 or vice versa,
but it never goes to two different locations. These two characters would
break that. URL X could go to two *different* locations, depending on which
standard is being supported.

If I send someone große.com in an email, then depending on what tools the
user uses to read that email, it could end up at grosse.com (a legitimate
site) or große.com (a spoof site). (Or, of course, große.com could be the
legitimate site and grosse.com the spoof site.) This represents a
significant security problem.
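
To make the divergence concrete, here is a minimal Python sketch. It assumes
Python's built-in "idna" codec, which implements the IDNA2003 mapping, and it
approximates the proposed behaviour (ß kept as Protocol-Valid) by
Punycode-encoding each label with no mapping step at all. The function names
are mine and purely illustrative; this is not a conforming IDNA2008
implementation.

```python
def to_ascii_idna2003(domain: str) -> str:
    """ToASCII using Python's built-in codec, which applies the IDNA2003
    (nameprep) mapping; that mapping folds ß to "ss"."""
    return domain.encode("idna").decode("ascii")


def to_ascii_no_mapping(domain: str) -> str:
    """Illustrative stand-in for a client that treats ß as valid:
    Punycode-encode every non-ASCII label as-is, with no mapping."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


if __name__ == "__main__":
    name = "große.com"
    # The IDNA2003 client looks up the mapped, all-ASCII name.
    print(to_ascii_idna2003(name))
    # The other client looks up a distinct xn-- label.
    print(to_ascii_no_mapping(name))
```

The same string typed or clicked by the user produces two different DNS
lookups, and nothing requires the two names to have the same registrant.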

Sigma is fundamentally a presentation issue: it should be displayed as ς if
it is final. An alternative approach would be to add a SHOULD that it be so
displayed.
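
As a rough sketch of what such a display rule might look like (in Python,
with a hypothetical function name, and using a much simpler test than the
full Unicode Final_Sigma condition): the stored label keeps σ, and only the
rendering changes.

```python
def display_final_sigma(label: str) -> str:
    """Render a trailing σ as ς for display only.

    Simplified assumption: σ counts as final when it is the last character
    of the label and is preceded by a letter. The label that is looked up
    in the DNS is left untouched.
    """
    if len(label) > 1 and label.endswith("σ") and label[-2].isalpha():
        return label[:-1] + "ς"
    return label


if __name__ == "__main__":
    # The label is stored and resolved with σ, but shown with ς.
    print(display_final_sigma("λόγοσ"))
```

Since the mapping is applied only at display time, both IDNA2003 and
IDNA2008 clients would still resolve the same name.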

Eszett is slightly trickier. Yet its use in German orthography is not
fundamentally required, as evidenced by the fact that it is not used in High
German within Switzerland, with no apparent ill effects on the population
(see, for example, http://www.nzz.ch/). And the recommended usage of ss vs ß
changed substantially in the latest, not wholly successful, German spelling
reforms. As a percentage of words in use, especially when weighted by usage,
the number that are distinguished by ss vs ß is vanishingly small.

As stated in rationale-03:

   They [DNS 'names'] are typically derived from, or rooted in, some
   language because most people think in language-based ways.  But,
   because they are mnemonics, they need not obey the orthographic
   conventions of any language: it is not a requirement that it be
   possible for them to be "words".

   This distinction is important because the reasonable goal of an IDN
   effort is not to be able to write the great Klingon (or language of
   one's choice) novel in DNS labels but to be able to form a usefully
   broad range of mnemonics in ways that are as natural as possible in a
   very broad range of scripts.

Thus while recognizing the legitimate desire of people to use ß and
ς characters, the cost in terms of compatibility and security does not
appear to be worth the gain. It is thus too early for consensus on these.

Instead, those wanting to make this change should propose some mechanisms
for avoiding the security problems -- only if those can be overcome in a
reasonable fashion could we incorporate this change, allowing ß and ς.

>
> Procedure:
>
>
> There are several decisions that the working group will need to make to
> confirm consensus.  I will send a series of proposals over the next two
> weeks requesting YES or NO positions on each within a 4 day window. If NO is
> the response, a reason for that position needs to be stated. If there is a
> clear consensus based on responses or in the absence of a consensus against
> each proposal, it will be assumed that the proposal is acceptable to the
> Working Group.
>
>
> Parenthesized symbols (e.g., "(R.1)") after the items are references to the
> issues lists where additional explanations can be found, as sent by John
> Klensin as body parts "idnabis-protocol-issues-rev3" and
> "idnabis-rationale-issues-03" on a message titled 'Issues lists and the
> "preprocessing" topic'  to the working group on 18 August (
> http://www.alvestrand.no/pipermail/idna-update/2008-August/002537.html)
>
> This group needs to get its documents out; it is behind its original
> schedule. It should be noted that the IDN ccTLD and gTLD selection
> initiatives at ICANN have already begun so that delay may weaken the IETF's
> ability to assist in a rational deployment of IDNA.
>
>
>
> (8) Specific character adjustments for IDNA2003 -> IDNA2008
> differences.
>
> (8.a) Make Eszett Protocol-Valid per list discussion.
>
> (8.b) Make Greek final sigma Protocol-Valid per list
> discussion.
>
> (8.c) Disallow conjoining Hangul jamo per recommendation from
> KRNIC and others, permitting only precomposed syllables.
>
>
>
>
> NOTE NEW BUSINESS ADDRESS AND PHONE
> Vint Cerf
> Google
> 1818 Library Street, Suite 400
> Reston, VA 20190
> 202-370-5637
> vint at google.com
>
>
>
>
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>
>