Minutes from the Tuesday, March 23rd, 2009 IDNAbis WG meeting

Erik van der Poel erikv at google.com
Mon Apr 6 18:52:12 CEST 2009

I would also like to thank Eric (and Andrew) for taking notes. It's a
lot of work and makes it harder to participate in the discussion.

One of my comments was missed by the scribe. I was just asking Patrik
to elaborate on the XMPP issue, and I was asking people to elaborate
on the "include mapping in lookup but exclude mapping from


On Mon, Apr 6, 2009 at 9:26 AM, Mark Davis <mark at macchiato.com> wrote:
> Thanks to Eric as well; taking notes in that kind of meeting is very hard.
> For the one case where what I said was "missed by scribe", my response was
> something like:
> There are different kinds of "suspicious" domain names; for example, a
> company like Google keeps a list of known spoofing sites. However, that kind
> of suspicious domain name has nothing to do with the character content of
> the URL, nor with the choice of characters to allow in domain names. The
> only type of "suspicion" that is of concern for us is visual
> confusability.
> Mark
> On Mon, Apr 6, 2009 at 05:25, Vint Cerf <vint at google.com> wrote:
>> thanks to eric for taking notes
>> Vint Cerf
>> Google
>> 1818 Library Street, Suite 400
>> Reston, VA 20190
>> 202-370-5637
>> vint at google.com
>> Begin forwarded message:
>> From: Eric Brunner-Williams <ebw at abenaki.wabanaki.net>
>> Date: April 5, 2009 4:27:33 PM EDT
>> To: Vint Cerf <vint at google.com>
>> Cc: ebw at abenaki.wabanaki.net
>> Subject: Minutes from the Tuesday, March 23rd, 2009 IDNAbis WG meeting
>> Minutes from the Tuesday, March 24th, 2009 IDNAbis WG meeting, scribe Eric
>> Brunner-Williams (ebw at abenaki.wabanaki.net). Transcribed from notes,
>> supplemented with Andrew Sullivan's jabber log.
>> The meeting began at 9am with blue sheets and agenda modification.
>> Patrik Faltstrom (PAF) continued the prior day's discussion of
>> casefolding, commented that Stuart had checked the {b u-umlaut} vs {B
>> u-umlaut} case (no change)
>> Paul Hoffman (PH) responded that the a-umlaut case was correct (change)
>> Agenda modification: Paul Hoffman's IDNAv2 proposal for 30 min, followed by
>> questions.
>> PH - IDNAv2 presentation, in brief, explaining the differences relative
>> to IDNA2008, see slides (Paul's). Different mapping solution (2008 is no
>> mapping)
>> PAF - "more complicated than that", difference over what "mapping means"
>> PH - map in registration vs map in UI, suggest we tighten definition of
>> "map"
>> Mark Davis (MD) - clarification (repeated later) that "protection" (from
>> "abusive registration") is not simply via registry policy, but also via the
>> client implementation, followed by a discussion of browser display of
>> "suspicious characters".
>> Vint Cerf (VC) - browsers are not the only applications that will use IDNs
>> so human processing (visual similarity, "looking") isn't the only test
>> MD - suspicious is homoglyphs, so "looking" is the test
>> John Klensin (JK) - suspicious is bad registry policy, and/or bad font
>> and/or bad character set and/or bad content experience
>> MD - rejoinder (missed by scribe)
>> Thomas Narten (TN) - is there agreement that ICANN can't go ahead? if we
>> avoid edge cases ...?
>> PH - yes, avoid edge
>> TN - avoid edge in leading rounds
>> Cary Karp (CK) - not the central issue, ICANN needs to enter into
>> contracts, no one is prepared to agree to conformance with protocol not
>> written so if delay then IDNA2003 goes into those contracts
>> Harald Alvestrand (HA) - there is no difference (between 2003 and 2008)
>> for the set of labels ICANN has announced, there are no edge cases
>> Eric Brunner-Williams (EBW) - not all characters are known yet
>> HA - not all submissions have been made public
>> Tina Dam (TD) - public strings are announced, but not all chars (other
>> than TLD labels) disclosed (this is a reference to the SLD and subordinate
>> chars, scribe)
>> End of the IDNAv2 presentation.
>> VC - slides, discussion of stringprep, nameprep and upgrading, as well as
>> occurrence elsewhere, e.g., kerberos for password matching
>> JK - procedural issues with "this updates stringprep", update issue
>> discussion
>> HA - when do we discuss substantive issues? Two points of violent
>> objection: (1) it retains all the chars valid in '03 including snowman,
>> stupid mistake, and (2) treatment of bidi, current bidi spec fails to
>> achieve goals, AN, EN issue "simple bidi fix is too simple", and has other
>> stylistic objections.
>> MD - agrees 08 bidi much better than 03 bidi, disagree snowman is stupid,
>> overblown objection, doesn't hurt much to take it out, square-root (dot) com
>> example (some mac fan), neutral on eliminating symbols.
>> PH - agree, not violently, about (heart) (removing it, scribe)
>> PAF - more violent disagreement with IDNAv2, (1) still table based, not
>> rules, why not use rules? same ending discussion so same time to complete.
>> no comments (to current -table draft) in five months. problem in jabber/xmpp
>> stored input (pre-map form), and (2) definition of a-labels and u-labels,
>> but maybe has been added to v2. Problem with *prep -- no understanding of
>> difference between the character that is mapped to something and then
>> stored, so people have stored the pre-mapped character, mapping was big
>> mistake in idna2003. New mapping suggestion for idna2008 is just a help to
>> applications.
>> Leslie Daigle (LD) - timeliness is a red herring, v2 plus one or two
>> things won't work
>> JK - IDNAv2 will be slower (than IDNA2008) and the difference between
>> table based and rule based is significant, as is exclusion vs inclusion
>> MD - process for how context rules change a concern, lack of mappings is a
>> concern, argues for mapping only for lookup, not for registration, 2003
>> compatibility with some exceptional characters
>> VC - ZWJ? ZWNJ??
>> MD - see UTR/46
>> PAF - re: mapping, excellent example why WG moves slowly, first draft of a
>> lookup mapping written some time ago, if important, do something about it
>> PH - not fair
>> MD - was told a mapping draft would not be welcome, there is a text in the
>> UTC list, mentions context rules and need for implementation
>> VC - moving on, Jamos permitted under IDNAv2 (Jamo slide), removed under
>> 08
>> PH - irrelevant under 2003 because they're mapped, in non-mapping
>> protocol, must get rid of. Not true in mapping protocol
>> Dowon Kim (DK) - strongly recommend Jamo not allowed in 08
>> MD - disagree with PH, jamo in syllables, but leading jamo (old hangul
>> chars)
>> PH - only for ancient syllables
>> VC - important to satisfy Koreans, not get tangled again
>> MD - two (not three) possibilities: (1) jamo can occur under '03, (2) v2
>> model, registry and clients to ban
>> VC - the "extra dots" code points
>> DK - Korean internet community doesn't want Jamo allowed under v2
>> (Andrew Sullivan's jabber log ends)
>> VC - observation slides (0), (1) -- canonical representation may be
>> useful, canonical IDNs slide
>> MD - ambiguity is the wrong term, in 03 a unicode label turns uniquely to
>> a domain name, the reverse path is where there is an ambiguity, e.g., final
>> sigma. (2) can preserve symmetry between U and A labels but not the
>> assumption that only the canonical form is stored or used (unicode urls in
>> email user expectation)
>> VC / MD - JK's work on what a domain slot is, in discussion of local mapping
>> (M-label)
>> PH - vehemently agree w/PAF, mapping has to be very carefully handled,
>> very precise where and what kind of mapping we're talking about
>> CK - (1) IDN guidelines could or should go into a BCP document, (2) the
>> .gr registry would greatly appreciate it if people who aren't registry
>> operators working in greek would stop representing .gr's issues and
>> positions (yeah!!! scribe)
>> PAF - rules agree with MD?
>> JK - (something i didn't catch, scribe)
>> DK(?) - mapping is essential
>> James Seng (JS) - half-width, full-width mapping has to be done somewhere
>> VC - if non-canonical forms are not exported ...
>> JS - non-IDN-aware A-labels, IDN-aware U-labels, so for IDN-unaware,
>> A-label is canonical, for IDN-aware, U-label is canonical
>> MD - end up with soup transporting Unicode and A-labels, reference to TR46
>> on mapping
>> HA - disagree with MD, wild variety of nonsensical representations not as
>> useful as canonical form, IETF should define these canonical forms. appendix
>> reference (to MD's TR46 reference, supra, scribe)
>> Dave Crocker (DC) - canonical form offers architectural advantages, agree
>> with HA
>> Ucido(?) - CJK mapping makes canonical U-label
>> Erik van der Poel (EvdP) - wants U-label restricted to SMTP, but for web,
>> no restriction, then combine best ideas of both 08 and v2
>> JK - tradeoffs ... worried about interoperability, many more IDNs in
>> things we've not seen in 5+ years, variant forms make for more potential
>> risk, discourage old (03) stuff
>> VC - discusses 08 -> 03 dual lookup (in appendix), IRI/URI at
>> protocol-specification level
>> MD - there's "reasonable" and "unreasonable", but people think these
>> things are reasonable: u-umlaut, full-width, etc. -- this won't go away,
>> local mappings would be a disaster.
>> Ted Hardie (TH) - in Bill Manning's absence in the role of the "Bad Idea
>> Fairy" (1) if xn-- and xo-- are a bad idea (two theories with a signal), why
>> is two theories without a signal (the "compromise" mentioned between 03 and
>> 08, above) not just as bad (or worse)?
>> EvdP - (something i missed, scribe)
>> PAF - xmpp, jabber retain identifiers, case independent matching, mentions
>> Pete Resnick's invention of stringprep -- what's pain now, not 2bn users
>> later (for the xmpp user base)
>> JK - multiple forms a bad idea
>> EvdP - new protocol jabber/xmpp, very old protocol DNS, between them is
>> the "web world" and it is not this group's (idnabis) decision what they (the
>> web world) do
>> ----------
>> break
>> ----------
>> VC - test a theory - value for canonicalization, poses question "do you
>> disagree with canonical forms as useful?"
>> MD - mentions steps to a canonical form (CF), refine and identify
>> canonical definition form
>> VC - if we adopt that (CF as useful), what else should we do, if anything,
>> transition?
>>   (a) accept the pain soon, strict canonical
>> or (b) what else not increasing pain?
>> or (c) 08 +v2 anyone?
>> or (d) v2 anyone?
>> EvdP - strongly object to not being allowed to pick good features from
>> both 08 and v2, if web community goes off and does its mappings (DENIC
>> reference) it could be bad
>> VC - asks no discussion of specific chars (the DENIC reference)
>> MD - if there is a mapping step in 08, that has a dramatic effect on
>> eszett, ZWNJ, etc. if we have to choose right now between two, hard choice
>> VC - if unassigned, unallocated codepoints aren't looked up, then what is
>> needed?
>> HA - three choices:
>>   (a) 08 as is, mapping informative in an appendix
>> or (b) 08 with mapping required on lookup
>> or (c) v2. nobody seems to want to change bidi (08), protocol, tables,
>> etc, the status of mapping in the protocol is the sticking point. details
>> later, pick one
>> VC - if pursue mapping & incorp. in 2008 structure, is that acceptable?
>>   q1: how many people would object to moving ahead with 2008 as currently
>> stands?
>>   q2: use 2008 as basis, incorporate mapping (on lookup only, scribe)?
>>   q3: use v2?
>> TH - mentions the difference between normative rule and optional appendix
>> HA - current documents have an appendix
>> scribe vote count: four object to (a), more object to (c)
>> VC - ok, (b), how to overcome limitations of IDNA2008 as it stands?
>> PH - add a non-optional non-appendix, something to tell people what to do
>> with inputs
>> JS - suggesting a BCP (separate document)
>> PAF - don't care if separate document, appendix, but need mapping, if
>> mapping it MUST ... don't want fluffy definition
>> VC - mapping during lookup, not registration, using the canonical form
>> MD - agree, but needs to be a requirement, for lookup somewhere in 5.2 to
>> 5.4 (protocol doc section refs, scribe), but in 4.x (ditto) MUST NOT map in
>> registration, store in U-label form
>> Pete Resnick (PR) - pass
>> VC - is there consensus for IDNA2008 and try to include non-optional
>> (lookup only) mapping?
>> MD - and to include a SHOULD for storage?
>> VC - yes
>> JK - permanent or transitional?
>> VC - transitional for the purposes of backward compatibility
>> PH - need time
>> VC - there is consensus to proceed on this (08 + lookup mapping)
>> PAF - unhappy with mapping is a MUST
>> Meeting ends at 11am. Blue sheet reminder and collection.
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
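
Illustration only, not part of the minutes: a minimal Python sketch of the
outcome recorded above (mapping applied on lookup only, while registration
refuses any label that would need mapping). The function names, and the choice
of NFKC plus lowercasing as "the mapping", are assumptions made for this
sketch; the WG had not defined the mapping at this point. Real IDNA2008
validity checks (code point tables, bidi, contextual rules) are left out.

```python
import unicodedata

ACE_PREFIX = "xn--"


def map_for_lookup(label: str) -> str:
    """Hypothetical lookup-side mapping: NFKC (folds full-width forms), then lowercase."""
    folded = unicodedata.normalize("NFKC", label).lower()
    # Normalize again, since lowercasing can in principle denormalize a string.
    return unicodedata.normalize("NFKC", folded)


def is_registrable(label: str) -> bool:
    """Registration side: accept only labels the mapping would leave unchanged."""
    return map_for_lookup(label) == label


def to_a_label(u_label: str) -> str:
    """Convert a U-label to its A-label using Python's built-in punycode codec."""
    if u_label.isascii():
        return u_label
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Convert an A-label back to its U-label; other labels pass through unchanged."""
    if a_label.lower().startswith(ACE_PREFIX):
        return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return a_label


def lookup_form(user_input: str) -> str:
    """What a client would send to the DNS: map first, then convert to an A-label."""
    return to_a_label(map_for_lookup(user_input))
```

Under these assumptions, lookup_form("B\u00fccher") and
lookup_form("b\u00fccher") both yield "xn--bcher-kva", while
is_registrable("B\u00fccher") is False, so only the unmapped form is stored.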
