Erik van der Poel
erikv at google.com
Fri May 9 01:01:12 CEST 2008
An unassigned codepoint may later be assigned to an uppercase letter.
So a piece of software that looks up purported U-labels must check
whether they contain any unassigned codepoints. We should therefore
recommend that such software follow certain rules in order to
achieve interoperability. (MSIE7 refuses to look up domain names
containing unassigned characters.)
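
To make the check concrete, here is a minimal sketch in Python. It
assumes that "unassigned" can be approximated by Unicode general
category Cn as reported by the interpreter's bundled Unicode database
(which also covers noncharacters); the function names are hypothetical
and this is not the IDNA tables themselves.

```python
import unicodedata


def unassigned_codepoints(label):
    """Return the characters in `label` whose general category is Cn
    (unassigned or noncharacter) in this Python build's Unicode data."""
    return [ch for ch in label if unicodedata.category(ch) == "Cn"]


def may_look_up(label):
    # Hypothetical policy: refuse lookup if any codepoint is unassigned.
    return not unassigned_codepoints(label)


if __name__ == "__main__":
    # The answer depends on which Unicode version the software knows about.
    print("Unicode data version:", unicodedata.unidata_version)
    print(may_look_up("b\u00fccher"))   # all codepoints assigned
    print(may_look_up("a\uffffb"))      # U+FFFF is a noncharacter (Cn)
```

The result depends on the Unicode version compiled into the software,
which is exactly why such software needs to be upgradable.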
If we lock down the DISALLOWED set too tightly, we may regret it
later. One way to avoid locking it down is to recommend that
burned-in-ROM and other unupgradable software only use protocols that
use LDH- and A-labels. All pieces of software, whether IDNA-aware or
not, are explicitly permitted to look up Punycode labels, without
decoding them to check for DISALLOWED, CONTEXT*, etc.
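
A rough sketch of what "look up without decoding" could mean for
non-upgradable software, in Python. The function names are
hypothetical, and the LDH rule here is the usual letters, digits,
hyphen syntax with a 63-octet limit; the label is only checked for
its ASCII shape and is never decoded from Punycode.

```python
import re

# LDH syntax: ASCII letters, digits, hyphen; no leading or trailing
# hyphen; 1 to 63 characters.
_LDH = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_ldh_label(label):
    return _LDH.fullmatch(label) is not None


def looks_like_a_label(label):
    # Shape check only: LDH syntax plus the "xn--" prefix. The Punycode
    # payload is deliberately NOT decoded or validated.
    return is_ldh_label(label) and label.lower().startswith("xn--")


def label_for_fixed_software(label):
    """Pass LDH- and A-labels through unchanged; reject everything else."""
    if not is_ldh_label(label):
        raise ValueError("non-upgradable software should not handle U-labels")
    return label


if __name__ == "__main__":
    print(label_for_fixed_software("xn--bcher-kva"))
    print(looks_like_a_label("example"))
```

Because nothing in this path depends on the DISALLOWED set, moving a
character between categories later would not affect such software.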
On the other hand, we should recommend that protocol and application
developers only use U-labels if they are willing to make their
software upgradable. They need to do this for unassigned codepoints
anyway. So we might as well allow for the possibility of moving some
characters from DISALLOWED to other categories (if and when we
determine that they should be moved, having come up with better
criteria for use in IDNs, more information, clamoring users, etc.).
If we allow for this possibility, we don't need to fret so much about
historic scripts right now. Just dump them in DISALLOWED for now, and
deal with them later, if they ever need to be dealt with.
On Thu, May 8, 2008 at 1:59 PM, Shawn Steele <Shawn.Steele at microsoft.com> wrote:
> Erik wrote:
> > This also neatly solves the problem of whether or not IDNA-unaware and
> > IDNA-aware clients are allowed to look up labels with Punycode in
> > them. They should always be allowed to do so. Only software that tries
> > to convert from U-labels to A-labels needs to be restricted. This is
> > how we can achieve the most reasonable level of interoperability, in
> > my opinion.
> I think that U-to-A conversion does NOT need restriction. Assuming that the steps in conversion include NFKC or appropriate mappings, then if a character moves from disallowed to allowed, the conversion is already known. So no change is required for lookup, even if conversion is required. The only change would be in the software that decides the legality of the name, which, IMO, could be at a different layer.
> - Shawn
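
Shawn's point that the conversion itself does not depend on the
category tables can be illustrated with a small Python sketch. It
assumes NFKC followed by simple lowercasing as a stand-in for the
"appropriate mappings" (not the full mapping either IDNA version
defines), and uses Python's built-in "punycode" codec; the function
name is hypothetical.

```python
import unicodedata


def to_a_label(u_label):
    """Convert a label to its ASCII form without consulting any
    allowed/disallowed table. Legality is left to a separate layer."""
    # Assumed mapping step: NFKC plus lowercasing (a simplification).
    mapped = unicodedata.normalize("NFKC", u_label).lower()
    if mapped.isascii():
        return mapped  # already an LDH-style label; no prefix needed
    return "xn--" + mapped.encode("punycode").decode("ascii")


if __name__ == "__main__":
    print(to_a_label("b\u00fccher"))
    print(to_a_label("example"))
```

Nothing here changes when a character moves from DISALLOWED to
another category; only a separate legality check would need updating.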