FYI: Extending IDNA to other protocols (Nick Teint)
patrik at frobbit.se
Wed Mar 24 17:43:27 CET 2010
I think a document that talks about this is the IAB I-D about IDN encodings (draft-iab-idn-encoding). Personally, I think moving forward is hard but doable, and I hope the IAB document will result in more explicit work items for the IETF.
What I am nervous about is that people mix up the relatively difficult pre-processing that is needed for Unicode-based identifiers with how those identifiers are encoded, and that people think that if we just make punycode go away, they also do not have to implement any of the normalizations, can compare strings byte by byte, etc.
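To make the distinction concrete, here is a minimal Python sketch that keeps the two steps apart. It is only an illustration: the function names normalize_label and to_a_label are made up for this example, and NFC plus lowercasing is a simplified stand-in for the real IDNA pre-processing, which has many more rules. The encoding step uses the "punycode" codec from Python's standard library.

```python
import unicodedata

# Two spellings of the same visible label "café".
composed = "caf\u00e9"      # precomposed U+00E9
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

def normalize_label(label: str) -> str:
    """Pre-processing step (hypothetical helper, heavily simplified).

    Real IDNA processing involves far more than NFC and lowercasing;
    this only shows that the step exists independently of the encoding.
    """
    return unicodedata.normalize("NFC", label).lower()

def to_a_label(label: str) -> str:
    """Encoding step (hypothetical helper): Unicode label to "xn--" form."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

# Without normalization, a byte-by-byte comparison fails even in plain UTF-8,
# so dropping punycode does not remove the need for pre-processing.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))   # False

# With normalization, both spellings agree, whatever encoding is used on the wire.
print(normalize_label(composed) == normalize_label(decomposed)) # True
print(to_a_label(normalize_label(decomposed)))                  # xn--caf-dma
```

The point of the sketch is that the comparison problem shows up in the UTF-8 bytes as well, so the choice of wire encoding and the need for normalization are separate questions.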
So at the moment I think the fact that we have punycode is a good thing, simply because it forces people to actually look at what the whole IDN issue includes.
But long term, get rid of punycode, absolutely.
Btw, I think punycode also solves one more thing. If the current IDN solutions are really, really crap, and we have to throw them away, we have only polluted the DNS namespace with some labels here and there that start with "xn--". If we had done things with UTF-8, the damage would have been MUCH worse. Being able to move to "version 2" is a good thing.
On 24 mar 2010, at 09.23, Shawn Steele wrote:
> IDNA2003 has terrible support throughout the system. Browsers are aware of it and sort-of work, but tons of other stuff is broken.
> Part of the reason is that “everything” has a chance to muck with Domain Names at all sorts of layers. There’s actually a lot of code that was Unicode-aware, and UTF-8 DNS even worked in some systems. However now all that stuff is broken even though it has nothing to do with DNS, just because it might get punycode or might get Unicode and has no clue what form a DNS label might appear in. And it just gets worse everywhere.
> Transitions like this are painful. In IDN’s attempt to make them less painful, instead they’re more painful. Maybe some older DNS server owners are happy, but I’ve got a lot of other unhappy places ☹ That includes unhappy DNS servers that handled UTF-8 prior to IDNA2003 and now somehow they have to reconcile the disparity.
> I’m under no illusion that EAI adoption will be trivial, but at least it’ll be somewhat controlled.
> From: Vint Cerf [mailto:vint at google.com]
> Sent: Wednesday, March 24, 2010 8:57 AM
> To: Shawn Steele
> Cc: idna-update at alvestrand.no
> Subject: Re: FYI: Extending IDNA to other protocols (Nick Teint)
> this requires a server change. good luck.
> On Wed, Mar 24, 2010 at 11:50 AM, Shawn Steele <Shawn.Steele at microsoft.com> wrote:
> I much prefer the EAI method of using UTF-8 instead of the punycode hack. (http://www.ietf.org/dyn/wg/charter/eai-charter). Indeed several vendors already seem to be working on EAI solutions.
> For one thing, punycode has proven that it clutters the layers of an application and leads to terrible confusion about when an IDN name moves from Unicode to Punycode, requiring that the application layer have a deep understanding of DNS. It'd be much better to "fix" the protocols to make them comply with RFC 2277 ("Protocols MUST be able to use the UTF-8 charset"), rather than provide hacks.
> Date: Tue, 23 Mar 2010 21:16:53 +0100
> From: Nick Teint <nick.teint at googlemail.com>
> Subject: FYI: Extending IDNA to other protocols
> To: idna-update at alvestrand.no
> Message-ID: <7dabd4501003231316p2fd9ad24g385b5479af0a6c6 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> Today, I've submitted several Internet-Drafts describing a proposed
> framework to use IDNA(bis) for non-domain addresses.
> The basic idea is to extract anything from the address that fits the
> syntax of a valid domain name "label", i.e. strings that roughly match
> the "LDH" syntax for "A-labels" and "U-labels". The extracted strings
> are then converted using a conversion very similar to IDNAbis.
> The draft for the base is:
> Examples for profiles:
> Idna-update mailing list
> Idna-update at alvestrand.no