Unicode 5.2 -> 6.0

Mark Davis ☕ mark at macchiato.com
Thu Oct 14 23:01:39 CEST 2010


I agree with John; there are millions of registries, not just hundreds.

====

In terms of Patrik's proposed disposition, I agree 2/3 with him. There are
two distinct issues.

*1) DISALLOWED => PVALID*

*U+0CF1 KANNADA SIGN JIHVAMULIYA*
*U+0CF2 KANNADA SIGN UPADHMANIYA*


These don't cause any problem. With each new version of Unicode, this
happens with thousands of characters; all the new ones. Having IDNA2008 just
follow Unicode is the right thing to do.


*2) PVALID => DISALLOWED*

*U+19DA NEW TAI LUE THAM DIGIT ONE*

Patrik mentioned having a 'bug' in Unicode show up in IDNA2008. But what
qualifies as a bug depends on the usage. It was a bug in the Unicode
properties that the NEW TAI LUE THAM DIGIT ONE was categorized as a decimal
number. For general usage, that is indeed a significant bug, and needed to
be fixed. But the world of identifiers (including IDNA2008) is much much
narrower. Would it be a significant bug to have it grandfathered in
domain names? If someone has a NEW TAI LUE THAM DIGIT ONE in a domain name,
does that cause any real problem?

My answer is *no*. It will not cause any problems in domain names, no more
than the average character that is already allowed in domain names (such as
the many historic characters). The choice of exactly which characters are in
domain names is, as we all know, somewhat fuzzy. The decision to generally
disallow non-decimal numbers in domain names (
http://unicode.org/cldr/utility/list-unicodeset.jsp?a=[:no:][:nl:]%26[:isnfkc:]&g=sc)
is perfectly reasonable, but it doesn't hurt to leave one of them (NEW TAI
LUE THAM DIGIT ONE) in via Clause G, in order to preserve stability.

The stability of domain names is far more important: *once a domain
name is valid, it should remain so*. We know that there is a significant
break before and after Aug 2010, but that should be the last such break. And
we have the mechanism in Clause G to guarantee that, if there is the will to
do so.
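
To make the two cases concrete, here is a rough sketch in Python. It is not
the real derivation algorithm: the category check is only a stand-in for the
letter/digit rule, the function and table names are made up for
illustration, and the Clause G entry is the hypothetical exception argued
for above, not something that exists today. The general categories are the
ones that changed between 5.2 and 6.0.

```python
# Illustrative sketch only; the real IDNA2008 derivation has many more rules.

# General categories of the three affected code points in each version.
CATEGORY = {
    "5.2": {0x0CF1: "So", 0x0CF2: "So", 0x19DA: "Nd"},
    "6.0": {0x0CF1: "Lo", 0x0CF2: "Lo", 0x19DA: "No"},
}

# Rough stand-in for the letter/digit rule (an assumption of this sketch).
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}

# Hypothetical Clause G entry that pins U+19DA to its earlier status.
BACKWARD_COMPATIBLE = {0x19DA: "PVALID"}


def derived_status(cp, version, overrides=None):
    """Return PVALID or DISALLOWED for cp under the given Unicode version."""
    overrides = overrides or {}
    if cp in overrides:
        # A Clause G entry wins over whatever the properties now say.
        return overrides[cp]
    category = CATEGORY[version][cp]
    return "PVALID" if category in LETTER_DIGITS else "DISALLOWED"


for cp in sorted(CATEGORY["6.0"]):
    print(
        f"U+{cp:04X}",
        derived_status(cp, "5.2"),
        derived_status(cp, "6.0"),
        derived_status(cp, "6.0", BACKWARD_COMPATIBLE),
    )
```

The third column is the point: with the one-entry table, a label containing
U+19DA that was valid under 5.2 stays valid under 6.0, while the two Kannada
signs simply follow Unicode.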

Mark

*— Il meglio è l’inimico del bene —*


2010/10/14 John C Klensin <klensin at jck.com>

>
>
> --On Thursday, October 14, 2010 13:22 -0700 Tina Dam
> <tina.dam at icann.org> wrote:
>
> > Hi Patrick, thanks for this. It was sooner than I expected...
> >
> > In terms of the forward progress, I agree with your
> > recommendation for the specific example. Generally I think in
> > order to choose between the options (accept change or add
> > exception for backward compatibility) I would personally like
> > to hear from registries that have implemented the character
> > and who may be disadvantaged by its elimination.
>
> Tina, I know you know this but, just as it is sometimes easy to
> lose sight of the fact that the DNS is not just about "web
> addresses", it is sometimes easy to lose track of the depth and
> extent of the tree.  The meaning of "registry" in IDNA2008 (and
> similar terminology in IDNA2003) extends to every zone
> administrator of a zone in the DNS.  In principle, the
> administrator of a subdomain somewhere deep in the DNS tree
> could have chosen to utilize one or more of these characters
> without having to discuss that decision with anyone else.
>
> While I believe that the probability of that having occurred is
> actually very low, any sort of decision process associated with
> "ask the registries" or "hear from the registries" has most of
> the properties of a search for a universal negative: if someone
> comes forward and says "yes, I am using one of those" it gives
> us a lot of information but the absence of such an answer tells
> us almost nothing... and, in this day of restrictions on zone
> transfers, there is no feasible way to walk the entire tree.
>
> > For this specific case, I am personally unaware of any IDNs
> > with the New Tai Lue script. However, we should probably make
> > some sort of process where the gTLD registries and ccTLD
> > registries are asked to provide relevant input if/when this
> > happens again.
>
> While I think some mechanism of that sort would be a good idea,
> we also need to remember that, while many people are thinking
> about IDN TLDs in terms of script-homogeneous or
> language-homogeneous trees and FQDNs, there is, in general, no
> plausible way to enforce such requirements.  Even if one could
> do that for IDN TLDs and their subtrees, it is not possible for
> subtrees of existing, non-IDN, TLDs.
>
> best,
>    john
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>