Comments on IDNA Bidi
jefsey at jefsey.com
Fri Jan 18 16:10:03 CET 2008
At 00:42 18/01/2008, John C Klensin wrote:
>--On Thursday, 17 January, 2008 18:31 +0100 JFCM
><jefsey at jefsey.com> wrote:
> > At 15:54 17/01/2008, John C Klensin wrote:
> >> The short answer is that it is not realistic, in the general
> >> case, to impose restrictions on one label based on the
> >> contents of another.
> > Except if you use classes triggered by an included code.
> > This would not be backward compatible with current DNS (which
> > would protect from confusion), but in the case of an IDN class
> > it could be OK for a bidi additional class.
>You are wasting your time and ours. If you want to pursue an
>entirely different approach, as one of your earlier notes
>indicated that you intended to do, by all means do so. But, if
>you are going to make suggestions here, please familiarize
>yourself sufficiently with the DNS that they are plausible.
I understand what you really object to and why you play the DNS
ignorance tune. I weighed that.
However, we have an IDNA problem that has been lagging for 8 years,
and an associated multilanguage support problem for 4 years. They are
on the verge of disrupting the stability of the Internet. We cannot
continue to stall and argue forever about complicated blocking details
that no one, at least among operators, will understand or trust. IDNA
has to be helped and IDNB investigated. After all, you first initiated
the use of classes with ICANN, which took it into consideration. I
tested what I could call virtual classes. The time has come to move.
Language names are not the only names in town that the network naming
must support. We need a simple, clear, off-the-shelf way to support
that kind of need.
IDNA faces one single main problem: the lack of an Internet
presentation layer. The only convenient way I see (as documented and
experimented with, in their own ways, by several people) is to use
externets (an external network lookalike within the network - "open
walled garden" could be an Internet wording). Externets are usually
implemented using user classes and host groups (or a constrained mix,
as in OSI Closed User Groups).
IDNA faces a second set of problems which are specific to Unicode.
IMHO this prevents IDNA from being a lasting universal solution, but
not from being a solution set now, in parallel with the current DNS,
and further on with my suggested IDNB and probably other solutions.
RFC 1958, 3.1: "Heterogeneity is inevitable and must be supported by
design." The difficulty IDNA and Unicode meet comes from trying to
address everything in one shot. In such a case, why not go by the book
and help the Unicode people through by conforming to RFC 1958, 3.5:
"Keep it simple. When in doubt during design, choose the simplest
solution." and 3.6: "Modularity is good. If you can keep things
separate, do so." Old recipes make good dinners.
The first problem we have, and the world does not care about it, is
that we want to support all the scripts in the same way, while we have
an existing solution which only supports ASCII characters, one single
character set. We want to extend, when we should multiply. No one will
have any problem with script-by-script support - no more phishing,
probably no more babelnames. No one will have any problem if some
scripts need bidi - those scripts will be identified and supported as
such. There are fewer than 100 scripts, which means fewer than 100
script-specific classes. I have no doubt punycode can tell which
script it processed, deny multi-script names, and assign the class.
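To make the last point concrete, here is a minimal sketch of what such a per-label check could look like. It is an illustration only, not part of IDNA or of any proposal on the table. It assumes the script can be approximated by the first word of each character's Unicode name (the Python standard library does not expose the Unicode Script property), and the names decode_label, label_script and COMMON are hypothetical.

```python
import unicodedata

# Characters allowed in any label regardless of script (assumption:
# ASCII digits and the hyphen are treated as script-neutral).
COMMON = set("0123456789-")

def decode_label(label: str) -> str:
    """Return the Unicode form of a label, decoding the xn-- form if present."""
    if label.lower().startswith("xn--"):
        # Python's built-in "punycode" codec handles the part after the prefix.
        return label[4:].encode("ascii").decode("punycode")
    return label

def label_script(label: str):
    """Return the single script used by the label, or None if there is
    no script-bearing character or if several scripts are mixed."""
    scripts = {
        # Rough approximation: the first word of the character name,
        # e.g. "LATIN", "CYRILLIC", "HEBREW".
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in decode_label(label)
        if ch not in COMMON
    }
    return scripts.pop() if len(scripts) == 1 else None
```

With this sketch, label_script("xn--bcher-kva") gives "LATIN", while a label mixing Latin and Cyrillic letters gives None and would be denied. A real implementation would use the Unicode Script property and per-script tables rather than character names.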
>Even with an alternate class approach, which is definitely not
>on the table at the moment, one cannot, in general, tie the
>interpretation of, or matching rules for, one label to the
>content of another.
Correct. However, this is not what I said. I say that one label can
include a script class indicator, committing the whole DN - or
blocking it if not coherent. This is very simple: every Unicode code
point tells its script off the shelf. The ccTLD tables help for the
authorised character set in a script. I do not like CLDR much, but I
see no real problem in CLDR supporting the necessary tables. If Mark
Davis dislikes the idea (he did not want to discuss IDNs in LTRU)
there is no problem in putting them in the netlocale files we will use
for other purposes (cctags).
>Independent of whether they might be of
>any use in dealing with IDNs, DNS Classes are very well defined
>and neither "included code" nor inter-label dependencies fit
>into that definition.
Yes, but we do not discuss the class definitions or properties, just
the way they are triggered.
Where you are right is that classes (which are not supported by
browsers) are not on the table. However, we have a problem on the
table that - I may be wrong or too early - one cannot solve, IMHO,
without using them.
So, the question is to know how to support classes (and possibly
groups) with an existing network system which does not support them.
IMHO the solution is to accept the heterarchical nature of the naming
and to proceed accordingly. The interest is that it would make
Unicode-based IDNA, IDNB, X.500, RFIDs, semantic addressing, etc.
equally supported without affecting the URI/IRI. IDNA wants to do it
at the application level; I prefer to investigate it at the network
level, but the forking is the same.
>Some of the folks who are interested in "language domains" keep
>making the error of believing this as well. While one could
>imagine policies that would keep an entire domain tree
>homogeneous with respect to language, they would still not be
>able to affect the interpretation of those labels
Please, remember, you were the first to underline that we do not
speak of languages (they are at the "intersem" - the semantic layers
above) but of scripts at internet layers. This gives Unicode all the
flexibility they want, even in adapting some of the scripts'
behaviours to the Internet.
> (in addition to being very difficult to enforce).
Remark: this is based upon the "specification patch" that if I do not
read/write Chinese, I do not care about resolving Chinese-script URLs
(this is consistent with Mark Davis's language/script filtering, RFC
4647). So, I do not say it addresses all the cases; but it unlocks
the current situation until a script-independent IDNB solution can be