Comments on IDNA Bidi

JFCM jefsey at jefsey.com
Fri Jan 18 16:10:03 CET 2008


At 00:42 18/01/2008, John C Klensin wrote:
>--On Thursday, 17 January, 2008 18:31 +0100 JFCM
><jefsey at jefsey.com> wrote:
>
> > At 15:54 17/01/2008, John C Klensin wrote:
> >> The short answer is that it is not realistic, in the general
> >> case,  to impose restrictions on one label based on the
> >> contents of another.
> >
> > Except if you use classes triggered by an included code.
> > This would not be backward compatible with the current DNS (which
> > would protect against confusion), but in the case of an IDN class
> > it could be OK for an additional bidi class.
>
>Jefsey,
>You are wasting your time and ours.  If you want to pursue an
>entirely different approach, as one of your earlier notes
>indicated that you intended to do, by all means do so.  But, if
>you are going to make suggestions here, please familiarize
>yourself sufficiently with the DNS that they are plausible.

Dear John,
I understand what you really object to, and why you play the DNS
ignorance tune. I have weighed that.

However, we have had an IDNA problem lagging for 8 years, and an
associated multilanguage support problem for 4 years. They are on the
verge of disrupting the stability of the Internet. We cannot continue
to stall and argue forever about complicated blocking details that no
one will understand and trust, at least among operators. IDNA has to
be helped and IDNB investigated. After all, you were the one who first
raised the use of classes with ICANN, which took it into
consideration. I have tested what I could call virtual classes. The
time has come to move. Language names are not the only names in town
that network naming must support. We need a simple, clear,
off-the-shelf way to support that kind of need.

IDNA faces one single main problem: the lack of an Internet
presentation layer. The only convenient way I see (as documented and
experimented with, each in their own way, by several people) is to use
externets (external-network lookalikes within the network; "open
walled garden" could be an Internet wording). Externets are usually
implemented using user classes and host groups (or a constrained mix,
as in OSI Closed User Groups).

IDNA faces a second set of problems, which are specific to Unicode.
IMHO this prevents IDNA from being a lasting universal solution, but
not from being a solution now, in parallel with the current DNS,
and further on with my suggested IDNB and probably other solutions.
RFC 1958, 3.1: "Heterogeneity is inevitable and must be supported by
design."

The difficulty IDNA and Unicode meet comes from trying to address
everything in one shot. In such a case, why not go by the book and
help the Unicode people through by conforming with RFC 1958, 3.5:
"Keep it simple. When in doubt during design, choose the simplest
solution." and 3.6: "Modularity is good. If you can keep things
separate, do so." Old recipes make good dinners.

The first problem we have, and the world does not care about it, is
that we want to support all the scripts in the same way, while our
existing solution supports only ASCII characters, one single character
set. We want to extend, when we should multiply. No one will have any
problem with script-by-script support: no more phishing, probably
no more babelnames. No one will have any problem if some scripts need
bidi: those scripts will be identified and supported as such. There
are fewer than 100 scripts, which means fewer than 100 script-specific
classes. I have no doubt punycode can tell which script it processed,
deny multiscript names, and assign the class.
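
To make the per-label step concrete, here is a minimal sketch, not a
tested proposal. It assumes the script can be approximated by the
first word of the Unicode character name (a rough heuristic, not the
real Unicode Script property), and that "needs bidi" means the label
contains a character of bidi type R, AL or AN. The names script_of
and classify_label are hypothetical.

```python
import unicodedata

# Bidi types treated here as "right-to-left" (assumption of this sketch).
RTL_BIDI_TYPES = {"R", "AL", "AN"}

def script_of(ch):
    # Heuristic only: "HEBREW LETTER SHIN" -> "HEBREW".
    # The real Script property is not exposed by the standard library.
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def classify_label(label):
    """Return (script, needs_bidi) for one label, or raise if multiscript."""
    # Decode an ACE label ("xn--...") with the standard punycode codec.
    if label.lower().startswith("xn--"):
        text = label[4:].encode("ascii").decode("punycode")
    else:
        text = label
    # Only letters vote on the script; digits and hyphens are ignored.
    scripts = {script_of(c) for c in text if c.isalpha()}
    if len(scripts) > 1:
        raise ValueError("multiscript label denied: " + ", ".join(sorted(scripts)))
    needs_bidi = any(unicodedata.bidirectional(c) in RTL_BIDI_TYPES for c in text)
    script = scripts.pop() if scripts else None
    return script, needs_bidi
```

With this sketch, an all-Latin label is reported as ("LATIN", False),
a Hebrew label as ("HEBREW", True), and a label mixing Latin and
Cyrillic letters is refused. The returned script is what a
script-specific class would be assigned from.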

>Even with an alternate class approach, which is definitely not
>on the table at the moment, one cannot, in general, tie the
>interpretation of, or matching rules for, one label to the
>content of another.

Correct. However, this is not what I said. I say that one label can
include a script class indicator, committing the whole DN, or
blocking it if it is not coherent. This is very simple: every Unicode
code point tells its script off the shelf. The ccTLD tables help with
the authorised character set within a script. I do not like CLDR much,
but I see no real problem in CLDR supporting the necessary tables. If
Mark Davis dislikes the idea (he did not want to discuss IDNs in LTRU)
there is no problem in putting them in the netlocale files we will use
for other purposes (cctags).
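
As an illustration of "committing the whole DN", the following is a
rough sketch only. It assumes the indicator is simply the script of
the first letter met, approximated by the first word of the Unicode
character name. The class numbers in SCRIPT_CLASS and the function
commit_name are hypothetical, invented for this example; they are not
DNS CLASS values.

```python
import unicodedata

# Hypothetical script-to-class table, for illustration only.
SCRIPT_CLASS = {"LATIN": 1, "HEBREW": 2, "ARABIC": 3, "CYRILLIC": 4}

def label_scripts(label):
    # Heuristic: the first word of the character name stands in for the script.
    return {unicodedata.name(c, "UNKNOWN").split()[0]
            for c in label if c.isalpha()}

def commit_name(name):
    """Return the class the whole name is committed to, or None if blocked."""
    committed = None
    for label in name.split("."):
        for script in label_scripts(label):
            if committed is None:
                committed = script      # the first script seen commits the DN
            elif script != committed:
                return None             # incoherent name: block it
    # Scripts without a table entry (or names without letters) are also blocked.
    return SCRIPT_CLASS.get(committed)
```

A name whose labels are all Hebrew is committed to the Hebrew class,
while a Hebrew label under an ASCII TLD is blocked. No label's matching
rules depend on another label here; the name is only accepted or
refused as a whole.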

>Independent of whether they might be of
>any use in dealing with IDNs, DNS Classes are very well defined
>and neither "included code" nor inter-label dependencies fit
>into that definition.

Yes, but we are not discussing the class definitions or properties,
just the way they are triggered.

Where you are right is that classes (which are not supported by
browsers) are not on the table. However, we have a problem on the
table that, IMHO (I may be wrong or too early), one cannot solve
without using them.

So, the question is how to support classes (and possibly groups) with
an existing network system which does not support them. IMHO the
solution is to accept the heterarchical nature of naming and to
proceed accordingly. The benefit is that Unicode-based IDNA, IDNB,
X.500, RFIDs, semantic addressing, etc. would be equally supported
without affecting the URI/IRI. IDNA wants to do it at the application
level, I prefer to investigate it at the network level, but the
forking is the same.

>Some of the folks who are interested in "language domains" keep
>making the error of believing this as well.  While one could
>imagine policies that would keep an entire domain tree
>homogeneous with respect to language, they would still not be
>able to affect the interpretation of those labels

Please remember, you were the first to underline that we are not
speaking of languages (they are at the "intersem", the semantic layers
above) but of scripts at the internet layers. This gives Unicode all
the flexibility they want, even in adapting some of the scripts'
behaviours to the Internet.

>  (in addition to being very difficult to enforce).

Remark: this is based upon the "specification patch" that if I do not
read or write Chinese, I do not care about resolving Chinese-script
URLs (this is consistent with Mark Davis's language/script filtering
in RFC 4647). So I do not say it addresses all the cases, but it
unlocks the current situation until a script-independent IDNB solution
can be worked out.
jfc