Visually confusable characters (8)

Sun Aug 10 22:35:18 CEST 2014

On 8/9/2014 10:48 AM, John C Klensin wrote:
>

John,

I thought it best to reply to your points individually, as some
branches of the discussion are probably not going to be as deep.

As you wrote them "in no particular order", I'm going to respond
to them in the same way.

This message responds to point (8)

A./
>
>
> (8) Despite all of the above, it should be possible to invent
> and enforce some wide-ranging "additional protocol" that would
> understand human perceptions about homographs, know what was in
> the DNS at all levels and all trees (or at least all relevant
> ones, where "relevant" certainly extends below the second level)
> and that they could be used to enforce some sort of "variant" or
> "similarity prohibition" rules on registrations and delegations
> at potentially deep levels of the tree, enforcing those rules on
> both types of aliases and maybe web redirects as well as
> delegated subdomains and the host-type records in them.
>
> Nice fantasy.  DNS doesn't work that way, the Internet doesn't
> either, and that is probably A Good Thing.  For example, except
> when required by contract or regulation, we no longer allow
> people to obtain a list of the labels allocated in or delegated
> from a particular domain.  Once one gets away from the root and
> TLD contents, the contractual requirements are very rare.   Some
> of us consider the general inability to ask such questions to be
> an important privacy mechanism as well as having some
> performance and operational advantages.  One can find out if a
> name is already associated with DNS records, but that test is
> not completely reliable due to various race conditions, hidden
> domains and subdomains, etc.
>
> Moreover, the reasons for an administratively-distributed
> hierarchy aren't just to spread the workload around.  It is to
> allow different parts of the domain tree to have different
> policies.  A "one size fits all" naming model doesn't help.   In
> addition, while I have some concerns about the desirability and
> workability of some of Jefsey's ideas, it is clear to me that
> even the most reasonable of them depend heavily on being able to
> have different naming conventions and experiences to match
> different user (or user group or nationality) preferences or
> requirements.  A global "additional protocol" with its own
> naming rules would probably make that impossible even if it were
> otherwise feasible.
>
>

John,

as of today, confusables like "rn" for "m" and a host of others ranging
from near homographs to "arm's-length/small-type" confusables are
not addressed by the repertoire, context, or normalization rules built
into IDNA2008.

Yet they are being addressed; and by necessity, as you describe, this
has to be done zone by zone. The results may differ.
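
To illustrate what "zone by zone" can mean in practice, here is a
minimal sketch, not a description of any registry's actual procedure.
It assumes each zone keeps its own table of confusable sequences and
compares a candidate label against existing labels by a folded
"skeleton". The table contents, the names ZONE_A_RULES and ZONE_B_RULES,
and the functions skeleton() and find_conflicts() are all hypothetical.

```python
# Hypothetical per-zone confusable tables. A real zone would use a far
# larger table; these entries are illustrative assumptions only.
ZONE_A_RULES = {"rn": "m", "vv": "w", "0": "o"}
ZONE_B_RULES = {}  # a zone that chooses not to apply any such rule


def skeleton(label, rules):
    """Fold a label to a comparison form using one zone's rules."""
    folded = label.lower()
    for sequence, replacement in rules.items():
        folded = folded.replace(sequence, replacement)
    return folded


def find_conflicts(candidate, registered, rules):
    """Return registered labels whose skeleton matches the candidate's."""
    target = skeleton(candidate, rules)
    return [
        name
        for name in registered
        if name != candidate and skeleton(name, rules) == target
    ]


registered = ["modem", "example"]
conflicts_in_zone_a = find_conflicts("modern", registered, ZONE_A_RULES)
conflicts_in_zone_b = find_conflicts("modern", registered, ZONE_B_RULES)
```

Under these assumed tables the same candidate collides with "modem" in
the first zone and with nothing in the second, which is the sense in
which results differ from zone to zone.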

I read Jefsey's ruminations as an attempt to identify approaches that
would improve the results and robustness of these zone-by-zone efforts
as much as is feasible in a distributed manner.

"Solving" partial issues up front in IDNA2008 is certainly attractive,
but undercuts the ability of different zones to have different rules.
I could imagine that the Fula issue is one that is in a class that would
collectively benefit from being addressed (properly) on the top levels
of the DNS, but I can't see that a blanket prohibition would work well
on subdomains for entities that are local to that language area.

If this code point was the only case in an otherwise clean system,
it would draw much less argument. But the selection seems ad-hoc,
and the cost of excluding a particular language community from
full participation appears to be high (in political terms), while
users in other languages (i.e. in the Arabic language) are rather
unlikely to be affected (the sequence appears limited to Koranic
uses).

I can't help the impression that this case is an ad-hoc elevation of
a single arbitrary instance of what are otherwise several dozen
edge cases, some of which might quite well be worse by an
order of magnitude.

A./