idna-bis and 'ß'

Thomas Roessler roessler at does-not-exist.org
Tue Nov 27 00:06:16 CET 2007


On 2007-11-26 08:03:38 -0500, John C Klensin wrote:

> Put differently, to the extent to which IRIs specify a user
> interface behavior, it would be perfectly reasonable for the
> IRI spec to specify that SHARP S should be mapped to some
> other character or character sequence ("ss" by the
> orthography rules of some German-speaking countries, "fs" by
> appearance, "ſs" (U+017F U+0073) by origin, etc.).
> Certainly, if it is to be mapped to anything but itself,
> that needs to be specified.  

As you wrote in your earlier message, that is in fact only a
small part of the Grand Plan to get rid of mappings.  While I
sympathize with that plan, I worry that it might break
references to domain names in existing documents (read: Web
pages) -- in a place that doesn't really qualify as a user
interface.

While, in theory, it sounds attractive to finally treat the
Turkish dotless i (and similar peculiarities) reasonably by
dealing with them in a place where there is superior knowledge,
it would appear that at least in the Web use case non-ASCII
domain names will be processed in places where that knowledge
has already been lost (i.e., the user's browser when it hits
notepad-generated HTML content).  Even worse, the author's and
the user's browser might not be interoperating when it comes to
interpreting IRI references in content.

Effectively, this would seem to imply that (much of) the
nameprep mapping niceties would have to move from the IDN spec
to the IRI spec and other specifications layered on top of it.
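
To make the dotless-i concern concrete, here is a minimal Python
sketch.  It assumes a client that applies locale-independent Unicode
case mapping because it has no idea which language the content was
written in; the variable names are only illustrative.

```python
# Locale-independent case mapping, as a client without language
# knowledge would apply it.
dotless = "\u0131"            # LATIN SMALL LETTER DOTLESS I

upper = dotless.upper()       # maps to plain ASCII "I"
round_trip = upper.lower()    # maps to dotted "i", not back to U+0131

print(upper == "I")           # True
print(round_trip == dotless)  # False: the Turkish distinction is gone
```

Once the name has passed through such a step, nothing downstream can
tell whether the author meant the dotted or the dotless letter.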

> But it should not be an IDNA problem, especially since IRIs
> might choose to map it differently in different contexts (I
> don't need to remind either of you that tails are
> case-sensitive so the IDNA2003 rules don't apply).

More significantly, "interesting" processing of tails happens
close to places where they were authored, so much of the
concern goes away for that part of the URI anyway.

On 2007-11-26 09:44:30 -0500, John C Klensin wrote:

>> Has anyone started working on iri-bis? Is there a draft?

> There is a draft but, if I correctly recollect the content of
> a recent note from Martin, the most recent version has not
> been posted.  

Skimming through the latest published draft
(draft-duerst-iri-bis-01), I notice that it simply refers
to ToASCII as far as IDNs are concerned.  For normalization, it
relies on Nameprep.
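
For illustration, the effect of that combination on 'ß' can be seen
with Python's built-in "idna" codec, which implements IDNA2003
(RFC 3490) including the Nameprep step.  This is only a sketch of the
IDNA2003 behavior, not of anything the IRI draft itself defines, and
the domain "straße.example" is made up for the example.

```python
# Python's standard-library IDNA2003 implementation.
from encodings.idna import nameprep

label = "straße"

# Nameprep alone: case folding maps SHARP S to "ss".
prepared = nameprep(label)

# ToASCII applied per label by the codec; the result is plain ASCII,
# so no "xn--" prefix is needed and the original 'ß' cannot be recovered.
ascii_name = "straße.example".encode("idna")

print(prepared)     # strasse
print(ascii_name)   # b'strasse.example'
```

In other words, under the current rules the mapping happens inside
ToASCII itself, which is exactly the step that would have to be
specified elsewhere if IDNA stopped doing it.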

> My personal prediction is that it may be in for an
> "interesting" ride: there still seem to be ambiguities about
> the degree to which it is a user interface spec versus
> something to be used on the wire, some of the script
> communities that are very different from European ones don't
> see it as an adequate solution to their problems, etc.  I
> think all of those issues could be addressed with a very
> clear scope statement, but we may have problems agreeing on
> such a statement.

-- 
Thomas Roessler			      <roessler at does-not-exist.org>

