Not changing the prefix

John C Klensin klensin at jck.com
Sat Feb 28 07:24:57 CET 2009



--On Friday, February 27, 2009 07:58 -0800 Erik van der Poel
<erikv at google.com> wrote:

> OK, but using a different prefix for all scripts and languages
> in IDNA2008 would just be silly. Why would we disrupt all of
> the scripts and languages that are served well by IDNA2003 and
> xn--?
> 
> Note that Greek is not served well by IDNA2003, as Vaggelis has
> explained many times (tonos issue).

Erik,

I understand, I think, what you are suggesting for Greek --where
Greek is defined as a label consisting only of characters from
the Greek script, located in a subtree of the .GR domain or some
other domain that has a major focus on Greek characters.

But, remembering that we've decided that any rules restricting
mixed-script labels are a registry problem, please consider the
following possible labels --labels that make sense and that are
not visually confusing --
    μvolt
    sevenσ   (or perhaps sevenς or even 7ς )
Given the use of Greek characters in various mathematical and
engineering contexts, I'm sure that we could generate many such
examples that would not be subject to standard Greek
orthographical rules because they aren't even close to being
Greek words.

But those two examples are particularly interesting because,
while "μvolt" would be generally recognized around the world,
Μvolt (that is U+039C, not U+004D) would not make any sense,
mapping or no mapping.   Similarly, the "Seven Sigma" folks
really would expect to see "sevenσ" or "sevenΣ" and not
"sevenς", which would probably make no sense to most of them.

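To make the mapping point concrete, here is a minimal sketch of
what the existing IDNA2003 rules do to these labels. It relies on
Python's built-in "idna" codec, which implements IDNA2003 (RFC
3490) including the nameprep case folding. The helper names
idna2003_alabel and idna2003_roundtrip are invented for this
illustration only.

```python
# Sketch: how IDNA2003 mapping treats the example labels.
# Python's built-in "idna" codec implements IDNA2003 (RFC 3490),
# including nameprep case folding. Helper names are illustrative.

def idna2003_alabel(label: str) -> str:
    """Return the IDNA2003 ASCII ("xn--") form of a single label."""
    return label.encode("idna").decode("ascii")


def idna2003_roundtrip(label: str) -> str:
    """Encode with IDNA2003, then decode, to show what survives."""
    return label.encode("idna").decode("idna")


if __name__ == "__main__":
    examples = [
        "seven\u03c3",  # small sigma
        "seven\u03a3",  # capital sigma
        "seven\u03c2",  # final sigma
        "\u03bcvolt",   # small mu
        "\u039cvolt",   # capital mu (U+039C, not U+004D)
    ]
    for label in examples:
        # ascii() keeps the output unambiguous about which sigma/mu it is.
        print(f"{label!a:16} -> {idna2003_alabel(label):18}"
              f" -> {idna2003_roundtrip(label)!a}")
```

Under these rules all three sigma spellings produce one and the
same "xn--" label, and the capital and small mu forms likewise
collapse to one, so a lookup cannot tell afterwards which form
the user typed or expected to see displayed.
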
Now assume that labels of that general type --a Greek character
or two in an otherwise-Latin, or other script, label -- appear
in a gTLD or some other zone that is not Greek-specific.  Would
you still use a different prefix and...

	(i) How would you expect it to be interpreted?
	
	(ii) Would you expect an application to look up the new
	prefix first (with the new encoding rules), the old
	("xn--") prefix first, both of them, or something else?
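
The choices in (ii) can be sketched as follows. Everything about
the "new" side is an assumption made purely for illustration: the
prefix "xq--" is invented (no such prefix has been proposed
here), and the "new encoding rules" are modelled as plain
Punycode (RFC 3492) with no character mapping at all. The old
side uses Python's built-in IDNA2003 codec.

```python
# Sketch of the lookup-order question. NEW_PREFIX and the "new rules"
# (Punycode with no mapping) are hypothetical, for illustration only.

OLD_PREFIX = "xn--"
NEW_PREFIX = "xq--"  # hypothetical; not a real or proposed prefix


def old_alabel(label: str) -> str:
    """IDNA2003 form: nameprep mapping, then Punycode, prefix xn--."""
    return label.encode("idna").decode("ascii")


def new_alabel(label: str) -> str:
    """Assumed new form: Punycode of the label exactly as typed."""
    return NEW_PREFIX + label.encode("punycode").decode("ascii")


def lookup_candidates(label: str, strategy: str) -> list[str]:
    """Return the ASCII labels to query, in order, for a strategy."""
    old, new = old_alabel(label), new_alabel(label)
    orders = {
        "new-first": [new, old],
        "old-first": [old, new],
        "new-only": [new],
        "old-only": [old],
    }
    if strategy not in orders:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return orders[strategy]


if __name__ == "__main__":
    for strategy in ("new-first", "old-first"):
        print(strategy, lookup_candidates("seven\u03c2", strategy))
```

With "new-first" or "old-first", every non-ASCII label can cost
two queries, and the two prefixes may resolve to different
registrants; with either "-only" choice, labels registered under
the other prefix simply stop resolving.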

Before you answer, please assume that Greek is not the only
situation where special mapping or display rules will be wanted
if they are possible.   Jefsey has suggested that French is
another example.  A number of the "decoration-optional" scripts
and usages might be, including Arabic, Vietnamese, and Pinyin,
even though some of those can be handled, if sub-optimally, in
other ways.  Mark's note suggests that some of the Indic scripts
might be candidates, possibly with or without ZWJ/ZWNJ.

It is an unfortunate property of this process that people who
"represent" scripts and actively participate on the mailing
list are more likely to get special consideration for their
issues than those who are quieter or absent altogether.
But we need to be careful not to trap ourselves into believing
that those who are not speaking up have no problems or
issues... or will have none once they have more of a presence
on the Internet.

      john


