Not changing the prefix

Erik van der Poel erikv at google.com
Sat Feb 28 17:31:44 CET 2009


On Fri, Feb 27, 2009 at 10:24 PM, John C Klensin <klensin at jck.com> wrote:
> --On Friday, February 27, 2009 07:58 -0800 Erik van der Poel
> <erikv at google.com> wrote:
>> OK, but using a different prefix for all scripts and languages
>> in IDNA2008 would just be silly. Why would we disrupt all of
>> the scripts and languages that are served well by IDNA2003 and
>> xn--?
>>
>> Note that Greek is not served well by IDNA2003, as Vaggelis has
>> explained many times (tonos issue).
>
> Erik,
>
> I understand, I think, what you are suggesting for Greek --where
> Greek is defined as a label consisting only of characters from
> the Greek script, located in a subtree of the .GR domain or some
> other domain that has a major focus on Greek characters.
>
> But, remembering that we've decided that any rules restricting
> mixed-script labels are a registry problem, please consider the
> following possible labels --labels that make sense and that are
> not visually confusing --
>    μvolt
>    sevenσ   (or perhaps sevenς or even 7ς )
> Given the use of Greek characters in various mathematical and
> engineering contexts, I'm sure that we could generate many such
> examples that would not be subject to standard Greek
> orthographical rules because they aren't even close to being
> Greek words.
>
> But those two examples are particularly interesting because,
> while "μvolt" would be generally recognized around the world,
> Μvolt (that is U+039C, not U+004D) would not make any sense,
> mapping or no mapping.   Similarly, the "Seven Sigma" folks
> really would expect to see "sevenσ" or "sevenΣ" and not
> "sevenς", which would probably make no sense to most of them.

True.

> Now assume that labels of that general type --a Greek character
> or two in an otherwise-Latin, or other script, label -- appear
> in a gTLD or some other zone that is not Greek-specific.  Would
> you still use a different prefix and...

I realized that we wouldn't need a new prefix for Greek, because they
want more characters mapped (to remove tonos), not fewer. (The
problematic characters that would need a new prefix are Eszett, ZWJ
and ZWNJ, because we do not want those to be mapped.)
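
For reference, the IDNA2003 behaviour at issue can be observed with
Python's built-in "idna" codec, which implements RFC 3490 (IDNA2003).
This is only an illustrative sketch, and the helper name
idna2003_a_label is made up for the example: Eszett is mapped to "ss",
ZWJ and ZWNJ are mapped to nothing, final sigma is folded to sigma, and
tonos is left in place.

```python
def idna2003_a_label(label: str) -> str:
    """Return the IDNA2003 ASCII form of a single label.

    Hypothetical helper: it simply wraps Python's built-in "idna"
    codec, which applies the IDNA2003 (nameprep) mappings.
    """
    return label.encode("idna").decode("ascii")


# Eszett (U+00DF) is mapped to "ss", so the original character is lost.
eszett = idna2003_a_label("stra\u00dfe")

# ZWNJ (U+200C) and ZWJ (U+200D) are mapped to nothing.
zwnj = idna2003_a_label("a\u200cb")
zwj = idna2003_a_label("a\u200db")

# Final sigma (U+03C2) is folded to sigma (U+03C3), so both spellings
# of "seven-sigma" end up as the same encoded label.
final_sigma_same = (
    idna2003_a_label("seven\u03c2") == idna2003_a_label("seven\u03c3")
)

# Alpha with tonos (U+03AC) is NOT mapped to plain alpha (U+03B1),
# which is the Greek complaint: more mapping is wanted here, not less.
tonos_kept = idna2003_a_label("\u03ac") != idna2003_a_label("\u03b1")

print(eszett, zwnj, zwj, final_sigma_same, tonos_kept)
```

The first three cases are mappings that IDNA2003 already performs, so
turning them off would change what existing "xn--" lookups mean. The
tonos case would only add a mapping.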

My original reason for suggesting that implementations use different
mappings, depending on the TLD, was that we need some way for the
implementation to decide which mapping to use when the user is typing
on the keyboard.

However, a TLD-specific mapping approach does not scale well. If many
TLDs and near-top-level domains start demanding their own mappings,
it would quickly get out of hand.

I'm not sure whether there are any other ways for an implementation to
decide which mapping to use for keyboard-typed names. Although it
might be reasonable for the implementation to apply Turkish 'i'
case-mappings when the user interface itself is being presented in
Turkish, if we did such a thing for Greek tonos, users of other
languages would not get the same mapping even if they knew how to
enter Greek with tonos.
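
A minimal sketch of that problem, assuming a hypothetical
implementation that picks its mapping from the UI language. None of
these function names or the "el"/"tr" table come from any IDNA
specification, and the tonos removal shown (drop U+0301 after
canonical decomposition) is just one plausible way to do it:

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # what Greek tonos becomes after NFD


def strip_tonos(label: str) -> str:
    """Lowercase, then drop the combining acute accent (assumed rule)."""
    decomposed = unicodedata.normalize("NFD", label.lower())
    return unicodedata.normalize(
        "NFC", decomposed.replace(COMBINING_ACUTE, "")
    )


def turkish_lower(label: str) -> str:
    """Lowercase with Turkish dotted/dotless 'i' handling."""
    # Map the two special capitals first, then lowercase the rest.
    return label.replace("\u0130", "i").replace("I", "\u0131").lower()


# Hypothetical table: UI language tag -> mapping for typed labels.
UI_MAPPINGS = {"el": strip_tonos, "tr": turkish_lower}


def map_typed_label(label: str, ui_language: str) -> str:
    """Apply the mapping selected by the UI language (default: lower)."""
    mapping = UI_MAPPINGS.get(ui_language, str.lower)
    return mapping(label)


typed = "\u03ac\u03bb\u03c6\u03b1"  # "alpha" typed in Greek, with tonos
greek_ui = map_typed_label(typed, "el")    # tonos removed
english_ui = map_typed_label(typed, "en")  # tonos kept
print(greek_ui == english_ui)
```

The same keystrokes produce two different labels depending only on
the UI language, so a user with an English UI who types Greek with
tonos would look up a different name than a user with a Greek UI.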

>        (i) How would you expect it to be interpreted?
>
>        (ii) Would you expect an application to look up the new
>        prefix first (with the new encoding rules), the old
>        ("xn--") prefix first, both of them, or something else?
>
> Before you answer, please assume that Greek is not the only
> situation where special mapping or display rules will be wanted
> if they are possible.   Jefsey has suggested that French is
> another example.  A number of the "decoration-optional" scripts
> and usages might be, including Arabic, Vietnamese, and Pinyin,
> even though some of those can be handled, if sub-optimally, in
> other ways.  Mark's note suggests that some of the Indic scripts
> might be candidates, possibly with or without ZWJ/ZWNJ.
>
> It is an unfortunate property of this process that people
> "representing" scripts and who actively participate on the
> mailing list are more likely to get special consideration for
> their issues than those who are more silent or actually absent.
> But we need to be cautious to avoid trapping ourselves into
> believing that those who are not speaking up have no problems or
> issues... or will have none when they are more on the Internet.

True.

Erik


More information about the Idna-update mailing list