draft-phillips-langtags-08 script subtags and matching

Peter Constable petercon at microsoft.com
Sun Jan 2 10:05:35 CET 2005


> From: Tex Texin [mailto:tex at xencraft.com]

> It is a different thing entirely to change a standard in a way that requires
> admins to choose unnecessarily between existing and new formats that result in
> different user experiences.

A site admin does not have to change anything just because there's a new
RFC that offers more choices. If someone has content tagged right now as
"sr-CS", then there's nothing compelling them to change. Not unless they
aren't happy with the experience they're providing for users. I would
expect them to change from "sr-CS" to "sr-Latn-CS" only if users were
unhappy getting Latin content when they requested "sr" or "sr-CS", or if
users were starting to request "sr-Latn" or "sr-Latn-CS" because they
weren't happy getting Cyrillic.

If most content tagged "sr-CS" right now is in Cyrillic script (I don't
know if that's the case) and that's what people asking for "sr" or
"sr-CS" expect, then there's not a problem. If lots of people start
requesting content explicitly in Latin or Cyrillic script, then a site
admin would likely change to follow suit.

Nobody's going to change their sites from e.g. "en-US" to "en-Latn-US".
This is only an issue in those cases, like Serbian, in which two scripts
are used. In those cases, nothing *has* to change, but there's certainly
a likelihood that script choices are a significant concern. The proposed
revision offers choices to deal with those needs in the way that we
think best serves users; if they want to change the tagging of their
content, they can.

The only situation that can lead to significant problems is if a site
admin has a problem that makes them decide to change to "sr-Cyrl-CS" or
"sr-Latn-CS" and they still have a lot of users requesting "sr-CS" not
finding the content they want. But, they were having problems before --
that's why they decided to change.

Of course, this leads to your objection above, and you'd say that they
could have changed to "sr-CS-Cyrl" and fixed their problems without
creating other problems for the people looking for "sr-CS". But let me
ask: are there really likely to be people asking for "sr-CS"? Serbian is
written within Serbia and Montenegro in both scripts -- Cyrillic by the
Serb majority, and Latin among minorities. I don't know for sure, but I
would guess that people care more about which script they get their
content in than whether the content used the CS dialect or spelling (if
there are even differences). If someone asks for "sr-CS", they have no
idea what script they'll get.

> Is there a good reason for script to be secondary?

Yes. Script distinctions are generally going to matter to users more
than dialect or spelling distinctions, and left-prefix matching gives
users better results when tags use the order lang-script-region.
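
To make the matching argument concrete, here is a rough sketch in Python
of left-prefix matching applied to the tags discussed in this thread. The
function names (prefix_match, select) are made up for illustration and do
not come from the draft; the sketch assumes matching is case-insensitive
and compares whole subtags only.

```python
def prefix_match(language_range, tag):
    """Return True if language_range is a left-prefix of tag,
    compared subtag by subtag and ignoring case."""
    wanted = language_range.lower().split("-")
    subtags = tag.lower().split("-")
    return subtags[:len(wanted)] == wanted


def select(language_range, tags):
    """Return the tags that language_range matches, in their original order."""
    return [tag for tag in tags if prefix_match(language_range, tag)]


# Content tagged in the order lang-script-region.
script_first = ["sr-Latn-CS", "sr-Cyrl-CS"]

# The same content tagged in the alternative order lang-region-script.
region_first = ["sr-CS-Latn", "sr-CS-Cyrl"]

# A user who cares about script asks for "sr-Latn".
latin_script_first = select("sr-Latn", script_first)
latin_region_first = select("sr-Latn", region_first)

# A user who asks only for language and region.
region_only_script_first = select("sr-CS", script_first)
region_only_region_first = select("sr-CS", region_first)
```

With script second, a request for "sr-Latn" finds "sr-Latn-CS"; with
script last it finds nothing. The cost is the one raised above: a request
for "sr-CS" no longer matches content retagged with a script subtag, while
under the region-first order it matches both scripts without saying which
one the user will get.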


Peter Constable

