LANGUAGE TAG REGISTRATION FORM
Harald Tveit Alvestrand
harald at alvestrand.no
Fri Apr 11 19:23:11 CEST 2003
--On Friday, April 11, 2003 09:03:54 -0700 Mark Davis
<mark.davis at jtcsv.com> wrote:
> I can understand your concern. Well, among languages we need to make at
> least the distinctions that Windows and others make; we have to be able to
> interwork with major platforms.
So one of your goals is round-trip loss-free translation between Microsoft
Windows' language tagging system and the ICU language tagging system. Right?
What are the "others" you mention?
(hmmm.... I see the equivalent of Unicode's compatibility characters and a
need for language tag normalization down that road.... I don't like it, but
can see why we could need it....)
> If the end goal is to extend 3066bis to
> permit the equivalent of:
> 5. <iso_639_code> "-" <iso_15924_code>
> 6. <iso_639_code> "-" <iso_15924_code> "-" <iso_3166_code>
> then it does no harm to have the additional registrations. If we can only
> get az-Cyrl and az-Latn registered, or if the end goal for 3066bis will
> not permit both #5 and #6, then we would probably be forced to define our
> language codes as "based on" RFC 3066, but not identical.
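For illustration, the two tag shapes quoted above (#5 and #6) could be recognized roughly as follows. This is a minimal sketch, not anything defined by RFC 3066 or 3066bis: the name `parse_tag`, the pattern `TAG_RE`, and the assumed code lengths (2 or 3 letters for ISO 639, 4 for ISO 15924, 2 for ISO 3166) are assumptions made here, and the check is on shape only, not on whether a code is actually assigned.

```python
import re

# Hypothetical pattern for shapes #5 and #6 only:
#   <iso_639_code> "-" <iso_15924_code> [ "-" <iso_3166_code> ]
# Assumed lengths: language 2-3 letters, script 4 letters, region 2 letters.
TAG_RE = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"-(?P<script>[A-Za-z]{4})"
    r"(?:-(?P<region>[A-Za-z]{2}))?"
)


def parse_tag(tag):
    """Return (language, script, region) or None if the shape does not fit.

    Case is folded to a conventional form (az, Latn, AZ); region is None
    when the tag has shape #5.
    """
    m = TAG_RE.fullmatch(tag)
    if m is None:
        return None
    region = m.group("region")
    return (
        m.group("language").lower(),
        m.group("script").title(),
        region.upper() if region else None,
    )
```

With this sketch, `parse_tag("az-Latn")` fits shape #5 and `parse_tag("az-Cyrl-AZ")` fits shape #6, while a plain language-region tag such as `az-AZ` is rejected because the second subtag is not four letters long.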
Thanks - what I'm trying to understand is the shape of the forcing function.
> The registrations proposed are only the tip of the iceberg; eventually we
> could need up to something like the following list (where * means each of
> the various scripts used with the spoken language):
> for zh-*: HK, MO, CN, SG, TW, US,...
> for az-*: AZ, IR, ...
> for uz-*: AF, KZ, KG, TJ, TM, UZ, ...
> for sr-*: YU, BA, MK, HR, ...
> which is why a generative mechanism is much simpler.
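To make the size of that list concrete, a generative rule can be sketched as a cross product of scripts and countries for one language. The function name `generate_tags` is hypothetical; the az scripts (Cyrl, Latn) and countries (AZ, IR) are taken from the examples quoted above, and nothing here says which of the resulting combinations are actually in use.

```python
from itertools import product


def generate_tags(language, scripts, regions):
    """Enumerate language-script and language-script-region tags.

    Hypothetical helper: it only combines the given codes and does not
    check that any combination is meaningful or registered.
    """
    tags = [f"{language}-{script}" for script in scripts]
    tags += [
        f"{language}-{script}-{region}"
        for script, region in product(scripts, regions)
    ]
    return tags


# Scripts and countries for az as mentioned in the quoted message.
az_tags = generate_tags("az", ["Cyrl", "Latn"], ["AZ", "IR"])
```

Two scripts and two countries already give six tags for az alone, and the count grows as scripts times countries for each further language, which is the producer-side argument for a rule over individual registrations.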
It's definitely simpler for the producer of the codes. It may or may not be
simpler for their consumer; Unicode's ability to preserve round-trip
translations was one thing that engendered the concept of "unnormalized"
Unicode text; that certainly did not make life simpler for the consumers of