fon* variants

Mark Davis mark.davis at icu-project.org
Sat Dec 9 21:20:27 CET 2006


A few corrections. - MD

On 12/8/06, Doug Ewell <dewell at adelphia.net> wrote:
>
> Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:
>
> > I still don't like any "generic" variants, and think that extension
> > registries are a better approach.  On the other hand it's hard to
> > develop a proper extension from scratch, so maybe experimenting with
> > fon* variants for now is a good thing.  Until somebody has the time to
> > identify rules for a future "f" extension, deprecating the registered
> > fon* variants.
>
> Michael is right on this one.  Variants like "western" applied to the
> "Western" version of different languages would violate Section 3.5
> ("change the semantic meaning") and should not be accepted.  IPA is


I disagree -- there is no general consensus on that; it was just silly to
resort to constructed terms in a foreign language to avoid having useful,
productive variants. (Might as well have had esternWay and easternYay.)

> different; by design it can be applied to virtually any spoken language.
> (Note that this is not true for SAMPA, one of the many mappings of IPA
> onto ASCII; it is language-dependent.)
>
> I've pretty much given up on extensions.  The language tag people (OK,
> John Cowan) say they are for non-linguistic information, but it seems
> unlikely to me that the non-language tag people will go to the effort of
> writing an RFC and getting it through IETF, and setting up a mailing
> list.  They'll probably do what they have always done, create their own
> syntax.  Even ICU has created the ersatz variants "revised" and "posix"


The history is a bit off here. "revised" and "posix" predated 4646 by quite
some time. The Unicode CLDR project was at its most recent version, V1.4, on
2006-07-16, while RFC 4646 was only finally approved afterwards, in
September 2006.

Moreover, the Unicode CLDR project has been moving towards changing these to
be 4646 codes in LDML, as variants get encoded that can handle them (even
if, like polytonic, suboptimally). This has been communicated in several
emails on LTRU. The only remaining outlying case is POSIX, which we didn't
think the ietf-languages group would buy off on. (It basically means using
"neutral" terms corresponding to usage in computer languages.) If someone
were to come up with a good way to replace that with a 4646 variant tag, I
think the CLDR group would be all ears.

In a few cases, we use private use codes for cases where 4646 is
insufficient. Those are documented in
http://unicode.org/reports/tr35/#Identifiers, and available in
machine-readable form. We also clarify which codes are to be used for
unknown or invalid codes (subtags).

There is also a transformation of CLDR locales into conformant 4646 tags
(and back): see
http://unicode.org/reports/tr35/#Identifiers . We do intend to file for an
extension, basically so that we can use something like -u- instead of
-x-ldml.
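
To make the round trip concrete, here is a rough sketch of the kind of
mapping I mean. It is NOT the algorithm in TR35; the function names, the
"registered" set, and the subtag classification rules are all made up for
illustration. Only the -x-ldml private-use prefix comes from the text above.

```python
# Illustrative only: a simplified CLDR-locale <-> 4646-style tag mapping.
# The real rules are in TR35; the names and heuristics here are assumptions.

PRIVATE_PREFIX = ["x", "ldml"]  # the "-x-ldml" prefix mentioned above


def cldr_to_tag(locale_id, registered=frozenset()):
    """Turn an underscore-separated locale ID into a hyphenated tag.

    Variants not listed in `registered` (a hypothetical set of registered
    variant subtags) are moved behind the private-use prefix.
    """
    parts = locale_id.split("_")
    out = [parts[0].lower()]          # language
    private = []
    for p in parts[1:]:
        if len(p) == 4 and p.isalpha():
            out.append(p.title())     # script, e.g. Latn
        elif (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit()):
            out.append(p.upper())     # region, e.g. US or 419
        elif p.lower() in registered:
            out.append(p.lower())     # variant that already has a subtag
        else:
            private.append(p.lower()) # no subtag available: private use
    if private:
        out += PRIVATE_PREFIX + private
    return "-".join(out)


def tag_to_cldr(tag):
    """Reverse the mapping: drop the private-use prefix, restore casing."""
    parts = tag.split("-")
    out = [parts[0].lower()]
    i = 1
    while i < len(parts):
        if [s.lower() for s in parts[i:i + 2]] == PRIVATE_PREFIX:
            i += 2                    # skip "x-ldml"; keep what follows it
            continue
        p = parts[i]
        out.append(p.title() if len(p) == 4 and p.isalpha() else p.upper())
        i += 1
    return "_".join(out)
```

With these assumptions, cldr_to_tag("en_US_POSIX") gives
"en-US-x-ldml-posix", and tag_to_cldr() takes that back to "en_US_POSIX".
An extension would only change the prefix, not the shape of the mapping.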

The most problematic case remaining for international identifiers is
currencies, which ISO documents and maintains badly, and doesn't guarantee
stability. ZWD, for example, stands for two different currencies, which
differ by a factor of 1000!
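
A small sketch of why that hurts: an amount tagged only with the code is
ambiguous unless you also carry a date. Everything below is hypothetical
except the factor of 1000. In particular, the cutover date is a parameter
supplied by the caller, and I am assuming that 1000 old units make one new
unit.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# The factor comes from the text above; the direction (1000 old = 1 new)
# is an assumption made for this sketch.
REDENOMINATION_FACTOR = Decimal(1000)


@dataclass(frozen=True)
class Amount:
    value: Decimal
    code: str    # currency code, e.g. "ZWD"
    as_of: date  # date on which the amount was recorded


def to_new_units(amount, cutover):
    """Express an amount in post-cutover units.

    `cutover` is a caller-supplied date; this sketch does not assert when
    the code actually changed meaning.
    """
    if amount.as_of < cutover:
        return amount.value / REDENOMINATION_FACTOR
    return amount.value
```

Two records with the same value and the same code can therefore be worth
very different amounts, and the code alone cannot tell you which is which.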

> instead of trying to register the former as a variant or create an
> extension for the latter.
>
> --
> Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
> http://users.adelphia.net/~dewell/
> http://www1.ietf.org/html.charters/ltru-charter.html
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>