Problems deciding if az- should have multiple registrations...
Addison Phillips [wM]
aphillips at webmethods.com
Tue Apr 15 10:48:30 CEST 2003
Peter_Constable at sil.org wrote:
>>>You want us to have RFC3066(bis) tags to distinguish character set
>>Oh heavens no! By no means is that what I meant.
> Phew! Glad to hear that. But, in that case, I must have missed your point
> regarding ja_JP and sv_SE -- you need the longer forms to get UTF-8, but
> what's the bearing on our current discussion? (Perhaps the answer is
> partially in your following text.)
Some people may need the longer forms to actually activate their
system's locale mechanism (that is, the language tagging scheme for
their system resources). .NET, I believe, doesn't allow you to directly
instantiate a "language-only" RegionInfo (I'm a little fuzzy about the
details of that).
> So, let me see if I understand where you're going. Depending on what we
> have in mind by "mappings", perhaps systems don't need to use RFC3066(bis)
> tags like th-TH, but we define mappings specific to various host
> implementation environments that tell us things like "for Solaris 2.8, if
> language = 'th' and desired encoding is UTF-8, substitute 'th-TH'".
Not *WE*, but rather vendors need to define the mapping for their
particular implementation. I think the best we can do is suggest rules
for mapping a language ID to local mechanisms. I'm not as concerned
about existing irregularities as I am about getting script-encoding
things to map correctly. Adding an additional level of complexity to the
language tags decreases their ambiguity, but requires rather more care
when interpreting them.
By way of example, one might define that "zh-Hans" maps to "zh_CN" or
"zh__Simplified" in Java (the latter being hypothetical), not to "zh"
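
To make the idea of a vendor-defined mapping concrete, here is a minimal
sketch in Python. Everything in it is an assumption made for illustration:
the table name HOST_LOCALE_MAPPINGS, the platform keys, and the function
map_language_tag are hypothetical, and the two entries are simply the
examples mentioned in this thread, not real vendor data. The fallback
(dropping trailing subtags until a rule matches) is likewise one possible
rule, not something any vendor is known to use.

```python
from typing import Optional

# Hypothetical vendor-defined tables: platform -> {language tag -> host locale ID}.
# Entries come from the examples in this thread and are illustrative only.
HOST_LOCALE_MAPPINGS = {
    "java": {"zh-hans": "zh_CN"},
    "solaris-2.8-utf8": {"th": "th-TH"},
}


def map_language_tag(tag: str, platform: str) -> Optional[str]:
    """Return the host locale ID for a language tag, or None if no rule applies.

    Assumed fallback: try the full tag first, then drop trailing subtags
    one at a time, so the vendor's table always decides the result.
    """
    table = HOST_LOCALE_MAPPINGS.get(platform, {})
    subtags = tag.lower().split("-")  # tags are compared case-insensitively
    while subtags:
        candidate = "-".join(subtags)
        if candidate in table:
            return table[candidate]
        subtags.pop()  # drop the last subtag and retry
    return None


if __name__ == "__main__":
    print(map_language_tag("zh-Hans", "java"))        # zh_CN
    print(map_language_tag("th", "solaris-2.8-utf8"))  # th-TH
```

Because the lookup only ever returns what the table contains, a tag with no
rule for a given platform yields None rather than a silent guess.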