Problems deciding if az- should have multiple registrations...

Peter_Constable at sil.org
Mon Apr 14 23:24:09 CEST 2003


Addison Phillips wrote on 04/14/2003 07:50:03 PM:

> The problem is that existing generative rules allow for th-TH, I believe.

Yes, they do. So this tag and others like it don't need to be registered in
order to be used. But I'm suggesting we might explicitly register them
anyway, *if* they are used in legacy systems with some distinctive meaning,
*if* we want to provide some kind of backward compatibility with legacy
locale identifiers, and *if* the given case isn't handled by some other
"mapping" mechanism (see below). In that case we would register them and
document exactly why they might ever be used, because on the surface they
look unnecessary (as, apart from the legacy usage, they would be).



> > You want us to have RFC3066(bis) tags to distinguish character set
> > encodings?
>
> Oh heavens no! By no means is that what I meant.

Phew! Glad to hear that. But, in that case, I must have missed your point
regarding ja_JP and sv_SE -- you need the longer forms to get UTF-8, but
what's the bearing on our current discussion? (Perhaps the answer is
partially in your following text.)


> The point is that calling setlocale with "ja.UTF8" doesn't work. I can
> generate the appropriate Unicode-encoded locale on some systems only by
> filling out the region tag. That's my point. And not a hypothetical one. I
> have a small C program that does collation and it tries hard to convert the
> HTTP requested language into a UTF-8 locale for LC_COLLATE. This works more
> reliably if I have a full language tag. Again, this is possibly a mapping
> problem.

So, let me see if I understand where you're going. Depending on what we
have in mind by "mappings", perhaps systems don't need to use RFC3066(bis)
tags like th-TH at all; instead, we define mappings specific to various host
implementation environments that tell us things like "for Solaris 2.8, if
language = 'th' and desired encoding is UTF-8, substitute 'th-TH'".
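
To make that idea concrete, here is a rough sketch of what such a
host-specific mapping table might look like. Everything in it is
hypothetical: the host key "solaris-2.8", the table and function names, and
the assumption that a locale name is formed as language_REGION.encoding. The
only entry taken from this discussion is the Solaris 2.8 / 'th' / UTF-8
example; real systems may well name their locales differently.

```python
# Hypothetical per-host mapping table: (host, language tag, encoding) -> tag
# to substitute. Only the Solaris 2.8 entry comes from the discussion; the
# key format and all names here are assumptions made for illustration.
HOST_TAG_MAPPINGS = {
    ("solaris-2.8", "th", "UTF-8"): "th-TH",
}


def resolve_tag(host: str, language_tag: str, encoding: str) -> str:
    """Return the tag to use on this host, or the original tag if no
    host-specific mapping applies."""
    key = (host, language_tag.lower(), encoding.upper())
    return HOST_TAG_MAPPINGS.get(key, language_tag)


def to_locale_name(language_tag: str, encoding: str) -> str:
    """Build a locale name of the assumed form language_REGION.encoding
    by turning the first hyphen of the tag into an underscore."""
    return language_tag.replace("-", "_", 1) + "." + encoding


if __name__ == "__main__":
    tag = resolve_tag("solaris-2.8", "th", "UTF-8")
    print(to_locale_name(tag, "UTF-8"))
```

With a table like this, the short tag stays the public identifier and the
longer form appears only at the point where a particular host needs it.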



- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485




