Here's what I have to say about that?

Kenneth Whistler kenw at sybase.com
Tue May 27 20:01:30 CEST 2003


Michael,

> James Seng told me that Singapore uses Hans, but that its use of Hans 
> is different than that used in CN. So zh-Hans doesn't work for 
> Singapore. So, how do we deal with this? 

James provided some clarification, but here is some more.

zh-Hans could be used, generically, to indicate the same as
expressed by "Simplified Chinese". That is the machine
distinction that people are trying to represent here. Realistically,
this has, in the past, meant the distinction between data
represented in Code Page 936 (PRC and Singapore) and Code Page 950
(Taiwan and Hong Kong). That distinction is breaking down
now with use of Unicode and GB 18030, which work for *both*
the simplified and traditional orthographies (in their local
variants), but the labels are needed simply to be able to
mark data conventionally for the legacy distinctions.
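
To make the legacy pairing concrete, here is a minimal Python sketch. The
LEGACY_CODEC table and the round_trips helper are names made up for this
illustration; the only assumption about the library is that Python's
standard codecs "cp936", "cp950" and "gb18030" are available.

```python
# Conventional pairing of the generic labels with legacy code pages.
# The table name and layout are illustrative, not a registered mapping.
LEGACY_CODEC = {
    "zh-Hans": "cp936",  # PRC and Singapore
    "zh-Hant": "cp950",  # Taiwan and Hong Kong
}


def round_trips(text, codec):
    """Return True if text survives an encode/decode cycle in codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        return False


simplified = "\u6c49\u5b57"   # "Chinese characters", simplified form
traditional = "\u6f22\u5b57"  # the same word, traditional form

# Each orthography fits the code page its label conventionally implies.
print(round_trips(simplified, LEGACY_CODEC["zh-Hans"]))
print(round_trips(traditional, LEGACY_CODEC["zh-Hant"]))

# GB 18030 and Unicode encodings carry both orthographies.
for codec in ("gb18030", "utf-8"):
    print(codec, round_trips(simplified + traditional, codec))
```

With GB 18030 or UTF-8 the encoding no longer tells you which orthography
you have, which is why a label is still needed.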

If you needed to explicitly call out a Singaporean variant
of simplified Chinese, e.g., to ensure the engagement of
a terminological dictionary to check for appropriate lexical
items that might differ between the PRC and Singapore (and
other fine details of character usage), then zh-SG would
serve fine. It doesn't have to be a 3-level tag to accomplish
that.

> And what about zh-hakka, in 
> Hant or Hans?

Either. It's an orthogonal distinction. Use zh-hakka if what
you are talking about is the Hakka language per se. Use zh-Hans if
you are talking about written data (which happens to be
Hakka) printed in the PRC. Use zh-Hant if you are talking
about written data (which happens to be Hakka) printed in
Taiwan.
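
A rough Python sketch of how software might tell these kinds of label
apart. The classify function and its shape-based heuristic (four letters
means script, two letters means country, anything else is "other") are
assumptions made for this illustration; they are not rules taken from
RFC 3066 or from the registration requests.

```python
def classify(tag):
    """Split a language tag into language, script, region and other parts.

    Heuristic only: a 4-letter subtag is treated as a script code, a
    2-letter subtag as a country code, and everything else as "other".
    """
    primary, *rest = tag.split("-")
    parts = {"language": primary.lower(), "script": None,
             "region": None, "other": []}
    for sub in rest:
        is_letters = sub.isascii() and sub.isalpha()
        if is_letters and len(sub) == 4 and parts["script"] is None:
            parts["script"] = sub.title()   # e.g. "Hans", "Hant"
        elif is_letters and len(sub) == 2 and parts["region"] is None:
            parts["region"] = sub.upper()   # e.g. "SG", "CN"
        else:
            parts["other"].append(sub.lower())  # e.g. "hakka"
    return parts


for tag in ("zh-Hans", "zh-Hant", "zh-SG", "zh-hakka"):
    print(tag, classify(tag))
```

Each two-level tag fills exactly one slot, so the script, country and
language distinctions stay independent of one another. The sketch says
nothing about what a tag carrying both script and country should mean.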

> 
> >Also: since Mark isn't asking you for any tags with country codes, 
> >why is this an issue *now*? It would be reasonable to reject 
> >zh-hant-CN (or zh-CN-hant, if you prefer) for this reason, but not 
> >the nine in question.
> 
> Because the Tag Reviewer needs to have rules that can be applied 
> generally, and it is easy to see that zh- has been extended, and that 
> if script codes are added, there is a syntactic element which needs 
> to be addressed.

As stated, the requests are for either/or labels, not for resolving
the problem of 3-level extensions using both, and what they might
mean.

The need here is just for enough labels to distinguish what needs
to be distinguished, not for promulgation of a coherent theory
of how country and script tag extensions can all be made to
coexist.

--Ken 


