Wikipedia tagging

Mark Davis ☕ mark at macchiato.com
Mon Jul 29 20:56:27 CEST 2013


In particular, in CLDR (and ICU) we use the 'shortest form' as the
canonical form for language tagging. So as Markus says, we tag "en-Latn-US"
content as "en", and "zh-Hans-CN" (or "cmn-Hans-CN") content as simply "zh".

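As a rough illustration of the 'shortest form' idea, here is a minimal Python
sketch. It is not ICU or CLDR code: the LIKELY table is a tiny hypothetical
excerpt standing in for the CLDR likely-subtags data, the function names
(parse, maximize, minimize) are made up for this example, and aliases such as
"cmn" are not handled. The sketch fills in the likely script and region, then
returns the shortest tag that expands back to the same full tag.

```python
# Hypothetical, tiny excerpt of a likely-subtags table: language -> (script, region).
# The real CLDR data is much larger and is also keyed on script and region.
LIKELY = {
    "en": ("Latn", "US"),
    "zh": ("Hans", "CN"),
    "de": ("Latn", "DE"),
}


def parse(tag):
    """Split a simple language[-Script][-REGION] tag into (language, script, region)."""
    parts = tag.split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
    return language, script, region


def format_tag(language, script, region):
    """Join the subtags that are present back into a tag string."""
    return "-".join(p for p in (language, script, region) if p)


def maximize(tag):
    """Fill in a missing script or region from the (assumed) LIKELY table."""
    language, script, region = parse(tag)
    likely_script, likely_region = LIKELY.get(language, (None, None))
    return language, script or likely_script, region or likely_region


def minimize(tag):
    """Return the shortest tag that maximizes to the same full tag."""
    language, script, region = full = maximize(tag)
    candidates = (
        (language, None, None),
        (language, None, region),
        (language, script, None),
    )
    for candidate in candidates:
        short = format_tag(*candidate)
        if maximize(short) == full:
            return short
    return format_tag(*full)


print(minimize("en-Latn-US"))  # en
print(minimize("zh-Hans-CN"))  # zh
print(minimize("en-GB"))       # en-GB
```

With this toy table, "en-GB" keeps its region because dropping it would expand
back to "en-Latn-US", which is a different tag.
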
One clarification. When looking up resources, we have a matching process
that takes a list of desired user languages, and finds the best match in the
list of supported languages. So if the user's desired languages are "ja-JP,
zh-CN", and the supported languages are "zh, en", then the match would be
"zh".

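The matching step can be sketched in Python as follows. This is a simplified
stand-in, not the ICU/CLDR matcher (which scores candidates with
language-distance data). The LIKELY table is a hypothetical excerpt, and
best_match and its two-pass rule (exact match on the expanded tag, then a
match on language and script only) are assumptions made for illustration.

```python
# Hypothetical excerpt of likely-subtags data: language -> (script, region).
LIKELY = {
    "ja": ("Jpan", "JP"),
    "zh": ("Hans", "CN"),
    "en": ("Latn", "US"),
}


def maximize(tag):
    """Expand a simple language[-Script][-REGION] tag using the LIKELY table."""
    parts = tag.split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
    likely_script, likely_region = LIKELY.get(language, (None, None))
    return language, script or likely_script, region or likely_region


def best_match(desired, supported, default=None):
    """Return the supported tag that best fits the ordered list of desired tags."""
    expanded = [(maximize(tag), tag) for tag in supported]
    for want in desired:
        wanted = maximize(want)
        # Pass 1: the expanded tags are identical.
        for full, original in expanded:
            if full == wanted:
                return original
        # Pass 2: same language and script, region ignored.
        for full, original in expanded:
            if full[:2] == wanted[:2]:
                return original
    return default


print(best_match(["ja-JP", "zh-CN"], ["zh", "en"]))  # zh
```

Here "ja-JP" matches nothing in the supported list, while "zh-CN" and "zh"
both expand to "zh-Hans-CN", so the result is "zh".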

Mark <https://plus.google.com/114199149796022210033>
*— Il meglio è l’inimico del bene —*


On Mon, Jul 29, 2013 at 1:03 AM, Markus Scherer <markus.icu at gmail.com> wrote:

> On Sun, Jul 28, 2013 at 2:00 PM, Peter Constable <petercon at microsoft.com> wrote:
>
>> Can anyone explain why it is that Wikipedia is tagging Simplified
>> Chinese content as simply “zh”?
>>
>> [...]
>>
>> That’s not good practice and will lead to users encountering buggy
>> behaviour (e.g., content being displayed using the wrong fonts).
>>
>
> I think it is very common that developers assume that "zh" defaults to
> "zh-Hans-CN", just like they assume that "en" defaults to "en-Latn-US",
> "de" to "de-Latn-DE", etc.
>
> markus