suppress-script values for fil, mi, pes, prs, qu members

Peter Constable petercon at microsoft.com
Wed Oct 20 20:43:51 CEST 2010


From: johnwcowan at gmail.com [mailto:johnwcowan at gmail.com] On Behalf Of John Cowan

On Wed, Oct 20, 2010 at 11:30 AM, Peter Constable <petercon at microsoft.com> wrote:

>> We are working on product implementations that are impacted by this.
>> I raise the Quechua cases because Windows is localized into Cusco 
>> Quechua, hence Quechua is a case that we need to support; and because 
>> tags without script subtags have been used.

> It's a bad assumption that languages without S-S information must be 
> tagged by tags with script information.  Unless you are going to localize 
> into more than one script, I see no point in adding Suppress-Script: 
> information to a non-639-1 language; its purpose is to tell you the 
> default script when there is one.  If there is no actual script issue, as 
> with the great majority of written languages, then you just choose to 
> use appropriate language tags (without script) as locale names.

If there isn't a script issue, then indeed language tags without script subtags should be completely reasonable. But how is an implementer to know when that's the case? Your second sentence, "Unless...", doesn't make sense to me. Let me restate without as many negatives: "If a language uses only one script, there is no point in adding s-s information to ISO 639-2/-3 languages; its purpose is to tell you the default script when there is one." The logical argument you're trying to make completely escapes me. If a language clearly uses only one script (in common scenarios), it seems to me it would be helpful for implementers to know that, and that s-s is a convenient way to do that.
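
As a rough illustration of what "knowing when that's the case" could look like for an implementer, here is a minimal Python sketch that reads records written in the registry's field format and reports the Suppress-Script value for a language, if any. The helper names (parse_record, suppress_script) and the two sample records are assumptions made for this example; the records are trimmed to a few fields, and folded continuation lines are not handled.

```python
from typing import Dict, Iterable, Optional

# Sample records in the registry's "Field: value" style, trimmed for brevity.
# "en" carries Suppress-Script: Latn; the "quz" record is shown without one,
# which is the situation under discussion here.
EN_RECORD = """\
Type: language
Subtag: en
Description: English
Suppress-Script: Latn
"""

QUZ_RECORD = """\
Type: language
Subtag: quz
Description: Cusco Quechua
"""

RECORDS = [EN_RECORD, QUZ_RECORD]


def parse_record(record: str) -> Dict[str, str]:
    """Split one record into a field-name -> value mapping.

    Simplification: a repeated field (e.g. several Description lines)
    keeps only its last value, and folded lines are ignored.
    """
    fields: Dict[str, str] = {}
    for line in record.strip().splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def suppress_script(records: Iterable[str], language: str) -> Optional[str]:
    """Return the Suppress-Script value for a language subtag, or None."""
    for record in records:
        fields = parse_record(record)
        if (fields.get("Type") == "language"
                and fields.get("Subtag", "").lower() == language.lower()):
            return fields.get("Suppress-Script")
    return None
```

With these sample records, a lookup for "en" yields a script and a lookup for "quz" yields nothing, so the implementer has no machine-readable signal either way for the latter.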


> Languages with 639-1 codes are a different matter, because they were 
> valid before RFC 4646, 

As were 639-2 codes.

> which would mean that en-Latn-US would not 
> match en-US, and therefore the former should be avoided for the sake 
> of compatibility with old matchers.  Remember that S-S didn't exist in 
> our original drafts of 4646, and was imposed on us by the IETF for the 
> sake of backward compatibility.

And qu-PE and qu-Latn-PE are both valid but would not match each other; the same is true for quz-PE and quz-Latn-PE: pre-4646 habits of not using script subtags have persisted beyond the publication of 4646. In fact, you have suggested here that script subtags don't need to be included in tags for mono-script 639-2/-3 languages, but that only feeds this problem--unless people have some way to know when script subtags can be dispensed with. Adding s-s fields in the clear cases as we come across them is the obvious way to achieve that.
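
To make the mismatch concrete, here is a minimal Python sketch of prefix-style matching in the manner of RFC 4647 basic filtering, plus a helper that drops a script subtag when a table says it is suppressed. The function names and the SUPPRESS_SCRIPT table are assumptions for illustration only: the "en" entry reflects the registry, while the "qu" entry is hypothetical and stands in for the kind of s-s field being proposed. Extlang subtags and other edge cases are ignored.

```python
from typing import Dict

# Illustrative table only. "en" -> "Latn" is registered; the "qu" entry is a
# hypothetical value standing in for the s-s field proposed in this thread.
SUPPRESS_SCRIPT: Dict[str, str] = {"en": "Latn", "qu": "Latn"}


def basic_filter_match(language_range: str, tag: str) -> bool:
    """Prefix matching in the style of RFC 4647 basic filtering.

    A range matches a tag if the two are equal (ignoring case) or if the
    range is a prefix of the tag followed immediately by a hyphen.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True
    return t == r or t.startswith(r + "-")


def strip_suppressed_script(tag: str,
                            table: Dict[str, str] = SUPPRESS_SCRIPT) -> str:
    """Drop the script subtag if the table marks it as suppressed.

    Simplification: assumes the script, if present, is the second subtag
    (no extlang subtags between language and script).
    """
    subtags = tag.split("-")
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        suppressed = table.get(subtags[0].lower(), "")
        if suppressed.lower() == subtags[1].lower():
            return "-".join([subtags[0]] + subtags[2:])
    return tag
```

Under these assumptions, the range "qu-PE" does not match the tag "qu-Latn-PE", but normalizing the tag with strip_suppressed_script first gives "qu-PE", which does match. A tag such as "quz-Latn-PE" is left untouched because the table has no entry for "quz", which is exactly the gap described above.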

Otherwise, perhaps we should be freezing s-s fields and recommending that tags _always_ include script subtags except for the grandfathered cases for which we have s-s fields.



Peter


> (It's kind of like having encoded characters for capital and small 
> letter foo with macron, and then having decomposition mappings for the 
> latter to a small foo + comb macron but not having any capital foo and 
> no decomposition mapping for the former: there's a gap in the paradigm 
> that implementations might trip over.)

I don't understand this comparison.


