mark.davis at icu-project.org
Thu Sep 6 21:21:35 CEST 2007
I agree that samples wouldn't help; what we need are authoritative documents
describing usage in order to establish that there is or isn't a script used
for the "overwhelming majority of documents",
and what that script is if there is one. For example,
http://www.country-studies.com/india/linguistic-relations.html would seem to
indicate Latn would be, not Deva -- but I share some of Peter's qualms about
We don't qualify what "overwhelming majority" really means, but my take
would be at least two standard deviations (>95%).
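To make that threshold concrete, here is a minimal sketch of the check I have in mind. It assumes we had per-script document counts from a reliable source, which we do not yet have. The function name, the 0.95 cutoff, and the counts in the usage note are all hypothetical and only illustrate the rule, not any actual data for a language.

```python
from typing import Dict, Optional

# Assumed cutoff: the ">95%" reading of "overwhelming majority" suggested above.
# RFC 4646 itself gives no number.
OVERWHELMING = 0.95


def suppress_script_candidate(
    doc_counts: Dict[str, int], threshold: float = OVERWHELMING
) -> Optional[str]:
    """Return the script whose share of documents exceeds the threshold, else None.

    doc_counts maps a script subtag (e.g. "Latn") to a document count.
    """
    total = sum(doc_counts.values())
    if total <= 0:
        return None  # no evidence at all, so no candidate
    # Pick the most common script, then test its share against the cutoff.
    script, count = max(doc_counts.items(), key=lambda item: item[1])
    return script if count / total > threshold else None
```

With invented counts, `suppress_script_candidate({"Latn": 990, "Deva": 10})` yields `"Latn"`, while `suppress_script_candidate({"Deva": 500, "Latn": 300, "Knda": 200})` yields `None`: one script can be dominant without meeting the "overwhelming" bar.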
John said that he was going to make a proposal. John, as part of that, can
you list the sources that have come up here and any others, with your
assessment of the reliability of each source? That way we can all be working
off of the same list.
On 9/6/07, Randy Presuhn <randy_presuhn at mindspring.com> wrote:
> Hi -
> > From: "John Cowan" <cowan at ccil.org>
> > To: "Peter Constable" <petercon at microsoft.com>
> > Cc: <ietf-languages at iana.org>
> > Sent: Thursday, September 06, 2007 11:26 AM
> > Subject: Re: Konkani Suppress-Script
> > Indeed. Collecting samples can serve as evidence that a particular
> > script is in use for Konkani, but it cannot serve as evidence that one
> > script is dominant.
> Mere dominance isn't enough. RFC 4646 says:
> The field 'Suppress-Script' MUST only appear in records whose 'Type'
> field-value is 'language'. This field MUST NOT appear more than one
> time in a record. This field indicates a script used to write the
> overwhelming majority of documents for the given language and that
> therefore adds no distinguishing information to a language tag. It
> helps ensure greater compatibility between the language tags
> generated according to the rules in this document and language tags
> and tag processors or consumers based on RFC 3066. For example,
> virtually all Icelandic documents are written in the Latin script,
> making the subtag 'Latn' redundant in the tag "is-Latn".
> In this particular case, all the sources cited so far seem to indicate
> no particular script meets the "overwhelming" requirement for this language.
> One might also call into question whether there is a sufficient corpus of
> data in this language that was tagged under the RFC 3066 regime to
> make compatibility an issue. *Either* of these points would be sufficient
> to justify removal of the inappropriate "Suppress-Script".