Encoding scripts in tags: evil or just unpleasant?

Michael Everson everson at evertype.com
Fri May 23 14:36:17 CEST 2003

John's contribution is most appreciated.

At 07:16 -0400 2003-05-23, John Cowan wrote:
>Michael Everson scripsit:
>>  >Well, I started out in your position and have now moved to approving of
>>  >Peter's productive language-script-country model (where "country" is a
>>  >proxy for spelling system, basically).
>>  Edberg I suppose you mean. With what detail? Exactly as he presented
>>  it? He presented options. Do we have consensus on which ones? That
>>  would be guidance to the reviewer.
>Sorry for (obviously) further confusing the issue; I was referring to Peter
>_Constable_'s model as expressed in "Language identification and IT"
>at http://www.sil.org/silewp/2000/001/SILEWP2000-001.pdf


>>  >I think the problem is now well enough understood since the publication
>>  >of Peter's papers that we can move past the minimalist position of
>>  >1766/3066 to something more detailed.
>>  Please turn it into guidelines.
>Here's my best shot:
>1) If a language not yet in ISO 639 is requested, register it using an
>    ISO 639 tag qualified by an Ethnologue tag.

Or some other tag. Ethnologue doesn't cover everything.

>2) If a language is written in multiple scripts, register each script
>    using the tag for the language qualified by an ISO 15924 tag for
>    the script.  Evidence should be demanded showing that the language
>    is indeed written in that script.

Each? So we have to register yi-Hebr? ga-Latn and ga-Ogam? pt-Arab and pt-Latn?

What about the DUPLICATION OF CODES issue? Isn't no/nn/nb a problem as well?

>3) If a language has multiple spelling systems de jure or de facto
>    by country or subdivision thereof, register each spelling system
>    using the tag for the language qualified by an ISO 3166-1
>    tag for the country, subqualified if necessary by an ISO 3166-2
>    tag or ad hoc tag for the subdivision.  Except in the case of
>    an ISO 639 language qualified by a country, evidence should be demanded
>    that there is indeed a national spelling system for that language in
>    that country.  ("Spelling" should be construed broadly.)

This is within one particular script? Examples other than the German ones?

>4) If a language is written in multiple scripts *and* has multiple
>    spelling systems etc. etc., register each spelling system in use for
>    each script using the tag for the script of the language qualified
>    by an ISO 3166-1 etc. etc.  Evidence should be demanded etc.


>5) Sign languages are an exception to this, and should be registered
>    using the ISO 639 tag SGN qualified by a country and possible
>    subdivision, as at present.

They're fine.

>6) Requests for registration of anything else must be processed ad hoc
>    and according to the judgement and taste of the reviewer.  This
>    includes dialect registrations, spelling systems valid only within
>    certain dates, and so on.

Is it being nasty to backtrack and wonder why <script="Latn"> isn't a 
cleaner solution than rolling all this into the <lang="yi"> tag? (I 
suppose this had better be asked again at this stage.)
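
For concreteness, the tag shapes that points 2 to 4 would produce can be
sketched in Python. This is only an illustration under assumptions: the
function name compose_tag is hypothetical, the checks cover subtag length
and letters only (not membership in ISO 639, ISO 15924, or ISO 3166-1),
and the casing follows common convention rather than anything required
in this thread.

```python
import re


def compose_tag(language, script=None, region=None):
    """Join subtags as language[-script][-region].

    Hypothetical helper: it checks only the shape of each subtag,
    not whether the code is actually assigned in the relevant standard.
    """
    # ISO 639 codes are two or three letters.
    if not re.fullmatch(r"[A-Za-z]{2,3}", language):
        raise ValueError(f"not a 2- or 3-letter language code: {language!r}")
    parts = [language.lower()]

    # ISO 15924 script codes are four letters, conventionally title-cased.
    if script is not None:
        if not re.fullmatch(r"[A-Za-z]{4}", script):
            raise ValueError(f"not a 4-letter script code: {script!r}")
        parts.append(script.capitalize())

    # ISO 3166-1 alpha-2 country codes are two letters, conventionally upper-cased.
    if region is not None:
        if not re.fullmatch(r"[A-Za-z]{2}", region):
            raise ValueError(f"not a 2-letter country code: {region!r}")
        parts.append(region.upper())

    return "-".join(parts)


if __name__ == "__main__":
    # Point 2: language qualified by script.
    print(compose_tag("yi", "hebr"))
    print(compose_tag("ga", "ogam"))
    # Point 3: language qualified by country.
    print(compose_tag("de", region="at"))
    # Point 4: language qualified by script, then by country.
    print(compose_tag("de", "latn", "at"))
```

Here compose_tag("ga", "ogam") gives "ga-Ogam" and compose_tag("de",
region="at") gives "de-AT". The German combinations are illustrative
only; whether any such tag should be registered is exactly what the
guidelines leave to evidence and the reviewer.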

Michael Everson * Everson Typography * http://www.evertype.com
