Script codes in RFC 3066, 4 issues

John Cowan jcowan at reutershealth.com
Wed Apr 9 12:43:11 CEST 2003


Mark Davis scripsit:

> The term writing system is often contrasted with script. There is no need to
> identify them; it is simpler to always use script:
> 
> "A script is encoded by an ISO 15924 script code."

Peter Constable used "writing system", although most of his examples of
"same script, different writing system" are hypothetical.  I suspect, though,
that vertical Han (in hard-copy or final-form documents) might be a distinct
writing system from horizontal Han.

> A. Restrict both the ISO codes and Ethnologue codes so that no new
> combinations are shorter than an older combination. Politically, I suspect
> the chances of this, the nicest tack, approach nil.

That would be a breach of the ISO rules, which say that if you can find
N documents in M different repositories you get an ISO 639-2 code.  It's
already the case that no existing 639-2 coded language can get a 639-1 code,
which is a big win:  e.g. Aleut, which is ALE, can't ever get a 2-letter code.

So the main issue is when a language without a specific ISO 639-2 code gets
one, e.g. cpf-hat (Haitian Creole French) might get hcf (-2) or hcf/hc (-2/-1).

> B. Allow non-shortest forms (but keep the shortest form restriction on ISO
> 639 codes), but provide a table of equivalencies somewhere (not necessarily
> associated with 3066bis). Not as nice, but politically feasible.

Historical RFC 1766/3066 practice has been to keep a list of deprecated codes
at http://www.iana.org/assignments/language-tags .  For example, no-nyn
(Nynorsk) has been deprecated in favor of nn.  Deprecated codes are, of course,
never reused.
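
A minimal sketch of how such a deprecated-code list might be consulted.
The table layout and the name canonicalize are assumptions made for
illustration, not the format of the IANA registry; only the no-nyn entry
comes from the example above.

```python
# Hypothetical deprecated-tag table: deprecated tag -> preferred replacement.
# Only the no-nyn entry is taken from the message; the layout is an assumption.
DEPRECATED = {
    "no-nyn": "nn",
}


def canonicalize(tag: str) -> str:
    """Return the preferred replacement for a deprecated tag, else the tag unchanged."""
    # Tags compare case-insensitively, so look up the lowercased form.
    return DEPRECATED.get(tag.lower(), tag)
```

Because deprecated codes are never reused, a lookup of this kind stays
unambiguous: an old tag maps to at most one replacement.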

> 3. For compatibility, we also need that once a 3066bis code, forever a
> 3066bis code. That is, even if the Ethnologue or ISO remove/deprecate a
> code, that code remains forever valid for use in a 3066bis subtag.

Such a code should probably be entered into the deprecated table.

> 4. As now, any strings would be compared case insensitive. However,
> customarily the casing would be en-foo-Cyrl-CH.

SIL practice is to uppercase its tags, so gem-BAR-Latn-DE.  (BAR is Bavarian;
FOO is unassigned at present.)
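
A rough sketch of the two casing conventions, with comparison kept
case-insensitive.  The function names and the sil_style flag are
hypothetical, and guessing a subtag's role from its length (4 letters for
a script, 2 letters for a region) is an assumption of this sketch, not
something a specification is claimed to require.

```python
def same_tag(a: str, b: str) -> bool:
    """Compare two tags without regard to case."""
    return a.lower() == b.lower()


def conventional_case(tag: str, sil_style: bool = False) -> str:
    """Recase a tag for display; the subtag roles are guessed from length."""
    out = []
    for i, part in enumerate(tag.split("-")):
        if i == 0:
            out.append(part.lower())        # primary language subtag
        elif len(part) == 4 and part.isalpha():
            out.append(part.capitalize())   # assumed script code, e.g. Latn
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())        # assumed region code, e.g. DE
        elif sil_style and len(part) == 3 and part.isalpha():
            out.append(part.upper())        # SIL-style uppercase, e.g. BAR
        else:
            out.append(part.lower())
    return "-".join(out)
```

With sil_style=True, an input such as "gem-bar-latn-de" is recased as
gem-BAR-Latn-DE; either way, same_tag treats the two spellings as equal.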

-- 
John Cowan  jcowan at reutershealth.com  www.ccil.org/~cowan  www.reutershealth.com
"In computer science, we stand on each other's feet."
        --Brian K. Reid

