Script codes in RFC 3066, 4 issues
jcowan at reutershealth.com
Wed Apr 9 12:43:11 CEST 2003
Mark Davis scripsit:
> The term writing system is often contrasted with script. There is no need to
> identify them; it is simpler to always use script:
> "A script is encoded by an ISO 15924 script code."
Peter Constable used "writing system", although most of his examples of
"same script, different writing system" are hypothetical. I suspect though
that vertical Han (in hard-copy or final-form documents) might be a distinct
writing system from horizontal Han.
> A. Restrict both the ISO codes and Ethnologue codes so that no new
> combinations are shorter than an older combination. Politically, I suspect
> the chances of this, the nicest tack, approach nil.
That would be a breach of the ISO rules, which say that if you can find
N documents in M different repositories you get an ISO 639-2 code. It's
already the case that no existing 639-2 coded language can get a 639-1 code,
which is a big win: e.g. Aleut, which is ALE, can't ever get a 2-letter code.
So the main issue is when a language without a specific ISO 639-2 code gets
one, e.g. cpf-hat (Haitian Creole French) might get hcf (-2) or hcf/hc (-2/-1).
> B. Allow non-shortest forms (but keep the shortest form restriction on ISO
> 639 codes), but provide a table of equivalancies somewhere (not necessarily
> associated with 3066bis). Not as nice, but politically feasible.
Historical RFC 1766/3066 practice has been to keep a list of deprecated codes
at http://www.iana.org/assignments/language-tags . For example, no-nyn
(Nynorsk) has been deprecated in favor of nn. Deprecated codes are, of course,
> 3. For compatibility, also we need that once a 3066bis code, forever a
> 3066bis code. That is, even if the Ethnologue or ISO remove/deprecate a
> code, that code is remains forever valid for use in a 3066bis subtag.
It should probably be entered into the deprecated table.
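A minimal sketch of how such a deprecated table might be consulted, in Python. The table contents and the function name preferred_tag are hypothetical; only the no-nyn to nn entry comes from the IANA example mentioned earlier, and the real list is the one IANA maintains.

```python
# Hypothetical in-memory copy of the deprecated-tag table.
# Only the no-nyn -> nn entry is taken from the example in this message.
DEPRECATED = {"no-nyn": "nn"}


def preferred_tag(tag: str) -> str:
    """Return the replacement for a deprecated tag, or the tag unchanged.

    Lookup is case-insensitive, since tags compare case-insensitively.
    A deprecated tag is still accepted as input; it is only mapped forward.
    """
    return DEPRECATED.get(tag.lower(), tag)
```

With this sketch, preferred_tag("no-nyn") yields "nn", while a tag that is not in the table is passed through untouched.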
> 4. As now, any strings would be compared case insensitive. However,
> customarily the casing would be en-foo-Cryl-CH.
SIL practice is to uppercase its tags, so gem-BAR-Latn-DE. (BAR is Bavarian;
FOO is unassigned at present.)
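To make the difference between comparison and customary casing concrete, here is a rough Python sketch. The function names and the length-based heuristic (four-letter subtags treated as scripts, two-letter non-initial subtags as countries) are assumptions for illustration, not anything 3066bis specifies.

```python
def same_tag(a: str, b: str) -> bool:
    """Compare two language tags case-insensitively."""
    return a.lower() == b.lower()


def customary_case(tag: str) -> str:
    """Apply one possible customary casing to a tag.

    Assumed heuristic: the first subtag is lowercased, four-letter
    subtags are title-cased (script), two-letter subtags after the
    first are uppercased (country), and everything else is lowercased.
    """
    parts = tag.split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4:
            out.append(part.capitalize())
        elif len(part) == 2:
            out.append(part.upper())
        else:
            out.append(part.lower())
    return "-".join(out)
```

Under this heuristic the SIL-style gem-BAR-Latn-DE and the lowercased gem-bar-Latn-DE compare equal via same_tag; only the presentation differs.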
John Cowan jcowan at reutershealth.com www.ccil.org/~cowan www.reutershealth.com
"In computer science, we stand on each other's feet."
--Brian K. Reid