OT: JFC's response to Peter and Doug

Addison Phillips addison.phillips at quest.com
Thu Mar 17 19:29:23 CET 2005

> - identification of the language (ISO 639 or the like) -> interrelation
> between people - we can talk

RFC 3066 and 3066bis call this "primary language".

> - internationalization (what discusses RFC 3066 - ISO 15924) ->
> interoperation between machines - we can write

RFC 3066bis calls this "script".

> - multinationalization (taking into account ISO 3166 or other geopolitical
> attribute) -> interculturization between communities - we can share same
> cultural references, the same meanings (for example, what may be missing in
> here).

RFC 3066 and 3066bis call this "region".

> - multilingualization (semantic, grammar, dictionary, syntax, etc.) ->
> interintelligibility -> we can understand each other

RFC 3066bis calls this a "variant": a variation within a language, which may take many forms (orthography, dialect, and so on). RFC 3066 allows entries that represent such variations to be registered, but does not give them an explicit name. Both allow multiple variations to be applied to a text.

> - vernacularization (procedures, styles, tools, etc.) -> interusability ->
> we can do something together

Language tags identify text, not the processes applied to it. A French text that has had French hyphenation rules applied to it is just a French text. Particular vernacular distinctions may be candidates for registration, but this raises a harder question: is it another language or merely stylistic license? Merriam-Webster defines vernacular thus:

1 a : using a language or dialect native to a region or country rather than a literary, cultured, or foreign language b : of, relating to, or being a nonstandard language or dialect of a place, region, or country c : of, relating to, or being the normal spoken form of a language

In other words, Twain's "Pudd'nhead Wilson" is written in a particular vernacular form of U.S. English. One can register a variant (under 3066bis--or a tag under 3066) that conveys this distinction (perhaps "en-US-twain"):

"Say it ag'in! En keep on sayin' it! It's all de pay a
body kin want in dis worl', en it's mo' den enough.
Laws bless you, honey, when I's slav' aroun', en dey 'buses me,
if I knows you's a-sayin' dat, 'way off yonder somers,
it'll heal up all de sore places, en I kin stan' 'em."

Generally, though, this is considered unnecessary. There are many vernaculars in the world, and only in exceptional cases does it become necessary to identify a particular language variation as distinct. That does not change the fact that RFC 3066 (and its proposed successor) already provides a mechanism for identifying text in exactly those cases where the differences matter.

(For that matter, 3066bis relaxes strictures surrounding the use of private use subtags so that specialists can identify various dialects or vernacular usages among themselves without the use of registration: "en-US-x-twain".)
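
To make the mapping concrete, here is a rough Python sketch of how a tag breaks down into the pieces named above (primary language, script, region, variant, private use). It is only an illustration under simplifying assumptions: the function name split_tag is made up for this message, the subtag shapes are my reading of the 3066bis draft, and extended language subtags, extensions, grandfathered tags, and registry lookups are all ignored. "twain" is, of course, not a registered variant.

```python
import re

def split_tag(tag):
    """Split a language tag into named subtags.

    Simplified sketch: checks subtag shapes only, not the registry, and
    ignores extlang, extensions, and grandfathered tags.
    """
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None,
              "region": None, "variants": [], "private": []}
    rest = subtags[1:]

    # Everything after a singleton "x" is private use.
    lowered = [s.lower() for s in rest]
    if "x" in lowered:
        i = lowered.index("x")
        result["private"] = lowered[i + 1:]
        rest = rest[:i]

    for s in rest:
        nothing_later = result["region"] is None and not result["variants"]
        if (re.fullmatch(r"[A-Za-z]{4}", s)
                and result["script"] is None and nothing_later):
            result["script"] = s.title()        # script, e.g. "Latn"
        elif re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", s) and nothing_later:
            result["region"] = s.upper()        # region, e.g. "US"
        elif re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", s):
            result["variants"].append(s.lower())  # variant, e.g. "twain"
        else:
            raise ValueError("unexpected subtag: " + s)
    return result

print(split_tag("en-US-twain"))
print(split_tag("en-US-x-twain"))
```

In the first call "twain" lands in the variants list; in the second it lands in the private-use list, which is the distinction between a registered variant and a private agreement among specialists.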

So how is what you're saying different from RFC 3066bis? Heck, how is it different from RFC 3066?!?

Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture. 
