Why not? [Re: [Fwd]: Response to Mark's message]

Jon Hanna jon at spin.ie
Fri Apr 11 12:29:50 CEST 2003


> Jon Hanna scripsit:
> 
> > Even "en-IE-Latn" would at least seems somewhat more natural, (but that of
> > course would have the problem of being valid in the current RFC3066, and one
> > might assume it was a subdialect of Hiberno-English).
> 
> The forms "en-ie-latn" and "en-latn-ie" *are* valid in RFC 3066; they just
> aren't preregistered, so you have to convince Michael Everson that they are
> sensible.

Silly me. I was thinking of the restrictions on the size of the primary tag and somehow ended up applying them to the first subtag.
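
For what it's worth, the purely syntactic side of this is easy to check mechanically. The following is a minimal sketch, assuming the RFC 3066 grammar of a primary subtag of 1 to 8 letters followed by any number of subtags of 1 to 8 letters or digits. The function name is made up for illustration, and the check says nothing about whether a tag is registered or sensible.

```python
import re

# Assumed grammar (RFC 3066 syntax only):
#   primary subtag: 1-8 ASCII letters
#   later subtags:  1-8 ASCII letters or digits, each preceded by "-"
_RFC3066_SYNTAX = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_wellformed_rfc3066(tag: str) -> bool:
    """Return True if `tag` is syntactically well-formed (hypothetical helper).

    This does not consult the IANA registry, so a True result only means
    the tag has the right shape, not that it is registered.
    """
    return _RFC3066_SYNTAX.fullmatch(tag) is not None
```

Under that assumption both "en-ie-latn" and "en-latn-ie" pass, which is all that "valid" means at the syntactic level.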

> But again, a barrier of script is much larger than a barrier of national
> variation.  I have no trouble with U.K. English, say, but I would be quite
> helpless confronted with English in Greek or Cyrillic (I would have to
> decode them, not read them).

That isn't necessarily more or less difficult; it's just different. The former can be trivial or difficult for a human (there are real-world examples of en-GB or en-IE that would be a lot more difficult for an American to understand than even my most hastily written mails), but is generally quite tricky for a computer. The latter can often be done easily by computer, at least as far as producing a phonetic transliteration back to Latin script.

Indeed, some real-world changes in orthography can be performed very well algorithmically. I'm going to have to abandon the comforting familiarity of en-latn-IE for my examples here.

One would be Old English in roman letters or in futhark. Another would be modern Irish with the sí buailte or modern Irish using the letter h to represent it.
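
As a rough illustration of the Irish case, here is a minimal sketch that rewrites the dotted consonants as consonant plus h. It assumes the dot is encoded as U+0307 COMBINING DOT ABOVE once the text is decomposed, and that only b, c, d, f, g, m, p, s and t take it. The function name is hypothetical, and it handles only this one direction. It also makes no attempt at all-capitals text, where "H" rather than "h" would be wanted.

```python
import re
import unicodedata

# Consonant followed by COMBINING DOT ABOVE (U+0307), as seen after NFD.
# Assumption: only these nine consonants carry the dot in Irish.
_DOTTED = re.compile("([bcdfgmpstBCDFGMPST])\u0307")


def dotted_to_h(text: str) -> str:
    """Rewrite dotted consonants as consonant + 'h' (hypothetical helper)."""
    # Decompose so that e.g. U+1E03 becomes "b" + U+0307.
    decomposed = unicodedata.normalize("NFD", text)
    # Swap the combining dot for a following "h".
    replaced = _DOTTED.sub(r"\1h", decomposed)
    # Recompose so accented vowels return to their usual precomposed form.
    return unicodedata.normalize("NFC", replaced)
```

The point is only that the mapping is mechanical: the language of the text is unchanged, and the rewrite never needs to understand a word of it.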

There are some tasks for which script is the only important thing (drawing characters on a screen), and some for which language is the only important thing (most of the purposes to which we currently put 3066).

Again, while I recognise the need to support those tasks for which both are needed, the more I think about this the more I think it's flawed to directly combine script information with language information.


