draft-phillips-langtags-08, process, specifications,
"stability", and extensions
ned.freed at mrochek.com
Wed Jan 5 06:58:41 CET 2005
> ned.freed at mrochek.com scripsit:
> > I know of two other wrinkles in the RFC 1766 world:
> Are you aware that RFC 1766 has been obsolete for four years now?
Of course I am.
> > (2) SGN- requires special handling, in that SGN-FR and SGN-EN are in fact
> > sufficiently different languages that a primary tag match should not be
> > taken to be a generic match.
> The same is true of the various registered zh-* tags.
Yes, forgot to mention that one. It is actually different and more important in
that the use-cases aren't the same as those for sign languages.
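
To make the SGN- and zh-* point concrete, here is a minimal Python sketch of
prefix matching with an exception list. It is an illustration only, not code
from any deployed implementation: the names NO_GENERIC_MATCH and
prefix_matches are hypothetical, and treating "a bare primary tag never
matches its longer forms" as the special handling is one assumed reading of
the requirement.

```python
# Hypothetical exception list: primary subtags whose longer forms are
# distinct enough that a bare primary-tag match should not count.
NO_GENERIC_MATCH = {"sgn", "zh"}


def prefix_matches(language_range, tag):
    """Return True if language_range matches tag by exact or prefix match.

    A prefix only counts when it ends at a "-" boundary, and a bare
    primary subtag listed in NO_GENERIC_MATCH matches nothing but itself.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == t:
        return True
    if not t.startswith(r + "-"):
        return False
    # The prefix test passed; refuse it for the special primary subtags.
    return r not in NO_GENERIC_MATCH
```

With this sketch, prefix_matches("en", "en-US") holds, while
prefix_matches("sgn", "sgn-FR") and prefix_matches("zh", "zh-min-nan") do not.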
> > (a) Extension tags appear as the first subtags, and as such have to
> > be taken into account when looking for country subtags.
> Finding country codes is straightforward: any non-initial subtag of two letters
> (not appearing to the right of "x-" or "-x-") is a country code.
> This is true in RFC 1766, RFC 3066, and the current draft.
On the contrary, in RFC 3066 the rule is "any 2 letter value that appears as
the second subtag is a country code". The rule in the new draft is either the
formulation you give above or "any 2 letter value that appears as a subtag
after the initial subtag and some number of 3 and 4 letter subtags".
These aren't the same.
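
A short Python sketch of the three formulations may help show where they
diverge. The function names are made up for this illustration, the rules are
coded only as they are worded in this message (not checked against the full
grammar of either document), and the sample tags are arbitrary strings chosen
to exercise the rules, not claims about what is registered or valid.

```python
def _is_alpha2(subtag):
    # A two-letter ASCII alphabetic subtag.
    return len(subtag) == 2 and subtag.isascii() and subtag.isalpha()


def country_second_subtag(tag):
    """RFC 3066 rule as stated above: a 2-letter second subtag."""
    parts = tag.split("-")
    if len(parts) >= 2 and _is_alpha2(parts[1]):
        return parts[1].upper()
    return None


def country_any_noninitial(tag):
    """Quoted rule: any non-initial 2-letter subtag not to the right of x."""
    for i, sub in enumerate(tag.split("-")):
        if sub.lower() == "x":
            return None  # everything after the singleton "x" is private use
        if i > 0 and _is_alpha2(sub):
            return sub.upper()
    return None


def country_after_3_4(tag):
    """Alternative rule: 2 letters after only 3- and 4-letter subtags."""
    for sub in tag.split("-")[1:]:
        if _is_alpha2(sub):
            return sub.upper()
        if not (len(sub) in (3, 4) and sub.isascii() and sub.isalpha()):
            return None  # some other kind of subtag intervened
    return None
```

All three agree on "en-US". For "sr-Latn-CS" the second-subtag rule finds
nothing while the other two find CS. For a string like "en-boont-US" only the
"any non-initial subtag" rule finds US, which is the sort of disagreement an
implementor has to resolve.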
> > (b) Script tags change the complexion of the matching problem significantly,
> > in that they can interact with external factors like charset information
> > in odd ways.
> Can you clarify this? Charset information neither specifies nor necessarily
> restricts (except in text/plain) the script used to write a document.
And what if you're dealing with text/plain, as many applications do?
Just because something doesn't necessarily do something doesn't mean it
never does it.
> > (c) UN country numbers have been added (IMO for no good reason), requiring
> > handling similar to country codes.
> They provide for supranational language varieties and for stability in
> country codes which is inappropriate for ISO 3166 alphabetic codes (which
> are codes for country *names*).
I'm aware of what they provide (although I see no explanation of this
in the draft). I'm just not convinced that their addition is warranted.
> > The bottom line is that while I know how to write reasonable code to do RFC
> > 1766 matching (and have in fact done so for widely deployed software), I
> > haven't a clue how to handle this new draft competently in regards to
> > matching.
> The draft describes only the RFC 1766 (3066) algorithm, without excluding
> other algorithms to be defined later.
Well, maybe I'm missing something obvious, but I see nothing in RFC 3066 that
qualifies as a description of a matching algorithm. The new draft does include
such a description in section 2.4.2 - an improvement - but leaves any number of
details open. And we all know where the devil lives.
Side note: I don't think item 4 really belongs in the list in section 2.4.2.
It is a warning to implementors, not part of the matching mechanism.