draft-phillips-langtags-08, process, specifications,
"stability", and extensions
ned.freed at mrochek.com
Thu Jan 6 15:44:38 CET 2005
> > Rather, the rule is simply that a country code, if present,
> > always appears as a two letter second subtag. The new draft changes this
> > rule, so applications that pay attention to country codes in language tags
> > have to change and the new algorithm for finding the country code is
> > trickier.
> Your text above says (a) "if there is a country code in the tag, it is the
> second subtag". That is not what the text of RFC 3066 actually says, which is:
> > The following rules apply to the second subtag:
> > All 2-letter subtags are interpreted as ISO 3166 alpha-2 country...
> That is, it says (b) "if a second subtag has 2 letters, then it is an ISO
> 3166 code", which is not the same as (a). (It is almost, but not quite, the
> converse.)
Fine, whatever.
> The current RFC certainly does not forbid the use of country
> codes in other positions in language tags. One could absolutely register
> en-Latin-US, for example, meaning English as spoken in the US written in
> Latin script.
Sure, but my point was, is, and always has been that any 3066-compliant
implementation won't see this as a country code (unless it is table driven,
which brings up its own set of issues).
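To make this concrete, here is a minimal sketch of how a simple, non-table-driven RFC 3066 reader might look for the country code. The function name rfc3066_country is hypothetical and this is an illustration only, not code from any actual implementation; it assumes the reader applies nothing more than the "2-letter second subtag" rule quoted above.

```python
def rfc3066_country(tag):
    """Return the country code a simple RFC 3066 reader would see, or None.

    Hypothetical helper: it only applies the rule that a 2-letter second
    subtag is an ISO 3166 alpha-2 country code, and uses no registry table.
    """
    subtags = tag.split("-")
    if len(subtags) < 2:
        return None
    second = subtags[1]
    # Only the second subtag is examined; later subtags are opaque here.
    if len(second) == 2 and second.isascii() and second.isalpha():
        return second.upper()
    return None


print(rfc3066_country("en-US"))        # US
print(rfc3066_country("en-Latin-US"))  # None
```

With en-Latin-US the second subtag is "Latin", so the check never reaches "US" and no country code is reported.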
> There has been a lot of noise on this issue, and too few concrete examples.
No, what there has been is a lot of discussion of a real problem with no
apparent recognition of it as such by the draft authors. Your pejorative
characterization of this as "noise" does not make it so.
> In the so-called 3066bis draft, we have striven very hard to ensure that:
> (c) Every single tag that could be generated under RFC 3066bis is a tag that
> could have been registered under RFC 3066.
True but irrelevant.
> Thus if someone wrote a parser that is future-compatible -- that could parse
> all RFC 3066 language tags including those registered after the parser was
> deployed -- then that parser can handle all 3066bis language tags. This is a
> significant advance over RFC 3066, whose registered (not generated) language
> tags are atomic, and cannot be effectively parsed at all. 3066bis adds more
> structure so as to allow effective parsing of tags.
> If you *can* come up with tags that would show that (c) is invalid, that
> would be a concrete case that we would have to make adjustments in the draft
> for.
(c) is frankly not an issue I care one whit about. (Perhaps I should, but I
don't.) I don't register tags. I write code that processes, and more to the
point matches, tags. That's why I have issues with this draft.
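As an illustration of the matching side, here is a minimal sketch assuming the prefix-matching rule for language ranges in RFC 3066 section 2.5 (a range matches a tag if it equals the tag, or equals a prefix of the tag that is followed by "-"). The name range_matches is hypothetical, and the sketch leaves out the special "*" range.

```python
def range_matches(language_range, tag):
    """Prefix match of a language range against a tag, case-insensitively.

    Hypothetical helper sketching the RFC 3066 section 2.5 rule; the
    special "*" range is deliberately not handled here.
    """
    r = language_range.lower()
    t = tag.lower()
    # Match on exact equality, or on a prefix that ends at a "-" boundary.
    return t == r or t.startswith(r + "-")


print(range_matches("en", "en-Latin-US"))     # True
print(range_matches("en-US", "en-US"))        # True
print(range_matches("en-US", "en-Latin-US"))  # False
```

Under this rule a range of en-US does not match en-Latin-US, because the country code is no longer in the second position.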
> Moreover, all the talk about this being *too* complex is far overblown.
Again, your pejorative dismissal of other people's concerns does not
mean your position is valid.
> All
> 3066bis language tags can be parsed, including all the grandfathered codes,
> with a very short piece of code, or even with a regular expression (such as
> in Perl).
Of course you can write a short piece of code to parse this stuff. It's what you
do with it after you parse it that's a problem.
> This is not rocket science.
Parsing almost never is. But simply parsing these tags is not, and never has
been, the issue.
Ned