jjc at hplb.hpl.hp.com
Tue Mar 9 09:27:43 CET 2004
I have some comments on this draft - particularly section 2.3!
1) I like the generativity
2) I am uncomfortable with the defaults,
particularly point 2.
I worry that the lang tag 'fr' becomes ambiguous with these rules -
between 'fr-FR' and 'fr' with location undefined or unknown or unimportant.
I also suspect that many British English speakers could interpret this
to mean 'en' == 'en-GB' while over the pond 'en' == 'en-US'.
IIRC this problem has been there since RFC 1766. As 3066bis is a move
forward, and I don't have a solution, I am not particularly seeking any change.
I think we already have the situation where different people mean
different things with the same tag - not good.
3) 2.3 point 5
Hmmm, how about 'und-latn': I can probably write a simple program to
determine the script of a string, and it is probably useful in some cases
to know that the script is at least something you can read (probably more
so with pictograms). An alternative would be to allow the primary subtag to
be omitted, e.g. allow 'latn' as a full tag.
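A rough sketch of the kind of simple program meant here, in Python with
only the standard unicodedata module. It is an approximation under stated
assumptions: it takes the first word of each letter's Unicode character
name as a stand-in for its script, which is not the real Unicode Script
property and will mislabel some characters (fullwidth Latin letters, for
instance). The names guess_script, und_tag and SCRIPT_SUBTAGS are made up
for illustration, and the mapping covers only three scripts.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping from the heuristic label to a four-letter script
# code, lowercased to match the 'und-latn' spelling used in this message.
SCRIPT_SUBTAGS = {"LATIN": "latn", "CYRILLIC": "cyrl", "GREEK": "grek"}


def guess_script(text):
    """Return the most common script label among the letters of text.

    The label is the first word of the Unicode character name, e.g.
    'LATIN' for 'LATIN SMALL LETTER A'. Returns None if text has no letters.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, whitespace
        try:
            name = unicodedata.name(ch)
        except ValueError:
            continue  # character has no name in the database
        counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def und_tag(text):
    """Build an 'und-<script>' style tag, or None if the script is unknown."""
    subtag = SCRIPT_SUBTAGS.get(guess_script(text))
    return None if subtag is None else "und-" + subtag
```

For example, und_tag("bonjour") gives 'und-latn' without the program
knowing anything about which language the string is in.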
4) 2.3 point 7a
The use of surrogates may be necessary. It might be worth reserving
some of the private use space, e.g. the example uses qx, which has earlier
been described as one of the 'user-assigned codes' (section 2.2). Or simply
noting, in 2.2, that some provisions of 3066bis might actually assign some
of the private use codes for public use. (I am thinking of the poor user
who was making genuinely private use of 'qx' before it was taken up as a
surrogate, in the hypothetical example).