Addison Phillips [wM]
aphillips at webmethods.com
Fri Mar 19 19:22:14 CET 2004
Thanks for taking the time to read the draft and provide comments. Some responses are interleaved below.
Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
Internationalization is an architecture.
It is not a feature.
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Doug Ewell
> Sent: vendredi 19 mars 2004 00:53
> To: ietf-languages at iana.org
> Subject: draft-phillips-langtags-01
> I finally got a chance to really READ, not just zip through, the latest
> draft, and now I have a few questions. Please, no flames if these have
> already been answered.
> (1) Use of UN M49 codes
> In previous e-mails, Addison has stated that the draft makes it "very
> clear" that UN numeric country codes are to be used *only* in cases
> where the country's ISO 3166-1 code was recycled, as in the recent case
> involving CS, or for indicating regions like "Eastern Africa" that have
> M49 codes but not ISO 3166 codes. But when I read the draft, it's not
> really all that clear. It looks as though the U.S., let's say, could be
> represented with either US or 001. Perhaps it should be made more
> explicit in the draft.
I think the relevant text is in Section 2.2, bullet 4 of the rules for region codes:
Three digit numeric codes from the UN Standard Country or Area Codes for Statistical Use must be used for countries with ambiguous ISO 3166 alpha-2 codes as defined in Rule 7a in Section 2.3. These codes may also be used for countries or regions for which no corresponding ISO 3166 code has been assigned, including supra-national and sub-national regions. Note: the alphanumeric codes in Appendix X of the UN document must not be used. (At the time this document was created these values match the ISO 3166 alpha-2 codes.)
I recognize that there is no outright prohibition on using the numeric code when an ISO 3166 alpha-2 code is available and unambiguous. Mark and I will discuss adding one.
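To make the intended behavior concrete, here is a minimal sketch of the selection rule with the stricter prohibition applied, i.e. the numeric code is chosen only when the alpha-2 code is missing or ambiguous. The function name, the AMBIGUOUS_ALPHA2 set, and the argument layout are assumptions for illustration and do not appear in the draft; only CS/891 and the "Eastern Africa" case come from this thread.

```python
from typing import Optional

# Hypothetical set of alpha-2 codes treated as ambiguous under Rule 7a.
# Only CS is named in this thread; a real list would come from the draft.
AMBIGUOUS_ALPHA2 = {"CS"}


def region_subtag(alpha2: Optional[str], m49: str) -> str:
    """Pick the region subtag for a country or area.

    alpha2 is the ISO 3166 alpha-2 code, or None when no code is assigned
    (for example a supra-national region). m49 is the UN three-digit code.
    """
    if len(m49) != 3 or not (m49.isascii() and m49.isdigit()):
        raise ValueError(f"not a three-digit UN M49 code: {m49!r}")
    # Numeric code only when there is no alpha-2 code or it is ambiguous.
    if alpha2 is None or alpha2.upper() in AMBIGUOUS_ALPHA2:
        return m49
    # Otherwise the unambiguous alpha-2 code is the one to use.
    return alpha2.upper()
```

Under these assumptions, region_subtag("CS", "891") yields the numeric code for Serbia and Montenegro, region_subtag(None, "014") yields the numeric code for Eastern Africa, and a country with an unambiguous alpha-2 code never gets a numeric subtag.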
> (BTW, I now see my error in a previous post for saying that cs-CS was a
> valid language tag, though unlikely to appear in real life. While valid
> in RFC 3066, it's no longer valid in the present draft.)
> (2) Forbidding reassigned codes
> Section 2.3, rule 7a says that M49 numeric country codes are to be used
> instead of ISO 3166 alpha-2 codes "[i]n the event ISO 3166 assigns a new
> meaning" to the alpha code. The example given is that CS would forever
> refer to Czechoslovakia and 891 would be used for Serbia and Montenegro.
> However, the wording of the rule implies that only codes reassigned *in
> the future* would fall into this category. CS has *already* been
> recycled, which seems to imply
> What about *previously* reassigned country codes, such as AI for
> Anguilla (previously French Afars and Issas, reassigned in the 1980s) or
> SK for Slovakia (previously Sikkim, reassigned in 1993). Would those
> also be forbidden, so that users would have to use the numeric codes for
> Anguilla and Slovakia as well as Serbia and Montenegro? Or is there
> some difference (probably arbitrary) between codes reassigned in 2003
> and those reassigned earlier?
> In short, what is the cutoff date for recycled codes not to be valid?
> Is it:
> - the publication date of RFC 3066 bis (i.e. CS for Serbia and
> Montenegro would be OK)
> - from the beginning of ISO 3166 codes (i.e. AI and SK as well as CS
> would be forbidden)
> - some arbitrary date in-between (AI and SK are admissible, but not CS)
A cutoff needs to be established. I would probably choose 1 January 2003 so that the third choice in your list would apply. Part of the problem here, as I see it, is that the other two examples you cite are less compelling. I would be less militant about prohibiting all reuse if the ISO 3166 MA had a firm policy of a sufficiently long rest period for alpha-2 identifiers.
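A rough sketch of how such a cutoff might be applied, assuming year-level granularity is enough. The table and function names are hypothetical; the years for SK and CS are the ones given in the question above, and AI is left out because only "the 1980s" is stated for it.

```python
# Assumed cutoff: reassignments on or after 1 January 2003 are not honored.
CUTOFF_YEAR = 2003

# Hypothetical table: alpha-2 code -> year ISO 3166 gave it a new meaning.
# Codes that were never reassigned are simply absent.
REASSIGNED_YEAR = {
    "SK": 1993,  # Sikkim -> Slovakia
    "CS": 2003,  # Czechoslovakia -> Serbia and Montenegro
}


def new_meaning_allowed(alpha2: str) -> bool:
    """Return True if the alpha-2 code may be used with its current meaning.

    False means the code keeps its older meaning in language tags and the
    UN numeric code has to be used for the newer one.
    """
    year = REASSIGNED_YEAR.get(alpha2.upper())
    return year is None or year < CUTOFF_YEAR
```

With this cutoff, SK remains usable for Slovakia, while the new meaning of CS is rejected and 891 would be used instead.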
> (3) Registering variant subtags
> As the text points out, whereas RFC 1766/3066 registrations were for an
> entire tag (such as en-boont), RFC 3066 bis registrations are to be for
> the subtag (boont) only. The registration would include the country
> subtag as an informative field. Would the subtag still only be valid
> with the intended language, or would there be a loophole allowing an
> inappropriate combination such as fr-boont? The document didn't make it
> clear to me.
'fr-boont' is a valid tag under rfc3066:bis.
It is also a meaningless tag that one would not use in practice. I would tend to say that it isn't a loophole, it's a feature, though. Previously, when one wrote support for RFC 3066, the tags supported and their meanings were rigidly fixed the moment the compiler fired. With the new design, implementations can recognize subtags as being 'variants', even if the implementation doesn't know the meaning.
The statement "The registration would include the country subtag as an informative field." isn't quite correct. It would be better to say, as the document does, that the registration includes the intended prefix(es) as an informative field. Thus 'boont' has an intended prefix of 'en-'.
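As an illustration of the difference between "valid" and "meaningful", the sketch below recognizes variants by shape alone and treats the intended prefix purely as advice. The shape rule (5 to 8 alphanumeric characters after the first subtag) is an assumption made for this example and is not quoted from the draft; the registry dictionary and function names are likewise hypothetical.

```python
# Hypothetical registry: variant subtag -> intended prefixes (informative only).
INTENDED_PREFIXES = {"boont": ["en-"]}


def looks_like_variant(subtag: str) -> bool:
    # Assumed shape rule, for illustration only.
    return 5 <= len(subtag) <= 8 and subtag.isascii() and subtag.isalnum()


def describe(tag: str):
    """Return (variants, advisories) for a language tag.

    Advisories never make the tag invalid; they only flag a variant that is
    unknown or that appears outside its intended prefix(es).
    """
    lowered = tag.lower()
    subtags = lowered.split("-")
    # Skip the first subtag (the language) and pick out variant-shaped ones.
    variants = [s for s in subtags[1:] if looks_like_variant(s)]
    advisories = []
    for variant in variants:
        prefixes = INTENDED_PREFIXES.get(variant)
        if prefixes is None:
            advisories.append(f"{variant}: variant recognized, meaning unknown")
        elif not any(lowered.startswith(p) for p in prefixes):
            advisories.append(f"{variant}: used outside intended prefix(es)")
    return variants, advisories
```

Here describe("en-boont") reports the variant with no advisories, while describe("fr-boont") still reports 'boont' as a variant but adds one advisory, which matches the point that the tag is valid yet not something one would use.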
> -Doug Ewell
> Fullerton, California