comments on the draft - 2

Fri Jun 11 20:57:59 CEST 2004

Dear Peter,

Mark and I have considered your comments in this email and have made modifications to the editor's copy (see http://www.inter-locale.com/ID/draft-phillips-langtags-04.html) and the issues list as appropriate. The changes to the editor's copy will be submitted shortly to the IETF, when we have completed work on some other items and comments.

I have placed interlinear comments below with our specific responses to this message.

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
http://www.webMethods.com
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International

Internationalization is an architecture. 
It is not a feature.

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no 
> [mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Peter Constable
> Sent: June 8, 2004 0:23
> To: ietf-languages at alvestrand.no
> Subject: comments on the draft - 2
> 
> 
> Some further comments:
> 
> Section 2.3:
> 
> Item 3 after par. 3:
> 
> <quote>
>    3.  When a language has both an ISO 639-1 2-character code and an ISO
>        639-2 3-character code, you MUST use the ISO 639-1 2-character
>        code.
> </quote>
> 
> I might have suggested before that we should enumerate the precise,
> fixed list of 2-character ISO 639-1 IDs that should be allowed in an
> appendix. These would consist of those that exist at present. This will
> remove any possible concern that a 2-character ID will be added at some
> point to ISO 639-1 where a 3-character ID previously existed in ISO
> 639-2. There has been reference to a "freeze", but I consider it unwise
> to make the stability of this protocol depend on another standard being
> unnecessarily constrained, and inappropriate to expect that standard to
> be limited in its ability to meet the needs of some users because of
> concerns that lie within a different, consuming protocol.
> 
> This would mean removing the length NOTE after point 6 (which, while
> carried forward from RFC 3066, I realize, is problematic IMO in that it
> is presented as a quotation yet has no source reference).

We have rejected this comment. Our reasoning is that this is extant text, and we see no reason to change it at present.
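
For readers following the thread, the rule quoted above (use the ISO 639-1 2-character code whenever a language has one) can be illustrated with a minimal Python sketch. It is not part of the draft. The table `ISO_639_2_TO_1` and the function `preferred_language_subtag` are hypothetical names introduced only for illustration, and the table is a tiny hand-picked excerpt, not the full code list.

```python
# Hypothetical excerpt of a code table. A real implementation would load the
# complete ISO 639-2 to ISO 639-1 mapping from the published code lists.
ISO_639_2_TO_1 = {
    "eng": "en",
    "fra": "fr",  # ISO 639-2/T (terminology) code for French
    "fre": "fr",  # ISO 639-2/B (bibliographic) code for French
    "deu": "de",  # ISO 639-2/T code for German
    "ger": "de",  # ISO 639-2/B code for German
}


def preferred_language_subtag(code: str) -> str:
    """Return the 2-character code when the table has one, else the input.

    Codes are compared case-insensitively and returned in lowercase.
    """
    code = code.lower()
    return ISO_639_2_TO_1.get(code, code)


print(preferred_language_subtag("fre"))  # fr
print(preferred_language_subtag("haw"))  # haw (no 2-character code in the table)
```

Peter's concern corresponds to the table gaining a new entry later: a tag that was valid with a 3-character code yesterday would then have to be written with the 2-character code.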
> 
> 
> Item 4 after par 3:
> 
> <quote>
>        NOTE: At present
>        all languages that have both kinds of 3-character code also are
>        assigned a 2-character code, and the displeasure of developers
>        about the existence of two different code sets has been
>        adequately communicated to ISO. So this situation will hopefully
>        not arise.
> </quote>
> 
> This is a provocative comment that isn't really necessary. The 22 cases
> of differences had a history; the members of the ISO 639 Joint Advisory
> Committee have for some time been aware of the undesirability of such
> differences, and they did not need to be informed by anyone
> regarding the displeasure of developers to determine that they do not
> want to create any new such cases. I would simply say,
> 
> <suggested>
> NOTE: At present, all languages that have distinct "B" and "T" 
> identifiers in ISO 639-2 are also assigned a 2-character identifier
> in ISO 639-1. It is unlikely that a situation will arise in which 
> distinct "T" and "B" ISO 639-2 identifiers exist but no 2-character
> identifier exists, but should such a situation arise, it will be
> clear which must be used.
> </suggested>

Although we are sympathetic to your comment, we have rejected it as an unnecessary modification. The current text may be slightly confrontational, but serves the purpose and is not original to this document.
> 
> 
> 
> Section 2.4, par 4: There appears to be ambiguous usage of "tag" between
> the meanings 'a symbolic identifier as defined in this specification'
> and 'a declaration of linguistic properties of information objects'.
> Specifically, some of the bullet points seem to refer to multiple values
> as a "tag":
> 
> <quote>
>    The relationship between the tag and the information it relates to...
> 
>    o  For a single information object, it could be taken as the set of
>       languages...
> 
>    o  For an aggregation of information objects, it should be taken as
>       the set of languages...
> 
>    o  For information objects whose purpose is to provide alternatives,
>       the set of tags associated with it should be regarded as a hint
>       that the content is provided in several languages, and that one
>       has to inspect each of the alternatives in order to find its
>       language or languages. In this case, a tag with multiple languages...
> </quote>

We have accepted this comment. We have changed the text in each example to reflect "the set of tags associated with (the example)" so that it is clear.

> 
> A tag as defined in this RFC cannot denote multiple languages (unless it
> uses a collective ID from ISO 639-2 -- but I don't think that's what was
> in mind).
> 
> Again, I know this was carried forward from the previous RFC (so I
> should have caught it when I was reviewing the draft for that five years
> ago).
> 
> 
> Section 2.4.1, par 1: Wording could be tightened up (is a language range
> a set or a symbol?). 
> 
> <quote>
>    A Language Range is a set of languages whose tags all begin with the
>    same sequence of subtags. The following definition of language-range
>    is derived from HTTP/1.1 [14].
> 
>       language-range = language-tag / "*"
> </quote>
> 
> What's given in the rule is not a definition but a grammar. The opening
> sentence contains a definition, but the definition describes something
> other than the thing produced by the grammar (one's a set of languages,
> the other is a set of formal-language sentences). Here's a suggested
> revision:
> 
> <suggestion>
>    A Language Range is a set of languages whose tags all begin with the
>    same sequence of subtags. A given language range can be represented by
>    the sequence of subtags that is common to the languages in the given 
>    set. The specification for language-range tags is as follows, taken 
>    from HTTP/1.1 [14].
> 
>       language-range = language-tag / "*"
> </suggestion>

We have accepted this comment. We note that the original text was quite subtle, in that there was a *real* semantic difference between a "Language Range" (concept, set of languages) and a "language-range" (symbol for that set). We have modified the text to make this distinction clearer and provided additional examples, etc.
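
To make the concept/symbol distinction concrete, here is a minimal Python sketch of how a "language-range" symbol selects the tags in its "Language Range". It assumes the prefix-matching rule as we understand it from RFC 3066 and HTTP/1.1: a range matches a tag if the two are equal, or if the range is a prefix of the tag and the next character in the tag is "-"; the range "*" matches every tag. The function name `range_matches` is hypothetical and the sketch is not text from the draft.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Return True if `tag` falls within the set denoted by `language_range`.

    Matching is case-insensitive. The wildcard "*" matches any tag.
    """
    if language_range == "*":
        return True
    r = language_range.lower()
    t = tag.lower()
    # Either the whole tag equals the range, or the range is a prefix that
    # ends exactly at a subtag boundary ("en" matches "en-US" but not "eng").
    return t == r or t.startswith(r + "-")


tags = ["en", "en-US", "eng", "zh-Hant"]
print([t for t in tags if range_matches("en", t)])  # ['en', 'en-US']
```

Here the string "en" is the symbol, and the list it selects stands in for the set of languages that the symbol represents.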
> 
> 
> 2.4.3: Is this saying that extensions should be put into alphabetical
> order when *generating* tags, or when *comparing* tags? 
> 
> Also, in par 3, it says "... is correctly ordered...": is "correctly"
> the appropriate word here, or is "canonically" better? The bottom line
> is *can* I tag data or send a request using (e.g.)
> "en-B-ext3-ext2-A-ext1"? Does the RFC permit me to do so or not? 

We have accepted this comment. We have made it clear that comparing tags using the default fallback mechanism requires canonicalization, and that canonical ordering is ("merely") recommended, albeit strongly, when generating tags. The resulting text has been modified in various ways, so a re-read of that section is called for. One side effect is that we now discuss the difference between validating and well-formed language tag processors.
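
As an illustration of the ordering question, here is a minimal Python sketch that reorders the extensions in Peter's example tag before comparison. It rests on several assumptions that should be checked against the revised section: that each extension is introduced by a single-character subtag, that extensions are ordered alphabetically by that single character (case-insensitively), and that the order of subtags inside one extension is left untouched. It does not give the private-use "x" subtag any special treatment. The names `canonicalize_extensions` and `same_tag` are hypothetical.

```python
def canonicalize_extensions(tag: str) -> str:
    """Return `tag` with its extensions sorted by their single-character subtag."""
    head = []        # subtags that come before the first extension
    extensions = []  # each entry is [singleton, subtag, subtag, ...]
    for subtag in tag.split("-"):
        if len(subtag) == 1 and head:
            # A single-character subtag after the first position starts a
            # new extension (assumption stated in the lead-in).
            extensions.append([subtag.lower()])
        elif extensions:
            extensions[-1].append(subtag)
        else:
            head.append(subtag)
    extensions.sort(key=lambda ext: ext[0])
    return "-".join(head + [s for ext in extensions for s in ext])


def same_tag(a: str, b: str) -> bool:
    """Compare two tags after reordering extensions, ignoring case."""
    return canonicalize_extensions(a).lower() == canonicalize_extensions(b).lower()


print(canonicalize_extensions("en-B-ext3-ext2-A-ext1"))  # en-a-ext1-b-ext3-ext2
print(same_tag("en-B-ext3-ext2-A-ext1", "en-a-ext1-b-ext3-ext2"))  # True
```

Under these assumptions, a generator may emit "en-B-ext3-ext2-A-ext1", but a processor comparing tags would first bring both sides into the same order.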
> 
> 
> 3.1: Re the registration form: Is there some IETF policy that restricts
> us to ask only for the native name of a language *transcribed into
> ASCII*? 

We have rejected this comment. As far as we know, the business of the IETF and IANA is conducted in ASCII.
> 
> 
> 3.2, par 4: Will tags like "zh-Hant" and "en-boont" be marked as
> *obsoleted* or *superseded*? Here you say "obsoleted"; back in section
> 2.2.1 you said "superseded".

We have accepted this comment. The word is "superseded".
> 
> 
> 
> All for now.
> 
> 
> 
> Peter Constable