comments on the draft - 2

Tue Jun 8 17:39:25 CEST 2004

All good points that I will add to the issues list for Mark and me to discuss.

Thanks!

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
http://www.webMethods.com
Chair, W3C Internationalization (I18N) Working Group
Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International

Internationalization is an architecture. 
It is not a feature.

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no 
> [mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Peter Constable
> Sent: June 8, 2004 0:23
> To: ietf-languages at alvestrand.no
> Subject: comments on the draft - 2
> 
> 
> Some further comments:
> 
> Section 2.3:
> 
> Item 3 after par. 3:
> 
> <quote>
>    3.  When a language has both an ISO 639-1 2-character code and an ISO
>        639-2 3-character code, you MUST use the ISO 639-1 2-character
>        code.
> </quote>
> 
> I might have suggested before that we should enumerate the precise,
> fixed list of 2-character ISO 639-1 IDs that should be allowed in an
> appendix. These would consist of those that exist at present. This will
> remove any possible concern that a 2-character ID will be added at some
> point to ISO 639-1 where a 3-character ID previously existed in ISO
> 639-2. There has been reference to a "freeze", but I think it is a poor
> idea to make the stability of this protocol depend on another standard
> being unnecessarily constrained, and inappropriate to expect that
> standard to be limited in its ability to meet the needs of some users
> because of concerns that lie within a different, consuming protocol.
> 
> This would mean removing the length NOTE after point 6 (which, while
> carried forward from RFC 3066, I realize, is problematic IMO in that it
> is presented as a quotation yet has no source reference).
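The fixed-list proposal above can be pictured with a small lookup. This is only a sketch: the three-entry table stands in for the full appendix, and the names FROZEN_ALPHA2 and choose_language_id are illustrative assumptions, not anything taken from the draft. The point is that a 2-character ID added to ISO 639-1 after the list is fixed would not change which ID the function returns.

```python
# Hypothetical excerpt of a frozen appendix mapping a 3-character ISO 639-2 ID
# to its 2-character ISO 639-1 ID. A real appendix would enumerate every
# 2-character ID that exists when the list is fixed.
FROZEN_ALPHA2 = {
    "eng": "en",
    "fra": "fr",
    "deu": "de",
}


def choose_language_id(alpha3: str) -> str:
    """Return the frozen 2-character ID if one is listed, else the 3-character ID."""
    key = alpha3.lower()
    # IDs absent from the frozen table keep their 3-character form, even if
    # ISO 639-1 later assigns them a 2-character ID.
    return FROZEN_ALPHA2.get(key, key)


print(choose_language_id("eng"))
print(choose_language_id("haw"))
```

With this shape, stability comes from the table inside the protocol specification and does not depend on ISO 639-1 remaining unchanged.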
> 
> 
> Item 4 after par 3:
> 
> <quote>
>        NOTE: At present
>        all languages that have both kinds of 3-character code also are
>        assigned a 2-character code, and the displeasure of developers
>        about the existence of two different code sets has been
>        adequately communicated to ISO. So this situation will hopefully
>        not arise.
> </quote>
> 
> This comment is provocative and not really necessary. The 22 cases of
> differences have a history; the members of the ISO 639 Joint Advisory
> Committee have for some time been aware of the undesirability of such
> differences, and did not need to be informed by anyone of the
> displeasure of developers to determine that they do not want to create
> any new such cases. I would simply say,
> 
> <suggested>
> NOTE: At present, all languages that have distinct "B" and "T" 
> identifiers in ISO 639-2 are also assigned a 2-character identifier
> in ISO 639-1. It is unlikely that a situation will arise in which 
> distinct "T" and "B" ISO 639-2 identifiers exist but no 2-character
> identifier exists, but should such a situation arise, it will be
> clear which must be used.
> </suggested>
> 
> 
> 
> Section 2.4, par 4: There appears to be ambiguous usage of "tag" between
> the meanings 'a symbolic identifier as defined in this specification'
> and 'a declaration of linguistic properties of information objects'.
> Specifically, some of the bullet points seem to refer to multiple values
> as a "tag":
> 
> <quote>
>    The relationship between the tag and the information it relates to...
> 
>    o  For a single information object, it could be taken as the set of
>       languages...
> 
>    o  For an aggregation of information objects, it should be taken as
>       the set of languages...
> 
>    o  For information objects whose purpose is to provide alternatives,
>       the set of tags associated with it should be regarded as a hint
>       that the content is provided in several languages, and that one
>       has to inspect each of the alternatives in order to find its
>       language or languages. In this case, a tag with multiple
>       languages...
> </quote>
> 
> A tag as defined in this RFC cannot denote multiple languages (unless it
> uses a collective ID from ISO 639-2 -- but I don't think that's what was
> in mind).
> 
> Again, I know this was carried forward from the previous RFC (so I
> should have caught it when I was reviewing the draft for that five years
> ago).
> 
> 
> Section 2.4.1, par 1: Wording could be tightened up (is a language range
> a set or a symbol?). 
> 
> <quote>
>    A Language Range is a set of languages whose tags all begin with the
>    same sequence of subtags. The following definition of language-range
>    is derived from HTTP/1.1 [14].
> 
>       language-range = language-tag / "*"
> </quote>
> 
> What's given in the rule is not a definition but a grammar. The opening
> sentence contains a definition, but the definition describes something
> other than the thing produced by the grammar (one's a set of languages,
> the other is a set of formal-language sentences). Here's a suggested
> revision:
> 
> <suggestion>
>    A Language Range is a set of languages whose tags all begin with the
>    same sequence of subtags. A given language range can be represented
>    by the sequence of subtags that is common to the languages in the
>    given set. The specification for language-range tags is as follows,
>    taken from HTTP/1.1 [14].
> 
>       language-range = language-tag / "*"
> </suggestion>
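To make the distinction between the set and its symbol concrete, here is a minimal sketch of testing whether a tag falls within the set that a language-range denotes. It assumes the prefix-matching rule of RFC 3066 (a range matches a tag if it equals the tag, or equals a prefix of it that ends at a subtag boundary) and case-insensitive comparison. The function name matches_range is an illustrative assumption.

```python
def matches_range(language_range: str, language_tag: str) -> bool:
    """Return True if language_tag belongs to the set denoted by language_range."""
    # The wildcard range denotes every language tag.
    if language_range == "*":
        return True
    range_subtags = language_range.lower().split("-")
    tag_subtags = language_tag.lower().split("-")
    # Comparing whole subtags ensures "en" matches "en-US" but not "eng".
    return tag_subtags[:len(range_subtags)] == range_subtags


print(matches_range("en", "en-US"))
print(matches_range("en", "eng"))
```

Here the string passed as language_range is the symbol, and the tags for which the function returns True form the set.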
> 
> 
> 2.4.3: Is this saying that extensions should be put into alphabetical
> order when *generating* tags, or when *comparing* tags? 
> 
> Also, in par 3, it says "... is correctly ordered...": is "correctly"
> the appropriate word here, or is "canonically" better? The bottom line
> is *can* I tag data or send a request using (e.g.)
> "en-B-ext3-ext2-A-ext1"? Does the RFC permit me to do so or not? 
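If the intent turns out to be that ordering defines a canonical form, a normalizer along the following lines could be applied when generating or comparing tags. This is a sketch under stated assumptions, not the draft's rule: it treats every single-character subtag after the first subtag as the start of an extension, sorts extensions by that character without regard to case, keeps the subtags inside each extension in their original order, and gives no special treatment to private-use sequences. The name order_extensions is hypothetical.

```python
def order_extensions(tag: str) -> str:
    """Return tag with its extensions sorted by their single-character introducer."""
    subtags = tag.split("-")
    head = [subtags[0]]   # primary subtag, never treated as an extension introducer
    extensions = []       # each entry: [introducer, subtag, subtag, ...]
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            # Assumption: a single-character subtag starts a new extension.
            extensions.append([subtag])
        elif extensions:
            extensions[-1].append(subtag)
        else:
            # Subtags before the first extension stay in place.
            head.append(subtag)
    # Python's sort is stable, so the order inside each extension is untouched.
    extensions.sort(key=lambda ext: ext[0].lower())
    return "-".join(head + [s for ext in extensions for s in ext])


print(order_extensions("en-B-ext3-ext2-A-ext1"))
```

Under these assumptions the example tag from the question becomes "en-A-ext1-B-ext3-ext2". Whether the unordered form is still a permitted tag is exactly the open question above.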
> 
> 
> 3.1: Re the registration form: Is there some IETF policy that restricts
> us to ask only for the native name of a language *transcribed into
> ASCII*? 
> 
> 
> 3.2, par 4: Will tags like "zh-Hant" and "en-boont" be marked as
> *obsoleted* or *superseded*? Here you say "obsoleted"; back in section
> 2.2.1 you said "superseded".
> 
> 
> 
> All for now.
> 
> 
> 
> Peter Constable