Comments on draft-04

Doug Ewell dewell at adelphia.net
Mon Jul 5 20:40:58 CEST 2004


Here are some comments on draft-04.  Some of these topics may be covered
in the most recent editor's draft-05.

Technical:

1.  Section 2.3, bullet 3 says users should use canonical subtags
instead of aliases; for example, "he" instead of "iw."  So far, so good.

Appendix C sets a date of March 1995, the publication date of RFC 1766,
for existing codes to be considered canonical and their replacements to
be considered "aliases."  The intent is that since language tags could
have been used as far back as March 1995, any codes that could have been
used in such tags must remain canonical.

To me this is wrong-headed, and less useful than the previous notion of
using January 2003 as the cutoff point, as proposed in previous drafts
of RFC 3066bis.

It means users are expected to use "ZR" instead of "CD" for the
Democratic Republic of the Congo, and "TP" instead of "TL" for
Timor-Leste, as well as the familiar case of "YU" for Yugoslavia
(presumably the old pre-1990s country with six republics, not the modern
Serbia and Montenegro, which of course is "891").  It is certainly
appropriate that such codes, used as far back as 1995-03, should
continue to be valid, but why do they need to be canonical?  There was
no concept of "canonical" language tags in RFC 1766.  Nothing is broken
by setting the cutoff point to 2003-01 instead.

There's more.  What about languages that originally had an alpha-3 code
in the original ISO 639, and then subsequently had an alpha-2 code
assigned sometime between 1995 and 2000 (when the RA-JAC policy to stop
doing this, cited in Section 2.2, went into effect)?  There are 37 such
codes.  Should the alpha-3 codes be made canonical over the alpha-2
codes simply because the former existed in 1995-03, when RFC 1766 was
published, and the latter did not?  This is exactly the same as
preferring "ZR" over "CD".

Setting the cutoff point for RFC 3066bis to 1995, before the
ramifications of certain code list changes were fully understood and
before modern stability policies were in effect, seems inappropriate.
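
To make the difference between the two cutoff dates concrete, here is a minimal sketch of the rule as described above: whichever code was current on the cutoff date is canonical, and a later replacement is only an alias. The table, the function name, and the change dates are assumptions for illustration (approximate, and not taken from the draft or from the ISO 3166 change records).

```python
from datetime import date

# Hypothetical records: (old code, new code, approximate date of the change).
# The dates are illustrative placeholders, not authoritative values.
REPLACEMENTS = [
    ("ZR", "CD", date(1997, 7, 1)),
    ("TP", "TL", date(2002, 11, 1)),
]


def canonical(code, cutoff):
    """Return the canonical form of `code` under the given cutoff date.

    A replacement made on or before the cutoff makes the new code
    canonical; a replacement made after it leaves the old code canonical
    and turns the new code into an alias.
    """
    for old, new, changed in REPLACEMENTS:
        if code in (old, new):
            return new if changed <= cutoff else old
    # Codes with no recorded replacement are their own canonical form.
    return code


RFC_1766_CUTOFF = date(1995, 3, 1)   # cutoff in Appendix C of draft-04
EARLIER_CUTOFF = date(2003, 1, 1)    # cutoff proposed in previous drafts

for code in ("CD", "TL"):
    print(code, canonical(code, RFC_1766_CUTOFF), canonical(code, EARLIER_CUTOFF))
```

Under these assumed dates, the 1995 cutoff maps "CD" and "TL" back to "ZR" and "TP", while the 2003 cutoff keeps the current codes canonical and still accepts the old ones as input.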

2.  In Section 3.3, the rules about language ranges and registered
subtags have me confused.  The text says that "boont" is intended for
use with the language range "en" since Boontling is a dialect of
English, which makes sense.  It also says that "any registered subtag
MAY be incorporated into a variety of language tags," which implies that
I am also allowed to create "zh-boont" although of course it would be
silly to do so.  (The phrase "a variety of" isn't as clear as possible
here; does it mean "any language tag" or "some pre-determined set of
language tags"?)

A few paragraphs later, the draft says that a request to add "de" to the
language range for the subtag "nedis" WOULD be rejected (emphasis mine)
because it would change the meaning of "nedis."  Wouldn't this be up to
the reviewer on a case-by-case basis?  What if there really were such a
thing as a Natisone dialect of German?  Would this affect my ability to
write "de-nedis" in any case?
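
One possible reading of the text, sketched below, is that the language range is advisory: a tag such as "zh-boont" is still well-formed, but it falls outside the range the subtag was registered for. The registry excerpt and the function name are hypothetical; the "sl" range for "nedis" is assumed from the existing registered tag "sl-nedis" and is not stated in the passages quoted above.

```python
# Hypothetical registry excerpt: variant subtag -> language ranges it is
# intended to be used with.
INTENDED_RANGES = {
    "boont": ["en"],
    "nedis": ["sl"],  # assumption based on the registered tag "sl-nedis"
}


def in_intended_range(tag):
    """Return True if every registered variant in `tag` is used within
    one of its intended language ranges (simple prefix matching)."""
    lowered = tag.lower()
    # Skip the primary language subtag; look only at what follows it.
    for subtag in lowered.split("-")[1:]:
        ranges = INTENDED_RANGES.get(subtag)
        if ranges is None:
            continue  # not a registered variant in this sketch
        matches = any(
            lowered == r or lowered.startswith(r + "-") for r in ranges
        )
        if not matches:
            return False
    return True


for tag in ("en-boont", "en-US-boont", "zh-boont", "sl-nedis", "de-nedis"):
    print(tag, in_intended_range(tag))
```

Whether a False result here should make a tag invalid, or merely discouraged, is exactly the point the draft leaves unclear.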

3.  Page 6 says, referring to IANA-registered primary language subtags
of 4 to 15 characters: "Future registrations of this type will be
discouraged."

Page 7 says: "One of the grandfathered IANA registrations is
'i-enochian.'  The subtags 'enochian' could be registered as a primary
language subtag (assuming that ISO 639 does not register this language
first), making tags such as 'enochian-AQ' and 'enochian-Latn' valid."

Could the example in the second passage be seen as implicitly *encouraging*
the registration of primary language subtags, even though the first
passage says this would be discouraged?

Would "i-enochian" continue to be canonical even if "enochian" were
registered?  If so, would there be a real benefit to registering
"enochian"?

4.  Page 9 says: "There may be at most one region subtag in a language
tag."

Similar sections for the primary language subtag and script subtag do
not include such a passage, but obviously the restriction applies to all
three.  Should the inclusion or non-inclusion of this note be consistent
across the three sections?  (semi-editorial)

5.  Appendix C speaks of tags being "marked as 'superseded' by this
document."  I searched through the description of the IANA registry in
Section 3.2 and couldn't find any provision for marking tags in this
way.  How would this be done?

Editorial:

6.  There are still several spelling errors that should be caught ASAP
before being enshrined in the RFC.  There are also a few odd grammatical
constructions.  I can submit these off-line if desired.

7.  There are still a few instances of "ISO639" and "RFC3066" without a
space.  These should be corrected.  Also, "alpha2" appears on page 9;
this should be "alpha-2."
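
These three spellings are regular enough to catch mechanically. The following is a minimal sketch, not a tool used for the draft; the pattern and function names are my own, and it would also rewrite citation anchors such as "[RFC3066]" if the document uses that form deliberately, so the output should be reviewed by hand.

```python
import re

# "ISO639" / "RFC3066": a standard's prefix fused directly to its number.
FUSED_NAME = re.compile(r"\b(ISO|RFC)(\d+)\b")
# "alpha2" / "alpha3": missing hyphen before the digit.
FUSED_ALPHA = re.compile(r"\balpha(\d)\b")


def normalize(text):
    """Insert the missing space or hyphen in the spellings noted above."""
    text = FUSED_NAME.sub(r"\1 \2", text)
    return FUSED_ALPHA.sub(r"alpha-\1", text)


print(normalize("ISO639 and RFC3066 use alpha2 codes"))
```

Text that is already spelled correctly ("ISO 639", "alpha-2") passes through unchanged.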

8.  In Appendix B, the examples of "tags that use extensions" should not
have bullets.  All of the other examples are indented only, without
bullets.

9.  Where a standard or RFC is listed in square brackets, followed by a
real bracketed numeric citation, the brackets should be removed from the
name of the standard.  For instance, in:

"The syntax of this tag in ABNF [RFC 2234] [11] is:"

the brackets should be removed from "RFC 2234" since they do not
indicate a citation.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/



