draft-05: editorial comments (2)

Peter Constable petercon at microsoft.com
Sat Aug 28 00:13:15 CEST 2004

> From: Addison Phillips [wM] [mailto:aphillips at webmethods.com]

> "Were" is accurate, since it (perpetually) indicates origin of the

It may be historically accurate, but the present tense in English is
appropriate in standards for describing on-going status.

> It doesn't make any real difference though and I'll change it.


> Yes, I see your point. However "privately defined" doesn't convey it
> completely either, I think. How about:
>  "The single character subtag "x" as the primary subtag indicates that
>  the language tag consists solely of subtags whose meaning is defined
>  by private agreement. See [#privateuse]."

I guess that's OK. Even though the 'FR' in x-afnoric-FR might mean
'France', that can only be because there is a private agreement that it
take its meaning from ISO 3166, not because this standard says so.

> > I think the "hopefully" clause can be worded more strongly:
> >
> > "At present all languages that have both kinds of 3-character code
> > are assigned a 2-character code; it is not expected that future
> > assignments of this nature will arise."
> This is text from RFC3066 that we previously toned down at your
> request :-).

I requested that? (Go figure.)

> > I've suggested before that a better way to deal with this is simply
> > to limit the alpha-2 IDs to a fixed set, and since the reference to ISO
> > 639-1 has now been made indirect (valid subtags are taken from the
> > registry, and the content of the registry is controlled), this is
> > possible, and very simple. For instance, if the ISO 639/RA-JAC
> > changed their mind and added 'ha' for Hawaiian (for example), then the
> > maintainer of this registry could simply choose not to add 'ha' to the
> > registry.
> Indeed, except that there might (although it appears to be exceedingly
> unlikely) be a language given an alpha-2 from the get-go. No reason to
> forbid that. The ambiguity/stability rules in section 3 handle all of

It is certainly possible that the JAC could assign an alpha-2 and
alpha-3 concurrently, as was done last year for Haitian Creole.

Rather than saying "the JAC has promised never to add an alpha-2 for
Hawaiian" (or, rather, the generalization thereof), you could make
things completely clear by giving an active statement of how *this*
standard will be applied:

"In order to avoid instability of the canonical form of tags, if a
2-character code is added to ISO 639-1 for a language for which a
3-character code was already included in ISO 639-2, the 2-character code
will not be added as a subtag in the registry."

And, I would instruct the RA to add a *comment* for the alpha-2 ID
making clear that it is not part of the registry; e.g.

language; ht; Haitian; 2004-07-06; ;
language; hu; Hungarian; 2004-07-06; ;
# hw -- this code shall not be used; for Hawaiian, use "haw"
language; hy; Armenian; 2004-07-06; ;
language; hz; Herero; 2004-07-06; ;
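
To illustrate why a comment line is enough, here is a minimal sketch of how a consumer might read records like the ones above. It assumes the semicolon-delimited layout shown in the example and '#' as the comment marker; the function name parse_registry and the SAMPLE string are hypothetical, not anything the draft defines.

```python
def parse_registry(text):
    """Return {subtag: description} for 'language' records.

    Assumed layout (taken from the example above, not from the draft):
    type; subtag; description; date; ;
    Lines starting with '#' are comments and register nothing.
    """
    subtags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # a comment documents a code without registering it
        fields = [field.strip() for field in line.split(";")]
        if len(fields) >= 3 and fields[0] == "language":
            subtags[fields[1]] = fields[2]
    return subtags


SAMPLE = """\
language; ht; Haitian; 2004-07-06; ;
language; hu; Hungarian; 2004-07-06; ;
# hw -- this code shall not be used; for Hawaiian, use "haw"
language; hy; Armenian; 2004-07-06; ;
language; hz; Herero; 2004-07-06; ;
"""

registry = parse_registry(SAMPLE)
print(sorted(registry))  # the commented-out 'hw' is absent
```

Because validity is decided by the registry alone, a code that appears only in a comment is never a valid subtag, whatever ISO 639-1 later contains.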

> > Section 2.2: 'Registration of extended language subtags and
> > use MUST NOT be permitted.' Surely the appropriate wording here is
> > '...is not permitted.' Using "MUST NOT" suggests that the choice
> > regarding what kinds of registrations are permitted is potentially
> > up to users of the spec.
> IIRC, we wanted to use normative language here. How about:
>   o Extended language subtags will not be registered except by
>     revision of this document.
>   o Extended language subtags MUST NOT be used to form language tags
>     except by revision of this document.

That's better.

> > Section 2.2: 'ISO 15924[2]--"Codes for the representation of the names
> > of scripts": alpha-4 script codes' -- the punctuation here is weird: the
> > colon looks like it should be inside the quotation marks. Suggested
> > revision:
> >
> > '"Codes for the representation of the names of scripts" (alpha-4
> > codes)'
> Alas we don't make these names up... I'll double check the names.

It wasn't the name; it was the punctuation you used in proximity to the
quoted name.

> Okay, building on the above, how about:
>      Example: In the tag "fr-a-Latn", the subtag
>     'Latn' does not represent the script subtag 'Latn' defined
>      in the IANA Language Subtag Registry. Its meaning is defined
>      by the extension 'a'.

Looks good.
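
A rough sketch of the distinction, assuming only that subtags are separated by '-' and that a single-character subtag starts a sequence whose meaning is defined elsewhere. The helper split_at_singleton is hypothetical and deliberately simplified: it stops at the first single-character subtag and does not implement the draft's full grammar.

```python
def split_at_singleton(tag):
    """Split a tag at its first single-character subtag.

    Returns (registry_subtags, singleton, following_subtags).
    Only the first part would be looked up in the subtag registry;
    the subtags after the singleton take their meaning from the
    extension (or private agreement) that the singleton identifies.
    """
    subtags = tag.split("-")
    for index, subtag in enumerate(subtags):
        if len(subtag) == 1:
            return subtags[:index], subtag, subtags[index + 1:]
    return subtags, None, []


print(split_at_singleton("fr-a-Latn"))     # (['fr'], 'a', ['Latn'])
print(split_at_singleton("x-afnoric-FR"))  # ([], 'x', ['afnoric', 'FR'])
```

In "fr-a-Latn" the 'Latn' lands in the third element, so it is never interpreted as the registered script subtag.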

> > BTW, don't you need to limit possible singletons to exclude "y" or
> Why? Alpha order except 'x' sorts after 'z'. For clarity we should
> probably add a rule that says the private use sequence introduced by
> 'x' is at the end (it's normatively defined in 2.2 #sources).

The right thing to do is ensure that "x" isn't ever referred to as a
singleton (it currently is), and isn't permitted in the production of

Peter Constable
