New Last Call: 'Tags for Identifying Languages' to BCP

Doug Ewell dewell at
Sun Dec 19 05:37:23 CET 2004

Bruce Lilly <blilly at erols dot com> wrote:

>> If you can write a reasonable "grandfathered" production in ABNF that
>> will allow this set of tags and no others, such that the ABNF can be
>> used without also referring to the prose, then I salute you.
> If there really are only 24 items of less than 11 octets each,
> a trivial solution is to simply list them (with the usual ABNF
> syntax) as literal strings.  That should take no more than a
> half-dozen lines.

Listing the 24 literal strings doesn't seem like a particularly elegant
solution.

Look, RFCs 1766 and 3066 both had ABNF that was insufficient to describe
the range of valid language tags, and AFAIK they were not greatly
criticized for this.  Here is the *entire formal syntax* from RFC 3066,
copied and pasted directly:

-----begin pasted material-----
   The syntax of this tag in ABNF [RFC 2234] is:

    Language-Tag = Primary-subtag *( "-" Subtag )

    Primary-subtag = 1*8ALPHA

    Subtag = 1*8(ALPHA / DIGIT)
-----end pasted material-----

Would anyone conclude from this that "a-b-c" or "xyz-1-2-3-44444444" or
such were valid RFC 3066 language tags?  They would not, because the
text of RFC 3066 is very clear about the additional restrictions that go
hand-in-hand with the ABNF.  The same is true for RFC 3066bis.
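
To make the point concrete, here is a rough Python sketch (not part of any RFC) that transliterates the three productions above into a regular expression and checks tags against the ABNF alone. The names PRIMARY_SUBTAG, SUBTAG, LANGUAGE_TAG and matches_abnf are invented for this illustration, and the check deliberately ignores every restriction stated in the prose of RFC 3066.

```python
import re

# Regex transliteration of the three RFC 3066 ABNF productions quoted above.
# ALPHA is A-Z / a-z and DIGIT is 0-9, as defined in RFC 2234.
PRIMARY_SUBTAG = r"[A-Za-z]{1,8}"
SUBTAG = r"[A-Za-z0-9]{1,8}"
LANGUAGE_TAG = re.compile(rf"{PRIMARY_SUBTAG}(?:-{SUBTAG})*")


def matches_abnf(tag: str) -> bool:
    """Return True if tag matches the ABNF alone, ignoring the prose rules."""
    return LANGUAGE_TAG.fullmatch(tag) is not None


# The last entry uses an underscore, which the grammar does not allow.
for tag in ["en-US", "a-b-c", "xyz-1-2-3-44444444", "en_US"]:
    print(tag, matches_abnf(tag))
```

Under these assumptions, "a-b-c" and "xyz-1-2-3-44444444" pass the purely syntactic check just as "en-US" does, so only the accompanying text rules them out.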

RFC 2231, which you have mentioned often in this thread, has the
following as part of its ABNF:

-----begin pasted material-----
   charset := <registered character set name>

   language := <registered language tag [RFC-1766]>
-----end pasted material-----

If this type of syntax specification is good enough for RFC 2231, why
wouldn't it be good enough here?

-Doug Ewell
 Fullerton, California
