Counting Heads

Mark Davis mark.davis at
Fri May 30 09:47:28 CEST 2003

There are any number of ways to interpret what it means to "handle"
RFC 3066. For example:

isValidRFC3066(String possibleRFC3066name)

en-US => true
en-scouse => true
en-US-Davis => false
Mark-Davis => false
x-Mark-Davis => true
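
A minimal sketch of such a check, in Java. The three tables are
hypothetical stand-ins for the full ISO 639, ISO 3166-1, and IANA
lists, and the rules encode only what the examples above imply, not a
complete reading of RFC 3066:

```java
import java.util.Locale;
import java.util.Set;

final class Rfc3066Validator {
    // Hypothetical, deliberately tiny tables (stored in lower case).
    // Real ones would hold the full ISO 639, ISO 3166-1, and IANA lists.
    private static final Set<String> LANGUAGES = Set.of("en", "fr");
    private static final Set<String> COUNTRIES = Set.of("us", "fr");
    private static final Set<String> REGISTERED = Set.of("en-scouse");

    // Syntax only: a primary subtag of 1-8 letters, then any number of
    // subtags of 1-8 letters or digits, separated by hyphens.
    static boolean isWellFormed(String tag) {
        return tag.matches("[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*");
    }

    static boolean isValidRFC3066(String possibleRFC3066name) {
        if (possibleRFC3066name == null || !isWellFormed(possibleRFC3066name)) {
            return false;
        }
        // Tags are compared case-insensitively.
        String tag = possibleRFC3066name.toLowerCase(Locale.ROOT);
        String[] subtags = tag.split("-");
        if (subtags[0].equals("x")) {
            return subtags.length > 1;          // private use: x-anything
        }
        if (REGISTERED.contains(tag)) {
            return true;                        // registered as a whole tag
        }
        if (!LANGUAGES.contains(subtags[0])) {
            return false;                       // unknown primary subtag
        }
        if (subtags.length == 1) {
            return true;                        // bare language code
        }
        // Assumption taken from the examples: anything beyond
        // language-country must be registered to count as valid.
        return subtags.length == 2 && COUNTRIES.contains(subtags[1]);
    }
}
```

With these toy tables, the five examples above come out as listed.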

startsWithRFC3066(String possibleRFC3066name)

en-US => true
en-scouse => true
en-US-Davis => true
Mark-Davis => false
x-Mark-Davis => true
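
One way to get this behavior is to drop trailing subtags until what is
left is valid. A rough sketch, assuming the validity check is passed in
as a predicate (the extra parameter and the longestValidPrefix helper
are illustrative additions, not part of the signature above):

```java
import java.util.function.Predicate;

final class Rfc3066Prefix {
    /**
     * Returns the longest leading run of subtags that the supplied
     * validity check accepts, or null if no prefix is accepted.
     */
    static String longestValidPrefix(String name, Predicate<String> isValid) {
        if (name == null) {
            return null;
        }
        String candidate = name;
        while (!candidate.isEmpty()) {
            if (isValid.test(candidate)) {
                return candidate;
            }
            int cut = candidate.lastIndexOf('-');
            if (cut < 0) {
                break;                          // no subtags left to drop
            }
            candidate = candidate.substring(0, cut);
        }
        return null;
    }

    static boolean startsWithRFC3066(String name, Predicate<String> isValid) {
        return longestValidPrefix(name, isValid) != null;
    }
}
```

Returning the prefix itself, rather than only a boolean, lets a caller
fall back from en-US-Davis to en-US when it needs something to act on.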

getDisplayName(String RFC3066name, String inRFC3066name)

en, en => English
fr, en => French
fr, fr => français
en, fr => anglais
en_US, en => English (US)
en_US, fr => anglais (États-Unis)
Mark-Davis, en => Mark-Davis // either: return the code if no display name
Mark-Davis, en => ERROR // or: throw exception if not isValidRFC3066

NOTE: we've found it much better to just return the input string if we
don't have a display name; otherwise code doesn't handle new
registrations gracefully.
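
A small sketch of the return-the-input behavior, assuming a
hypothetical in-memory name table (a real implementation would read
from locale data; the table and its key format are made up for
illustration):

```java
import java.util.Locale;
import java.util.Map;

final class DisplayNames {
    // Hypothetical table keyed by "tag|displayLanguage", in lower case.
    private static final Map<String, String> NAMES = Map.of(
        "en|en", "English",
        "fr|en", "French",
        "fr|fr", "fran\u00e7ais",   // "français", escaped to stay ASCII-safe
        "en|fr", "anglais");

    static String getDisplayName(String RFC3066name, String inRFC3066name) {
        String key = (RFC3066name + "|" + inRFC3066name).toLowerCase(Locale.ROOT);
        // No entry: hand back the tag unchanged instead of failing, so a
        // newly registered tag still displays as something.
        return NAMES.getOrDefault(key, RFC3066name);
    }
}
```

Here getDisplayName("Mark-Davis", "en") gives back "Mark-Davis", and a
tag registered after the table was built degrades the same way.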


getDocumentsMatching(String pattern, String inRFC3066name)

openSpellCheckerFor(String RFC3066name)
// typically will only return a spell checker for a fraction of the
// possible languages.
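
A sketch of how that might look to a caller, assuming an Optional
return and a fallback that drops trailing subtags (both are my
assumptions, as are the placeholder SpellChecker class and the
two-entry registry):

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class SpellCheckers {
    /** Hypothetical placeholder for a real spell checker. */
    static final class SpellChecker {
        final String language;
        SpellChecker(String language) {
            this.language = language;
        }
    }

    // Hypothetical registry: only two languages have a checker at all.
    private static final Map<String, SpellChecker> CHECKERS = Map.of(
        "en", new SpellChecker("en"),
        "fr", new SpellChecker("fr"));

    static Optional<SpellChecker> openSpellCheckerFor(String RFC3066name) {
        String tag = RFC3066name.toLowerCase(Locale.ROOT);
        while (true) {
            SpellChecker checker = CHECKERS.get(tag);
            if (checker != null) {
                return Optional.of(checker);
            }
            int cut = tag.lastIndexOf('-');
            if (cut < 0) {
                return Optional.empty();        // nothing for this language
            }
            tag = tag.substring(0, cut);        // e.g. en-us -> en
        }
    }
}
```

An empty result for a tag like csb is the normal case here, not an
error, which is why it is modeled as Optional rather than an exception.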

►  “Eppur si muove” ◄

----- Original Message ----- 
From: "Doug Ewell" <dewell at>
To: "John Cowan" <jcowan at>
Cc: <ietf-languages at>
Sent: Thursday, May 29, 2003 22:55
Subject: Re: Counting Heads

> John Cowan <jcowan at reutershealth dot com> wrote:
> >> You're going to need lookup tables anyway.  ISO 639-1 and 639-2
> >> language codes, IANA-registered language codes, and ISO 3166-1
> >> country codes all have to be stored in some kind of table, at least
> >> so applications can tell when an invalid one is being used.
> >
> > Well, that's probably true for applications that apply tags, but
> > probably false for applications that interpret them, since such
> > applications will only handle a small subset of the thousands of
> > tags already on the list.
>
> Hmm, "handle"...  I'm sure most, maybe all, applications will NOT do
> anything special with every single language tag (e.g.
> voice synthesis, locale matching), but I would expect them to
> *recognize* all the tags.
>
> For example, they should know that "ba" and "bal" and "ban" (etc.)
> are valid RFC 3066 tags, but not "bad" or "bag" (or "bak", which is
> valid ISO 639-2 but superseded by "ba").
>
> Can an application really "support" or "adhere to" RFC 3066 while
> only recognizing one or two tags?
>
> Of course, I'm not counting apps that haven't kept up with the
> additions to the list.  I doubt there are many programs today that
> support Kashubian "csb".  Heck, with Thursday's new IANA
> registrations, even my LTag application isn't up-to-date any more.
>
> -Doug Ewell
>  Fullerton, California
