Region subtags and orthographic variants (was: Re: registration requests re Portuguese)

Peter Constable petercon at
Wed Apr 15 18:57:38 CEST 2015

If one wants to indicate dialectal variations (accent, lexicon, grammar), then (e.g.) "pt-PT" may be useful. 

If one wants to indicate a particular orthographic variation, one may choose to use "pt-PT", but that is not a reliable indicator of orthographic choices in general usage. It may serve within a limited usage context (e.g. internal to some app or some managed corpus), but not in the general case. By contrast, "pt-ao1990" can be used as a reliable indicator of an orthographic variation in the general case, with no region subtag required.

Similarly, the variant subtag "pt-ao1990" is not going to be a reliable indicator of non-orthographic distinctions.

If one wants to indicate both dialectal and orthographic variations, then using both region and variant subtags, e.g. "pt-PT-ao1990", may be useful.
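
To make the three cases above concrete, here is a minimal Python sketch that composes a tag from a language subtag plus an optional region and optional variants. The helper name make_tag is hypothetical, and the sketch does no validation against the registry; it only shows that the region and the variant are independent choices.

```python
from typing import Iterable, Optional


def make_tag(language: str,
             region: Optional[str] = None,
             variants: Iterable[str] = ()) -> str:
    """Join subtags in BCP 47 order: language, then region, then variants.

    Hypothetical helper for illustration only; it does not check the
    subtags against the IANA Language Subtag Registry.
    """
    parts = [language.lower()]
    if region:
        # Region subtags are conventionally written in upper case.
        parts.append(region.upper())
    # Variant subtags are conventionally written in lower case.
    parts.extend(v.lower() for v in variants)
    return "-".join(parts)


# Dialectal variation only.
print(make_tag("pt", region="PT"))                       # pt-PT
# Orthographic variation only.
print(make_tag("pt", variants=["ao1990"]))               # pt-ao1990
# Both pieces of information.
print(make_tag("pt", region="PT", variants=["ao1990"]))  # pt-PT-ao1990
```

Leaving out either argument simply omits that subtag, so a tag says only as much as the tagger knows or cares to assert.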

I get the impression that you two are in violent agreement.


-----Original Message-----
From: Ietf-languages [mailto:ietf-languages-bounces at] On Behalf Of Doug Ewell
Sent: Wednesday, April 15, 2015 8:32 AM
To: ietf-languages
Subject: Re: Region subtags and orthographic variants (was: Re: registration requests re Portuguese)

Yury wrote:

> When marking content ' cases where it is desirable to indicate 
> the language used in an information object' [rfc5646], specifically in 
> cases where the distinction are made per the orthography standards 
> (e.g., 'pt' case), the 'region' element is unnecessary (extraneous) 
> either in 'prefix' of this list forms or in rfc5646-conforming 
> 'langtags' themselves.

Yury, you are not reading what I and others have written. I will try only one more time.

There are different aspects of language usage that might need to be identified in a BCP 47 tag. These include vocabulary choice, grammar, spoken accent, orthography, and many more.

Many of these aspects, or varieties, are regional in nature. That is, they are commonly associated with a given geographical region.

In BCP 47, region subtags are the way to identify language varieties that can be associated with a "region" as defined by ISO 3166 or UN M.49. Other varieties are identified with a variant subtag.

There is NOT guaranteed to be a relationship between (a) regional varieties of language usage and (b) choice of orthography.

Because of this, it may be desirable to indicate both regional variety and orthography in the same BCP 47 tag. The way to do this is with a tag that includes both a region subtag and a variant subtag.

An example, using currently defined subtags, is "de-AT-1996". This tag means "German, as used (spoken, written) in Austria, using the 1996 orthography."

It is not ALWAYS necessary to include either the region subtag 'AT' or the variant subtag '1996'. But it IS necessary to include both of them if both pieces of information are important to describe the content.
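
As an illustration of the point above, the following Python sketch splits a tag into its language, script, region, and variant parts, so that a consumer can see which pieces of information a given tag actually asserts. It is a deliberately simplified reading of the RFC 5646 syntax: the name parse_tag and the Tag type are hypothetical, and extlangs, extensions, private-use subtags, and grandfathered tags are not handled.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Tag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]


# Simplified subtag shapes; not the complete RFC 5646 grammar.
_LANGUAGE = re.compile(r"[A-Za-z]{2,3}")
_SCRIPT = re.compile(r"[A-Za-z]{4}")
_REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")
_VARIANT = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def parse_tag(tag: str) -> Tag:
    """Split a tag of the form language[-script][-region][-variant...]."""
    subtags = tag.split("-")
    if not _LANGUAGE.fullmatch(subtags[0]):
        raise ValueError(f"unsupported language subtag in {tag!r}")
    language = subtags[0].lower()
    rest = subtags[1:]

    script = region = None
    if rest and _SCRIPT.fullmatch(rest[0]):
        script = rest.pop(0).title()
    if rest and _REGION.fullmatch(rest[0]):
        region = rest.pop(0).upper()

    variants = []
    for subtag in rest:
        if not _VARIANT.fullmatch(subtag):
            raise ValueError(f"unsupported subtag {subtag!r} in {tag!r}")
        variants.append(subtag.lower())
    return Tag(language, script, region, tuple(variants))


for example in ("de", "de-AT", "de-1996", "de-AT-1996"):
    parsed = parse_tag(example)
    print(example, "region:", parsed.region, "variants:", parsed.variants)
```

In this sketch "de-1996" yields no region and "de-AT" yields no variants; only "de-AT-1996" carries both, which is why neither subtag can stand in for the other.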

Continuing to assert that the region subtag in such a tag is "unnecessary" or "extraneous," simply because there is also a variant subtag that indicates the orthography, does not make it so.

> In this specific case, you don't have any 'extlang', so you don't have 
> to over-specify in the 'Prefix' field [rfc5646, p.41].

I have no idea what extlangs have to do with this, and no idea what part of Page 41 you are looking at.

Doug Ewell | | Thornton, CO 🇺🇸

