Request to register private-use variant subtags

Doug Ewell doug at ewellic.org
Sat Apr 7 22:49:09 CEST 2012


Gordon P. Hemsley wrote:

> An important part of what I'm doing involves a step once-removed
> between the Registry and the display of the names.

OK, so this is a separate layer that you are adding. This will prove 
important to the discussion.

> As you well know, many entries in the Registry contain multiple values
> for the "Description" field. This may be because things have different
> names or whatever.

It is exactly for that reason. For example, "Spanish" and "Castilian" 
are the same language according to ISO 639-3.

> So the first step of what I'm doing involves
> deciding how to translate Descriptions into Names. (They are not the
> same thing—the registry has no concept of Name.)

To the extent that "names" are a different concept from "descriptions," 
the Registry doesn't encode "names" for such subtags because it does not 
attempt to encode or register their semantics. 'es' represents the 
language (Type field) that people generally refer to in English as 
"Spanish" or "Castilian" (Description fields). What that actually means, 
say, in terms of "how does this language differ from others," is up to 
the user of BCP 47, that is, the producer or consumer of a language tag 
that includes 'es'.

> In this process, Private Use subtags are handled specially. Since, as
> far as I can tell, such subtags have special semantics—in particular,
> that the Registry has no knowledge of their meaning—they do not
> receive a Name.

The Registry has no knowledge of the "meaning" of any subtag. The 
denotations—not just Description fields—of language, script, and region 
subtags are defined by ISO 639, 15924, and 3166 respectively, to the 
extent that those standards attempt to encode concepts instead of names. 
(Not all do.) The distinction between private-use and other subtags is 
one that you are creating. There is no difference in the Registry 
between the entries:

Type: region
Subtag: AA
Description: Private use
Added: 2005-10-16

and:

Type: region
Subtag: AC
Description: Ascension Island
Added: 2009-07-29

except that the former includes a Description field that contains the 
word "Private" while the latter does not. In fact, once you have started 
isolating subtags with "Private" from the others, there is really 
nothing to stop you from declaring script subtags 'Zxxx' and 'Zyyy' and 
'Zzzz' exceptional as well.
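
A minimal Python sketch of that point, assuming a deliberately simplified 
parser (the function name parse_record is hypothetical, and the folded 
continuation lines that the real Registry format allows are ignored here). 
It reads the two records quoted above and shows that they carry the same 
fields in the same order and differ only in their values:

```python
def parse_record(text):
    """Parse one 'Field: value' record into a list of (field, value) pairs.

    Simplifying assumption: every field fits on one line.
    """
    fields = []
    for line in text.strip().splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'Field: value' line: {line!r}")
        fields.append((name.strip(), value.strip()))
    return fields


AA = parse_record("""
Type: region
Subtag: AA
Description: Private use
Added: 2005-10-16
""")

AC = parse_record("""
Type: region
Subtag: AC
Description: Ascension Island
Added: 2009-07-29
""")

# Structurally the records are identical: same field names, same order.
same_shape = [name for name, _ in AA] == [name for name, _ in AC]

# Only the values differ; nothing marks the AA record as a special kind.
differing = [name for (name, a), (_, b) in zip(AA, AC) if a != b]
```

Any special handling of 'AA' therefore has to come from the application 
inspecting the Description text, not from the structure of the record.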

> In fact, they are excluded from my database as any
> "unregistered" subtag would be. As such, both a private-use subtag and
> an unregistered would be displayed literally in my system. Only
> subtags with corresponding Names in my database are processed before
> being displayed.

That is a distinction you have created, as seen above. And it causes 
other problems, as seen below.

> The problem comes when I want to test that the code is doing what I
> described above. In order to ensure that I get a clear and permanent
> separation between a subtag that gets a display name and a subtag that
> gets output literally, I use private use subtags—they are permanently
> reserved, so I don't run the risk of them accidentally getting an
> associated Name in the future.

You could perform this test against all five types of registered subtag 
(language, extlang, script, region, variant) by randomly generating a 
subtag value and checking the Registry, or your database, to ensure that 
the value isn't already registered, and iterating as necessary. (That's 
how random, guaranteed-nonexistent filenames are generated.) There's no 
need to have a predefined value that expressly means "nothing." That 
isn't what private-use subtags are for anyway.
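
The generate-and-check loop described above might look roughly like this 
in Python. This is only a sketch: REGISTERED_VARIANTS is a small 
hypothetical excerpt standing in for the full Registry (or a local 
database), and random_unregistered_variant is an illustrative name, not 
part of any existing library.

```python
import random
import string

# Hypothetical excerpt; a real test would load every registered variant
# from the current Registry or from the application's own database.
REGISTERED_VARIANTS = {"1901", "1996", "fonipa", "nedis", "rozaj", "valencia"}


def random_unregistered_variant(registered, rng=random, length=8):
    """Return a random variant-shaped subtag that is not in `registered`.

    Only the 5-to-8 alphanumeric form of a variant is generated; the
    four-character form that starts with a digit is left out for brevity.
    """
    if not 5 <= length <= 8:
        raise ValueError("this sketch only generates 5-8 character subtags")
    alphabet = string.ascii_lowercase + string.digits
    while True:
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # Iterate as necessary: retry if the value happens to be registered.
        if candidate not in registered:
            return candidate


test_subtag = random_unregistered_variant(REGISTERED_VARIANTS)
```

Because the check runs against whatever set of registered subtags is 
loaded at test time, the test stays correct even if new variants are 
registered later.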

> And this system works fine for language, region, and script subtags.
> It falls apart when it comes to variant subtags. There is no clear
> separation between a subtag that should have a display name and which
> should get output literally—they all potentially fall into the former
> category.

But claiming that private-use subtags should not have a "display name" 
while others should is your concept, not a BCP 47 concept.

> A subtag that "probably won't be registered" is not a
> concrete enough definition. Without formally and permanently reserving
> some variant subtags for private use, there will always be the
> possibility of any given subtag being registered as a real variant.
> (The word of those currently involved that they'd never approve such a
> subtag is not good enough. Things change.)

Just check dynamically to make sure the test subtag is still not 
registered.

> The only way to guarantee that a variant "will almost certainly never
> be registered" is to permanently and officially reserve it as such.

Nothing in the Registry, private-use subtags or anything else, is 
reserved or registered to mean nothing.

I'm a developer who has written BCP 47 applications, and I am currently 
working on a new one (though at present I have essentially zero time to 
devote to it). In the past I pre-processed the Registry, reformatting 
some things and adding my own assumptions, and it turned out that not 
only did that add a noticeable burden for me every time the Registry was 
updated, but it didn't even work because not everyone shares my 
assumptions. It turned out to be better to accept the Registry verbatim, 
and draw a clear and careful distinction between what it says and any 
additional knowledge or assumptions I might add.

I believe that's the boat you're in now. Private-use does not mean "this 
subtag has no meaning"; it means "there is assumed to be a private 
agreement under which this subtag has a certain meaning." Thinking of 
private-use as "no meaning" is an assumption you are adding.

Additionally, because it treats private-use subtags the same as unregistered 
ones, your process fails validity testing. A validating processor 
should accept the tag "qaa-Qaaa-QZ" as valid, and should reject the tag 
"eaa-Eaaa-EZ" as invalid because none of its subtags is registered. Your 
process would treat both tags as invalid, which is contrary to BCP 47.
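
A rough Python sketch of that validity check, under several stated 
assumptions: it handles only tags of the form language-script-region, the 
three sets are a tiny hypothetical excerpt rather than the real Registry, 
and the function names are illustrative. The private-use ranges are the 
ones the Registry itself lists ('qaa'..'qtz', 'Qaaa'..'Qabx', 'AA', 
'QM'..'QZ', 'XA'..'XZ', 'ZZ').

```python
# Hypothetical excerpt; results for other tags depend on the Registry
# snapshot that a real validator would load.
LANGUAGES = {"en", "es"}
SCRIPTS = {"Latn", "Zxxx", "Zyyy", "Zzzz"}
REGIONS = {"AC", "US"}


def in_range(subtag, low, high):
    """True if `subtag` is ASCII letters and lies within [low, high]."""
    return (subtag.isascii() and subtag.isalpha()
            and len(subtag) == len(low) and low <= subtag <= high)


def is_registered_language(subtag):
    s = subtag.lower()
    return s in LANGUAGES or in_range(s, "qaa", "qtz")


def is_registered_script(subtag):
    s = subtag.title()
    return s in SCRIPTS or in_range(s, "Qaaa", "Qabx")


def is_registered_region(subtag):
    s = subtag.upper()
    return (s in REGIONS or s in {"AA", "ZZ"}
            or in_range(s, "QM", "QZ") or in_range(s, "XA", "XZ"))


def is_valid(tag):
    """Validate a language-script-region tag against the excerpt above."""
    parts = tag.split("-")
    if len(parts) != 3:
        raise ValueError("this sketch only handles language-script-region")
    language, script, region = parts
    return (is_registered_language(language)
            and is_registered_script(script)
            and is_registered_region(region))


private_use_ok = is_valid("qaa-Qaaa-QZ")    # every subtag is registered
unregistered_ok = is_valid("eaa-Eaaa-EZ")   # none is in this excerpt
```

The private-use ranges are checked in exactly the same way as any other 
registered subtag, which is what separates them from unregistered values.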

For the reasons I've stated, and speaking partly as a programmer and 
partly as a BCP 47 Designated Expert, I think it would be a mistake to 
register a subtag of any type for the purpose "this subtag has no 
meaning" simply to solve a programming problem.

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell


