Request to register private-use variant subtags

Gordon P. Hemsley gphemsley at gmail.com
Sun Apr 8 00:41:53 CEST 2012


On Sat, Apr 7, 2012 at 6:07 PM, Doug Ewell <doug at ewellic.org> wrote:
> Gordon P. Hemsley wrote:
>
>>> In fact, once you have started isolating subtags with "Private" from
>>> the others, there is really nothing to stop you from declaring script
>>> subtags 'Zxxx' and 'Zyyy' and 'Zzzz' exceptional as well.
>>
>> And I have, in fact, done just that, along with 'Zinh', as well.
>
> You are really starting to be on your own at this point.

I don't see how this is an "us versus them" situation.

In the particular code I'm working with at the moment (which I again
assert should be irrelevant), an incoming language tag (e.g.
representing a spellchecker dictionary) is processed and then the
human-readable representation of that language tag is displayed to the
user. How would it help the user to display "Code for uncoded script"?
There would never be a spellchecker for an uncoded script, so why
would I include it? This doesn't seem all that strange to me....
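
As a rough illustration of that processing step (not the actual code under discussion), the sketch below turns a tag into a display string. The "Language (Region) / Script (Variant)" layout is taken from the test lines quoted further down in this message. The NAMES table, parseTag, and displayName are hypothetical names. The fallback to the raw subtag when no name is known is inferred from the 'qaz-Qaaz-QZ-qxqaaaaz' example, and the handling of a variant without a script is not shown anywhere in this thread, so the sketch simply leaves it out.

```javascript
// Hypothetical name table; a real one would be generated from the Registry.
const NAMES = {
  language: { en: "English" },
  script: { Cyrl: "Cyrillic" },
  region: { US: "United States" },
  variant: {},
};

// Simplified parser: assumes a well-formed language[-script][-region][-variant]
// tag and ignores extlang, extensions, and private-use sequences.
function parseTag(tag) {
  const [language, ...rest] = tag.split("-");
  const parts = { language };
  for (const sub of rest) {
    if (/^[A-Za-z]{4}$/.test(sub)) parts.script = sub;
    else if (/^([A-Za-z]{2}|\d{3})$/.test(sub)) parts.region = sub;
    else if (/^([A-Za-z\d]{5,8}|\d[A-Za-z\d]{3})$/.test(sub)) parts.variant = sub;
  }
  return parts;
}

function displayName(tag) {
  const p = parseTag(tag);
  // Use the known name, or fall back to the raw subtag when none is known.
  const name = (type) => NAMES[type][p[type]] || p[type];
  let out = name("language");
  if (p.region) out += ` (${name("region")})`;
  if (p.script) {
    out += ` / ${name("script")}`;
    if (p.variant) out += ` (${name("variant")})`;
  }
  return out;
}

console.log(displayName("en-Cyrl-US")); // English (United States) / Cyrillic
console.log(displayName("qaz-Qaaz-QZ-qxqaaaaz")); // qaz (QZ) / Qaaz (qxqaaaaz)
```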

>> In addition, the tests are being run to ensure that the correct name
>> is associated with the given language tag. Checking the list for
>> validity would add circular logic to the test and render it
>> moot.
>
> string bogusVariant = "";
> do
> {
>   bogusVariant = generateRandomVariant();
> }
> while (Exists(bogusVariant));
> TestNonExistentVariant(bogusVariant);

I am not just testing variant subtags. I'm testing language tags as a
whole. Here are a few example lines:

is(isc.getDictionaryDisplayName("en-Cyrl-US"),
   "English (United States) / Cyrillic",
   "'en-Cyrl-US' should display as 'English (United States) / Cyrillic'");

is(isc.getDictionaryDisplayName("qaz-Qaaz-QZ-qxqaaaaz"),
   "qaz (QZ) / Qaaz (qxqaaaaz)",
   "'qaz-Qaaz-QZ-qxqaaaaz' should display as 'qaz (QZ) / Qaaz (qxqaaaaz)'");

With just a single valid (non-private-use) subtag and a single
private-use subtag for each of language, region, script, and variant,
there are 54 possible combinations, plus additional tests for other
specific processing. I have generated these externally, and they
remain static. There is no reason for me to generate any of these on
the fly.
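
One way to arrive at 54, assuming the language subtag is always present while script, region, and variant may each be absent, valid, or private-use, is 2 x 3 x 3 x 3. The sketch below enumerates that space. The sample subtags are illustrative picks, buildTags is a hypothetical name, and 'qxqaaaaz' is only the placeholder from the test lines above, not a registered private-use variant.

```javascript
// One registered and one private-use sample per subtag type (illustrative).
// null means "subtag absent"; the language subtag is never absent.
const language = ["en", "qaz"];
const script = [null, "Cyrl", "Qaaz"];
const region = [null, "US", "QZ"];
const variant = [null, "fonipa", "qxqaaaaz"]; // 'qxqaaaaz' is a placeholder

function buildTags() {
  const tags = [];
  for (const l of language)
    for (const s of script)
      for (const r of region)
        for (const v of variant)
          // BCP 47 order: language-script-region-variant
          tags.push([l, s, r, v].filter((x) => x !== null).join("-"));
  return tags;
}

const tags = buildTags();
console.log(tags.length); // 54
```

A list like this only needs to be produced once and can then be stored alongside the expected display strings.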

>>> But claiming that private-use subtags should not have a "display
>>> name" while others should is your concept, not a BCP 47 concept.
>>
>> I'm aware of that. This discussion is not supposed to be about how I
>> decide what gets a "display name" and what doesn't. This is a
>> discussion about whether there should be a variant subtag that has the
>> equivalent purpose as a private-use language, region, or script
>> subtag. Anything about what that private use actually is is outside
>> the scope of this discussion, I think.
>
> Fair enough. I claim that, for the purposes for which BCP 47 envisions
> private-use subtags, it is perfectly reasonable for there to be no such
> thing as a private-use variant; that is what it intends the -x- mechanism to
> be used for. You have a right to define a private agreement that says 'qaa'
> means nothing, 'Qaaa' means nothing, etc., but I don't believe this
> justifies changing anything about BCP 47 or the Registry to support this
> particular testing model.

Whatever BCP 47 "envisions" is whatever is written in the
standard itself. It seems to me that there are no restrictions in BCP
47 that say what I can or cannot do with existing private use subtags
nor that prevent private use variant subtags from being added to the
registry. If I am wrong on this, please show me.

Again, what I plan to do with the private-use variant subtag—what my
"private agreement" will be—should have no bearing on whether private
use variant subtags are added to the Registry.

> Sorry, you've reached the wrong audience here.

This is the place to request registration of new subtags, is it not?

>>> Nothing in the Registry, private-use subtags or anything else, is
>>> reserved or registered to mean nothing.
>>
>> As we've previously established, there is a step in between the
>> Registry and what I am displaying. In that intermediate step, I
>> translate "private use" to "has no meaning"—something I am completely
>> entitled to do.
>
> Since you are working with your own database, derived from the Registry but
> excluding certain subtags, I suppose you could exclude any registered
> variant subtag of the forms 'invalid0' through 'invalid9' and use those for
> your testing. As you said, nobody can flat-out 100% guarantee that such
> subtags will never ever be registered, but it does seem pretty remote.

My database is generated directly from the Registry. Certain values
are manually overridden, but most of the content is the same.

There are some subtags that I have opted to not include in my
database, because they are in some ways 'meta' subtags. However, I
have not ADDED any additional subtags to my database, which is what
you are suggesting. My database is a strict subset of the Registry.
You are suggesting that it merely overlap with the Registry, which is
not something I am willing to do.
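
To make the "strict subset" point concrete, here is a minimal sketch, assuming Registry records have already been parsed into objects. The record shape, the tiny hand-written excerpt, and the names EXCLUDED_SCRIPTS and buildDatabase are all hypothetical. The excluded script subtags are the four mentioned earlier in this thread. The derived database only ever drops records and never adds any.

```javascript
// Tiny hand-written excerpt standing in for the parsed Registry (illustrative).
const registry = [
  { type: "language", subtag: "en", description: "English" },
  { type: "language", subtag: "qaa..qtz", description: "Private use" },
  { type: "script", subtag: "Cyrl", description: "Cyrillic" },
  { type: "script", subtag: "Zzzz", description: "Code for uncoded script" },
  { type: "script", subtag: "Zinh", description: "Code for inherited script" },
  { type: "region", subtag: "US", description: "United States" },
];

// 'Meta' script subtags left out of the derived database.
const EXCLUDED_SCRIPTS = new Set(["Zxxx", "Zyyy", "Zzzz", "Zinh"]);

// Filtering can only remove records, so the result is a subset by construction.
function buildDatabase(records) {
  return records.filter(
    (r) =>
      r.description !== "Private use" &&
      !(r.type === "script" && EXCLUDED_SCRIPTS.has(r.subtag))
  );
}

const database = buildDatabase(registry);
const isSubset = database.every((r) => registry.includes(r));
console.log(database.map((r) => r.subtag)); // [ 'en', 'Cyrl', 'US' ]
console.log(isSubset); // true
```

Adding 'invalid0' through 'invalid9' to such a database would break the subset property, since those records would have no counterpart in the Registry.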

>>> For the reasons I've stated, and speaking partly as a programmer and
>>> partly as a BCP 47 Designated Expert, I think it would be a mistake
>>> to register a subtag of any type for the purpose "this subtag has no
>>> meaning" simply to solve a programming problem.
>>
>> It would be a subtag for the purpose of "Private use", of which there
>> are several already. My particular private agreement need not be
>> formalized in the Registry.
>
> I see no other purpose for a private-use variant subtag other than to
> support this particular processing architecture and testing strategy.

Let's look at it from a different perspective: What if I had some
internal orthographic convention that I used (perhaps a phonetic
alphabet à la 'fonipa')? If I wanted to use a variant subtag to
represent that, I'd have to come here to register that. But my
orthographic convention isn't really useful to the world at large. Or
maybe it's proprietary.

My only other option would be to stick it at the end of the language
tag in a -x- private use block. But that could be used for anything!
Like, if I need to keep my 'mac' strings different from other strings,
I might use 'en-US-x-mac'. Or I might use it to keep track of whose
dictionary fork it is ('en-US-x-ghemsley'). Or any number of other
possible uses. I'd have no way to keep my (internal) orthographic
convention/variant separate from the arbitrary contents of the -x-
private use extension.

I don't think that's right.
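
The ambiguity can be seen in a small sketch that splits a tag at the 'x' singleton. splitPrivateUse is a hypothetical helper and handles only this one case. Everything after 'x' comes back as opaque subtags, with nothing to say whether a given one is a platform marker, an owner, or an orthographic convention.

```javascript
// Split a tag into the part before the 'x' singleton and the private-use
// subtags after it. Other extensions are not treated specially here.
function splitPrivateUse(tag) {
  const subtags = tag.split("-");
  const i = subtags.findIndex((s) => s.toLowerCase() === "x");
  if (i === -1) return { base: tag, privateUse: [] };
  return {
    base: subtags.slice(0, i).join("-"),
    privateUse: subtags.slice(i + 1),
  };
}

console.log(splitPrivateUse("en-US-x-mac"));
// { base: 'en-US', privateUse: [ 'mac' ] }
console.log(splitPrivateUse("en-US-x-ghemsley"));
// { base: 'en-US', privateUse: [ 'ghemsley' ] }
```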

> Sorry,
> you'll have to try to convince others on this list. You may have better luck
> there.

I _am_ talking to everyone on the list.

-- 
Gordon P. Hemsley
me at gphemsley.org
http://gphemsley.org/
http://gphemsley.org/blog/

