gc-ao1990 request - Galician Wikipedia

Luc Pardon lucp at skopos.be
Fri May 29 13:19:46 CEST 2015


OK, now we're talking. Thanks Michael.

First things first: whatever you may think, and speaking for myself: if
I was "beating you up" at all, it was certainly not "for not thinking as
I do". That's what debate is for. I was frustrated because you weren't
addressing the counter-arguments that were given to you, i.e. because
you were not listening. Hence the "brick wall". I could quote from your
messages etc. but you're listening now, so I'll leave it at that.

Next, let's go straight at the heart of the matter. You say:

> I’m not interested in opening up the floodgates to registering things just because they exist.

Here we have the fundamental issue, the root cause of all this mess.

You are a linguist, I am a technician, and we're on different sides of
the looking glass.

You seem to think of the Registry as a kind of Dewey system for
languages, an "Ethnologue bis", or whatever. In short: a taxonomy. From
that point of view, it makes sense not to "open up the floodgates",
because registering everything that comes your way defeats the very
purpose of a taxonomy. The whole idea is to group similar things
together to get a grip on it all, and if your groups are too small,
you'll end up with groups of size one.

That does not change the fact that the groups must be able to
accommodate all things that the taxonomy is intended for. E.g. for every
single book out there, it must be possible to find an appropriate Dewey
number - otherwise a librarian cannot place the book on any of his
shelves and nobody will be able to read it.

This is where my side of the looking glass comes in. I'm sitting in a
world full of computers, and they are chock-full of documents, and my
job (i.e. of technicians in general) is to write programs that can
process these documents.

For the sake of this discussion, let's limit the "documents" to
websites, and the "programs" to screenreader software that converts the
text of a website into speech so as to make it accessible to the
blind. To do that, the screenreader must be able to determine the
language of the document, and that is precisely what BCP47 language tags
are intended for. The introduction to BCP47 makes that very clear.
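
To make that concrete, here is a minimal sketch in Python of how such a
screenreader might go about it. Nothing here is taken from a real
product: the VOICES table, the voice names and the pick_voice function
are made up for illustration, and the fallback (dropping trailing
subtags one at a time) is only a simplified cousin of the lookup scheme
in RFC 4647.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Records the lang attribute of the first <html> element, if any."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lower case.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            # A missing or empty lang attribute both count as "no tag".
            self.lang = dict(attrs).get("lang") or None


# Hypothetical table of installed speech voices, keyed by language tag.
VOICES = {
    "en": "english-voice",
    "gl": "galician-voice",
    "pt": "portuguese-voice",
}


def pick_voice(page):
    """Return a voice name for the page, or None if none can be chosen."""
    finder = LangFinder()
    finder.feed(page)
    if finder.lang is None:
        return None  # no language tag: nothing to go on
    # Try the full tag first, then drop trailing subtags one at a time.
    subtags = finder.lang.lower().split("-")
    while subtags:
        voice = VOICES.get("-".join(subtags))
        if voice is not None:
            return voice
        subtags.pop()
    return None
```

With this sketch, a page tagged "en-oxendict" falls back to the English
voice, while a page without any lang attribute yields None, i.e. the
program has nothing to go on.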

So: if there is no language tag, the page cannot be accessed by the
blind. This is exactly analogous to a book with no Dewey number.

That leads to the conclusion: EVERY web page out there MUST have a
language tag that fits it, otherwise the page is useless.

Now we can start beating each other over the head on the size of
"every", but that doesn't change the basic concept, which is
fundamentally different from yours: I see the Registry as a catalog and
I want it to be as large as reasonably possible; you see it as a
taxonomy and you want to keep it as small as reasonably possible.

You are fighting to prevent the floodgates from opening, but from my
point of view, they were already opened by whoever came up with the idea
of a tagging system that would be able to identify the languages of the
world. So conceptually, there is a slot for everything. Some slots may
not be filled (yet), but that's what the registration function is for.

Before you discard this as rubbish, please consider that the Registry
was set up by people on _my_ side of the looking glass, i.e. by
engineers, not by linguists.

Yes, them engineers did enlist the help of linguists, and that brings up
the question of what assistance is wanted from you. You bring that up
yourself, as you write:

> Maybe you think I should just rubber-stamp anything that comes in, but that hasn’t been my role in the past and I don’t think it’s the intention of the standard — otherwise why have a reviewer at all? 

That is reasonably easy to answer if we keep in mind that we (i.e. this
list) do not create or maintain a taxonomy, we essentially maintain a
catalog of existing "groups of things".

Applied to orthographies: we do not create spelling rules ourselves, we
simply maintain a catalog of all agreed-upon orthographies that exist
out there.

So what is (y)our role in this?

I think BCP47 spells that out reasonably clearly at the end of section
3.5, where it says:

> The purpose of the "Reference to published description" section
> in the registration form is to aid in verifying whether a language is
> registered or to which language or language variation a particular
> subtag refers. 

This is the entire point. Or two points actually: a) we don't want two
tags for the same thing, and b) given a document, we must be able to
find the one and only tag that fits it. These requirements are like the
two sides of the same coin.

For example, if Alice comes asking for a tag "pirate", we may ask
what the hell is "pirate" and how can we tell a document in "pirate"
apart from any other document out there?

So Alice will explain what she understands by "pirate". We will look in
our catalog for things with those characteristics, and we may be able to
tell her "oh, is that what you mean? Okay, Bob was already here for the
same thing, and there is already a tag "woodnleg" available". Or if she
replies "why, it is how pirates wrote", we may tell her that this is too
imprecise for future identification, because there were also Dutch,
Maniot, Saracen and Malaysian pirates around and we can't have one tag
covering them all.

And _that_ is where this list comes in. If Alice shows us a document
with treasure-finding instructions in "pirate", we ought to be able to
say no, this is just Oxford with some spelling mistakes, "en-oxendict"
will do just fine.

And this, I think, is the acid test to decide between approval and
rejection. If Alice can show us one single web page that she thinks
"untaggable", we can only reject her request if we can hand her an
alternative tag to use. If there is none, we _must_ approve her request.
Because _every_ document _must_ have a fitting tag.

Now, once we decide that the spelling is not yet in our dictionary, we
can start debating what code to assign to it. In the case of Oxford for
example, we must beat the full name into the straitjacket of BCP47
syntax, because that is part of our job too.
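
For what it's worth, those syntax constraints can be written down in a
few lines. The sketch below checks a candidate against the shape that
the BCP47 grammar allows for a variant subtag: 5 to 8 letters or digits,
or exactly 4 characters starting with a digit. The function name
fits_variant_syntax is made up for illustration, and the check covers
syntax only; it says nothing about whether a subtag is actually
registered.

```python
import re

# Shape of a variant subtag per the BCP47 grammar:
# 5 to 8 alphanumerics, or 4 characters of which the first is a digit.
VARIANT = re.compile(r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})")


def fits_variant_syntax(candidate):
    """True if the candidate has the shape of a BCP47 variant subtag."""
    return VARIANT.fullmatch(candidate) is not None
```

"oxendict" passes at exactly 8 characters, while a full name such as
"oxforddictionary" is too long and "oed" is too short.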

The important thing is this: while we may create a _code_ for OED, we
don't create OED itself, we only record its existence in our catalog of
spellings. And by doing so, we certainly don't approve _of_ OED, we
simply record the fact that some people out there have agreed to follow
some set of rules, and we have verified that the set is indeed
identifiable. That is all.

Once one accepts that point of view, everything else falls into place.

For example, you write:

> I just love being insulted on this forum. I get it a lot, too. And then people beat me up for not thinking as they do, or for using my own judgement. Which is what I’m supposed to do. Such a pleasure.

You may not believe this but I do understand your frustration.

However, could that be because we are all on this side of the looking
glass, pushing at the gates, and you are all alone on the other side,
struggling to keep them closed?

I've been on this list for several years now, and many times it was
"consensus minus one", the one being you. Has it ever occurred to you
that you may have misinterpreted what is expected of you? I'm really
asking, not trying to insult you.

Please do not dismiss this out of hand. After all, as I said, the
registry was conceived by engineers for use by engineers, and they
enlisted the help of a linguist to help build a tool for engineers.


So, what's next? You wrote:

> I asked you to please contact the user community for the prefix you think should be registered. Either you show that YOU need to use such a prefix, or that THEY need to, or that SOMEBODY does, or I just assume that you are doing this on theoretical grounds. 

Here is what I'm going to do. I will _not_ do what you ask. Instead, I
ask you to consider what I said above. The review period ends today, but
I'll give you time to reconsider until Monday or so. If your rejection
still stands then, I will start the appeal procedure.

You will probably see this as a threat, but I think it is important -
for you too - to get this issue cleared up once and for all, by learning
what exactly is the purpose of this Registry, and conversely what
exactly is expected of you, and by hearing it straight from the horse's
mouth at that.

Of course that's not the only reason why I won't do what you ask. There
are at least two other reasons. First, I do not think I need to prove a
personal need, so you may not ask me to do so as a precondition for
approval. Second, if you're not willing to change your point of view,
then I don't really expect that it would make any difference to you if I
showed you a statement from the editors that says "yes, we do want to
tag our website". You might still maintain that one site is not enough,
that this journal doesn't have enough readership, that gl-ao1990 is not
official etc. This is why I asked you what you want to hear from the
editors in order to overcome your reluctance. Again, if you see the
Registry as a Taxonomy, it makes sense for you to be reluctant. But I
think that this is misguided. Please reconsider.


Thanks for listening,

Luc Pardon



More information about the Ietf-languages mailing list