A proposed solution for descriptions (was: Re: ISO 639 - New item
approved - N'Ko)
Doug Ewell
dewell at adelphia.net
Sun Jun 11 06:13:27 CEST 2006
Mark Crispin <mrc at CAC dot Washington dot EDU> wrote:
> The problem is that you guys are trying to resolve conflicting desires
> into a single name. Long experience tells me that this doesn't work,
> and ultimately forces the registry into wretched compromises that
> displease everybody.
Richard Ishida <ishida at w3 dot org> wrote:
> In the case of the actual registry, there currently is no N'Ko ASCII
> text, and one would have to type N&#x2019;Ko to get a match, knowing
> the right code point to use, and how to represent that as an NCR. You
> cannot google that by typing in N'Ko. I don't think that situation is
> very helpful to the average user.
Originally I was opposed to adding new Description values to solve this
problem, but Mark's and Richard's arguments have thoroughly convinced me
that this is necessary, and isn't a slippery slope that would lead to
dozens of Description strings for every subtag. I stand corrected, and
no, I don't mind being called a flip-flopper.
I hereby propose some changes to the Description fields of 28 existing
records, based on the following issues that presented themselves more or
less in this order.
1. With the addition of N'Ko the language, the Registry now has 14
subtag records with Description fields that include a non-ASCII
character (and therefore a hex NCR). I propose that for each of these,
a corresponding ASCII-only Description be added. Example: "N’Ko"
will be joined by "N'Ko". This applies not only to apostrophes, but to
all non-ASCII characters such as accented letters: "Volapük" will be
joined by "Volapuk". This solves most of the problem described by
Richard.
2. Conversely, those subtags that have a Description with an ASCII
apostrophe should have a corresponding Description added with the
appropriate non-ASCII directional apostrophe or modifier letter.
Example: "Mi'kmaq" will be joined by "Miʼkmaq". This should
answer the concerns of Michael and others that a Description in "the
correct characters" be available for all subtags.
3. A few names (Gwich'in, Ge'ez) currently have the *wrong* non-ASCII
apostrophe. I propose that these be changed to a more appropriate
character, as well as adding the pure-ASCII equivalent. Example:
"Gwich´in" will be deleted and two new Description fields,
"Gwichʼin" and "Gwich'in", will be added. This also answers a
concern raised by Michael.
4. Some subtags were found to have a Description with a second name in
parentheses, which is really an alternate name rather than a qualifier
of the first name. In the case of script subtag "Hano", the Description
"Hanunoo (Hanunóo)" already does what we are trying to achieve: it
provides ASCII and non-ASCII equivalents for the same name. This should
be replaced by two new Description fields, "Hanunoo" and "Hanunóo".
5. Likewise for a Description like "Lepcha (Róng)", it doesn't
make sense to repeat the "Lepcha" part simply to provide an ASCII and
non-ASCII version of "Róng". What would make sense would be to split
this into three Descriptions: "Lepcha", "Róng", and "Rong".
6. For that matter, any Description fields with an alternate name in
parentheses (not a qualifier) should really be split into multiple
Descriptions, regardless of whether non-ASCII characters are present.
Example: "Falkland Islands (Malvinas)" should be split into "Falkland
Islands" and "Malvinas". This is what we did with language subtags,
whose alternate names are separated by semicolons in ISO 639: we
converted them to multiple Description fields. I propose that we do
the same, consistently, for scripts and regions as well.
Note that items 4 through 6 have no effect on Description fields where
the parenthesized portion acts as a qualifier to the unparenthesized
portion. For example, "Cyrillic (Old Church Slavonic variant)" would
NOT be split into "Cyrillic" and "Old Church Slavonic variant" since
this would make no sense, and would give "Cyrl" and "Cyrs" the same
Description.
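
As a rough illustration of items 4 through 6, the split could be sketched as below. This is only a sketch: the function name split_description and the is_qualifier flag are hypothetical, and the decision whether a parenthesized part is an alternate name or a qualifier is assumed to be made by a human reviewer, not by the code.

```python
import re

# Matches "Main name (parenthesized part)" with a single trailing group.
PAREN = re.compile(r"^(?P<main>.+?)\s*\((?P<inner>[^()]+)\)$")


def split_description(description, is_qualifier):
    """Split 'Name (Alternate)' into separate Description values.

    is_qualifier is a reviewer's judgment (hypothetical parameter):
    True means the parenthesized part qualifies the name and the
    Description must be left alone.
    """
    match = PAREN.match(description)
    if match is None or is_qualifier:
        return [description]
    return [match.group("main"), match.group("inner")]


print(split_description("Falkland Islands (Malvinas)", False))
print(split_description("Cyrillic (Old Church Slavonic variant)", True))
```

The first call yields two Description values; the second returns the original string unchanged, so "Cyrl" and "Cyrs" keep distinct Descriptions.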
7. Finally, getting back to the apostrophe issue, it appears that the
language Amis, represented by the grandfathered tag "i-ami", should not
have an apostrophe at all. This was listed as 'Amis in the RFC 1766
registration form dating back to 1999, and so it was copied that way to
the initial RFC 3066bis Registry, but apparently this was a typo or
editing error. I propose changing this to "Amis".
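
For item 1, the ASCII-only Description could in principle be derived mechanically from the existing value. The following is a minimal sketch, not a proposed tool: the function name ascii_description and the APOSTROPHES table are assumptions made for illustration, and the table covers only the three characters mentioned in this message. Any generated value would still need review by the list.

```python
import html
import unicodedata

# Hypothetical table: apostrophe-like characters that Unicode
# normalization does not reduce to the ASCII apostrophe.
APOSTROPHES = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u02BC": "'",  # MODIFIER LETTER APOSTROPHE
    "\u00B4": "'",  # ACUTE ACCENT
}


def ascii_description(description):
    """Return an ASCII-only variant of a Description value."""
    # Decode hex NCRs such as &#x2019; into real characters.
    text = html.unescape(description)
    # Replace apostrophe-like characters with the ASCII apostrophe.
    text = "".join(APOSTROPHES.get(ch, ch) for ch in text)
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Raise UnicodeEncodeError rather than silently dropping
    # any character this sketch does not know how to handle.
    stripped.encode("ascii")
    return stripped


print(ascii_description("N&#x2019;Ko"))
print(ascii_description("Volap&#xFC;k"))
```

Raising an error on anything left over is deliberate: a character the table does not cover should be looked at by a person, not guessed at.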
In a separate mail I will present proposed registration forms for all 28
subtags that are affected in one way or another by these issues. They
are severable; each should be considered and discussed by the group on
its own merits. We aren't really constrained by time on this, but we
should keep the discussion moving so that the appropriate changes (as
agreed by the list) can be made to the Registry.
--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/