A proposed solution for descriptions

Doug Ewell dewell at adelphia.net
Wed Jun 21 16:38:09 CEST 2006


Michael Everson <everson at evertype dot com> wrote:

>> It's clear that we aren't going to reach any common ground on the 
>> question of registering ASCII-only equivalents of names like "Bokmål"
>
> There was no reason to change that by adding an ASCII version.

It was stated frequently that it needs to be possible to search for 
these things using reasonable search tools, which cannot identify 
"Bokm&#xE5;l".
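
To make the search problem concrete, here is a minimal Python sketch. 
The sample line is an assumed registry-style record written out by hand, 
not copied from the registry file, and the variable names are only 
illustrative.

```python
import html

# Assumed sample record: the name is stored with a hex character reference.
raw = "Description: Bokm&#xE5;l"

# A plain substring search on the raw text misses the name.
found_in_raw = "Bokmål" in raw

# Decoding the character references first makes the same search succeed.
decoded = html.unescape(raw)
found_in_decoded = "Bokmål" in decoded

print(found_in_raw, found_in_decoded)
```

A tool that does not decode the references first only ever sees the raw 
form.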

>> or "Côte d'Ivoire",
>
> This should have an o-circumflex and a smart quote in it. If N'Ko gets 
> a smart quote

Others have argued that we need to follow the core standards, or risk 
slipping down a very slick slope.  And as you can see, that's exactly 
what we ended up doing.

>> nor on the question of splitting compound names like "Falkland 
>> Islands (Malvinas)" into separate Description fields.
>
> That was your idea and you withdrew it.

Or "Indus (Harappan)" or "Deseret (Mormon)" or "Han (Hanzi, Kanji, 
Hanja)".

>> 1.  adding a Description field with an ASCII apostrophe for every 
>> existing Description that contains a non-ASCII apostrophe-like 
>> character (punctuation, modifier letter, left-pointing, 
>> right-pointing, whatever), and
>
> Ugh.

The "smart" ones aren't going away.
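
For what it's worth, item 1 could be sketched roughly like this in 
Python.  The list of "apostrophe-like" characters and the function name 
are my own assumptions for illustration; nothing here is existing 
registry tooling, and the exact character list would need to be agreed.

```python
import html

# Assumed set of apostrophe-like characters: acute accent, left and right
# single quotation marks, modifier letter turned comma, modifier letter
# apostrophe.
APOSTROPHE_LIKE = "\u00B4\u2018\u2019\u02BB\u02BC"
_TO_ASCII = {ord(ch): "'" for ch in APOSTROPHE_LIKE}


def ascii_apostrophe_variant(description):
    """Return the Description with U+0027 substituted for any
    apostrophe-like character, or None if nothing needed replacing."""
    decoded = html.unescape(description)   # turn &#xB4; etc. into characters
    replaced = decoded.translate(_TO_ASCII)
    return replaced if replaced != decoded else None


for existing in ("Gwich&#xB4;in", "Ethiopic (Ge&#x2018;ez)", "Amis"):
    added = ascii_apostrophe_variant(existing)
    print("[RETAIN]", existing)
    if added is not None:
        print("[ADD]", added)
```

The existing Description is always kept as-is; the sketch only decides 
whether an extra ASCII-apostrophe Description would be added alongside it.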

>> 2.  removing the spurious apostrophe in "Amis".
>
> Good.

The only non-controversial duck in the pond.

>> Type: language
>> Subtag: gwi
>> Description: [RETAIN] Gwich&#xB4;in
>> Description: [ADD] Gwich'in
>
> I object. The grave is a *mistake*. That character is not used in 
> Gwich'in.

I assume it's not used in any language the way it's shown here.

> The correct character is U+02BC MODIFIER LETTER APOSTROPHE.

John Cowan replied:

> Fine.  Get the Library of Congress (as 639-2/RA) to fix it.

+1.  Especially since Håvard says the printed standard uses U+0027 
instead.

>> Type: script
>> Subtag: Ethi
>> Description: [RETAIN] Ethiopic (Ge&#x2018;ez)
>> Description: [ADD] Ethiopic (Ge'ez)
>
> I object. The correct character to use for this is U+02BB MODIFIER 
> LETTER TURNED COMMA

The "normative plain-text data file" on the ISO 15924/RA site:

http://www.unicode.org/iso15924/iso15924.txt.zip

as well as all four informative HTML pages:

http://www.unicode.org/iso15924/iso15924-codes.html
http://www.unicode.org/iso15924/iso15924-num.html
http://www.unicode.org/iso15924/iso15924-en.html
http://www.unicode.org/iso15924/iso15924-fr.html

all show U+2018.  Where am I supposed to find a source that uses U+02BB?

--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/



