Suppress-Script candidates (was: Re: frr, fy, ngo, tt)

John Cowan cowan at ccil.org
Wed Sep 27 17:09:11 CEST 2006


Doug Ewell scripsit:

> Lines where LTRU has no script and CLDR has one are not something we 
> need to spend a great amount of time worrying about, unless there is a 
> genuine concern that people are going to start writing, say, 
> "ig-Latn-NG" and it won't match with "ig-NG".

Unfortunately, it is *precisely* that concern that got Suppress-Script:
into RFC 4646 in the first place.  So that is what we must get right.
When people are confronted with a "Script" drop-down menu, their instinct
will be to choose the correct answer rather than leave it at the default,
so without adequate Suppress-Script: information the result will indeed be
"ig-Latn-NG".
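
To make the matching concern concrete, here is a minimal Python sketch of
prefix-style matching in the manner of RFC 4647 basic filtering, plus a
hypothetical helper that drops a script subtag when it equals the language's
Suppress-Script value. The table is illustrative only: the "tr" and "pa"
entries follow the LTRU values quoted in this thread, and the "ig" entry is
an assumption, not something taken from the registry. The helper also
assumes the script, if present, is the second subtag (no extlang).

```python
# Illustrative Suppress-Script table. "tr" and "pa" follow the values quoted
# in this thread; "ig" is a hypothetical entry used only for the example.
SUPPRESS_SCRIPT = {"tr": "Latn", "pa": "Guru", "ig": "Latn"}


def basic_filter_match(language_range: str, tag: str) -> bool:
    """Prefix matching: the range equals the tag, or is a prefix of it
    that ends on a subtag boundary (comparison is case-insensitive)."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def drop_suppressed_script(tag: str) -> str:
    """Remove the script subtag when it equals the language's
    Suppress-Script value; leave every other tag unchanged."""
    subtags = tag.split("-")
    # Simplification: a 4-letter second subtag is treated as the script.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        suppressed = SUPPRESS_SCRIPT.get(subtags[0].lower(), "")
        if suppressed.lower() == subtags[1].lower():
            del subtags[1]
    return "-".join(subtags)


if __name__ == "__main__":
    # The range "ig-NG" is not a prefix of the tag "ig-Latn-NG".
    print(basic_filter_match("ig-NG", "ig-Latn-NG"))
    # After dropping the redundant script, the same range matches.
    print(basic_filter_match("ig-NG", drop_suppressed_script("ig-Latn-NG")))
```

Under these assumptions the first call reports no match and the second
reports a match, which is the failure that Suppress-Script: is meant to
head off at tagging time.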

And I would be astonished if there wasn't historical Igbo writing with
the Arabic script, though it may be gone too long to worry about now.

> The ones where LTRU has one and CLDR has more than one are more 
> troubling to me, because it indicates we may have added a 
> Suppress-Script where we shouldn't have:
> 
> >lang mo  ltru Latn cldr Cyrl Latn
> >lang ms  ltru Latn cldr Arab Latn

Yes, I'd have to say we screwed up on those two.  Fortunately, Moldovan
is only a separate language for political reasons (it's really Romanian,
as even the Moldovan Academy of Science agrees), and "ms-arab" is pretty
thoroughly obsolete.

> >lang pa  ltru Guru cldr Arab Guru

On this one I think we have it right, and CLDR has conflated Panjabi
(aka Eastern Panjabi, also spelled "Pun-") with Lahnda, a macrolanguage
which encompasses among other languages Western Panjabi, which is written
with Arabic script.  The latter will be "lah-pnb" in 4646bis.

That said, all of Hindi, Urdu, Panjabi, and Lahnda form a dialect
continuum, with no sharp distinctions *except* the scripts.

> >lang tr  ltru Latn cldr Arab Latn

Here too, I think we got it right.  It's Ottoman Turkish ('ota') that
was written in Arabic script, but the distinctions between 'ota' and
'tr' are profound: not just orthography, but syntax and vocabulary too.
There are whole pages of Ottoman writings of which a modern Turk,
supposing he knew how to read Turkish in the Arabic script (no trivial
achievement) would recognize not one single word.
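
For anyone who wants to triage the comparison lines mechanically, the
following Python sketch parses lines of the form quoted above (with the
"> >" quote markers stripped) and sorts them into the two cases discussed in
this thread. The function names and category labels are hypothetical, and
the layout of a line where LTRU has no script ("lang xx ltru cldr Latn") is
an assumption, since no such line is quoted here.

```python
def parse_line(line: str):
    """Split 'lang mo  ltru Latn cldr Cyrl Latn' into
    (language, ltru_scripts, cldr_scripts)."""
    tokens = line.split()
    if (len(tokens) < 4 or tokens[0] != "lang"
            or tokens[2] != "ltru" or "cldr" not in tokens):
        raise ValueError(f"unrecognized comparison line: {line!r}")
    i = tokens.index("cldr")
    return tokens[1], tokens[3:i], tokens[i + 1:]


def classify(line: str):
    """Label a comparison line with one of four hypothetical categories."""
    lang, ltru, cldr = parse_line(line)
    if not ltru and cldr:
        label = "candidate"   # LTRU has no script, CLDR has one or more
    elif len(ltru) == 1 and len(cldr) > 1:
        label = "review"      # Suppress-Script may have been added wrongly
    elif ltru == cldr:
        label = "agree"
    else:
        label = "other"
    return lang, label


if __name__ == "__main__":
    # The four lines quoted in this thread.
    for line in [
        "lang mo  ltru Latn cldr Cyrl Latn",
        "lang ms  ltru Latn cldr Arab Latn",
        "lang pa  ltru Guru cldr Arab Guru",
        "lang tr  ltru Latn cldr Arab Latn",
    ]:
        print(classify(line))
```

All four quoted lines land in the "review" bucket; deciding which of them
are real mistakes still takes the kind of language-by-language judgment
given above.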

-- 
Your worships will perhaps be thinking          John Cowan
that it is an easy thing to blow up a dog?      http://www.ccil.org/~cowan
[Or] to write a book?
    --Don Quixote, Introduction                 cowan at ccil.org

