Phonetic orthographies
Michael Everson
everson at evertype.com
Sat Nov 11 13:02:39 CET 2006
At 13:17 -0800 2006-11-10, Peter Constable wrote:
>>>Perhaps the ISO 15924 RA would like to suggest
>>>an alternative solution to its user community
>>>in view of the request for a solution?
>>It's not the RA's job to do that, really.
>It *is* the RA's job to register tags that users
>want to use, and to service the user needs for
>which ISO 15924 was created.
Which needs were not, I should think, fixing
problems in parsing software having to do with
hierarchies which were not in existence when the
standard was being drafted. The user needs for
which ISO 15924 was created were to identify the
names of scripts with four-letter codes. I
recognize that you have a problem. You want to be
able to tag a run of text in some way so that,
for instance, voice software could read it out.
That's a nice application. I do not, however,
believe that this is a matter of "script code".
Because IPA uses the Latin script, it is a matter
of orthography, and therefore the solution
should reside in the realm of "language tag" as
other orthographic distinctions do.
>If the RA doesn't feel a particular user need
>should be met using the standard when users are
>suggesting that it should, then IMO the RA
>should be prepared to suggest where an
>alternative solution might lie.
The RA doesn't agree that bogus script codes
should be entered into the registry, and "Latp"
or "Ipaa" are both bogus. The script is Latin.
Its shape is Roman, not Fraktur or Gaelic or
anything else unrecognizable. I'm sorry if you
don't like it. I'm only one member of the RA.
I have suggested a set of orthographic tags which
as far as I can see could suffice. Oxford
spelling may be described as "en-GB-oed". IPA
transcription might just as easily be described
as "en-GB-fonipa".
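
To illustrate how software might make use of such a tag, here is a minimal Python sketch. It assumes that the orthography is carried as a subtag such as "fonipa" (the subtag suggested above). The function names are hypothetical, and the check is deliberately naive: it only looks for the subtag anywhere after the primary language subtag and does not validate the tag.

```python
def subtags(tag):
    """Split a language tag into lowercase subtags (tags are case-insensitive)."""
    return tag.lower().split("-")


def has_subtag(tag, wanted):
    """Return True if `wanted` occurs after the primary language subtag."""
    return wanted.lower() in subtags(tag)[1:]


# The two example tags from the text; "fonipa" is the assumed orthography subtag.
for tag in ("en-GB-oed", "en-GB-fonipa"):
    kind = "IPA transcription" if has_subtag(tag, "fonipa") else "other orthography"
    print(tag, "->", kind)
```
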
>Just as the ISO 639 JAC needs to be prepared to do.
>
> > However, I (for my part) did suggest that the following might be used:
>
>Yes, but users are saying these alone are not
>considered sufficient for the needs, and you
>have not provided a solution to that extent.
So you're saying that nothing but a bogus script
code will satisfy you. I don't see how this is
the fault of the RA. Some users of the UCS think
that only 5000 precomposed Tibetan syllables will
satisfy them. But that's not the fault of WG2 or
the UTC.
> > ISO 15924 is based on form.
>
>Well, let's consider this. Is Fraser a subset of Latin or a separate script?
It is a separate script.
>In terms of form, it is very clearly a subset of
>Latin, yet I believe I've heard you say it must
>be considered a separate script because of its
>unicameral behaviour.
Its origins are Latin, but its behaviour and its
shapes are not. Font variation (deviation from
Helvetica/Arial letterforms), for instance, is
either unheard of or extremely rare.
"Lower-cased" Fraser is *illegible* to Lisus.
Upper-cased IPA is not conventional (and some
letters do not have upper-case versions encoded),
but that does not mean that it is illegible, and
indeed as we have seen such a development is
natural as in African orthographies. Please also
note the use of capitalization in IPA text on
pages 51-52 of the 1949 IPA Handbook. Doubtless
there are other texts extant which make use of
this convention.
>Phonetic transcriptions -- certainly those I'm
>familiar with -- are absolutely unicameral.
crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.
Crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.
CRDILOETIS KARI DA MZA K'AMATOBDEN TU ROMELI IQO UPRO DZLIERI.
All three are Latin. But one is also something
other than Latin? I'm sorry. I don't accept that.
>(E.g. in Americanist, "a" and "A" represent
>distinct sounds.) So, by that line of reasoning,
>you ought equally to consider phonetic
>transcriptions separate scripts.
I think I know what a script is. I do not believe
that IPA is a separate script from Latin. It is
an orthography using Latin letters.
>I think we'd all agree that that's not where we
>want to go. But I suggest to you it ought to be
>enough to say that phonetic transcriptions based
>on Latin have some distinctive behaviour that
>warrants considering them a script variant.
No, because again it is a matter of orthography.
Some African orthographies began as IPA
transcriptions. As you are well aware, capital
forms of the "new" letters are quickly devised by
people as soon as they standardize the
transcription into orthography. Runs of text
which do not contain personal names or begin
sentences are therefore IDENTICAL with IPA
transcription. How does this merit a separate
script code?
>>That still does not mean that IPA, or UPA, or
>>Landsmålsalfabetet, or Webster's spelling, are
>>scripts other than Latin. Nor does it mean that
>>they belong to some collective variant of Latin
>
>I think you are too swayed by an academic,
>graphology perspective and have lost [sight] of
>the fact that ISO 15924 exists NOT as a form of
>academic documentation but rather to serve
>practical IT purposes. (I find this very
>reminiscent of the es-americas issue: you
>opposed it because it didn't fit your
>understanding of dialectology when you were
>missing the very real practical IT need.)
You may, but I (personally) devised ISO 15924 in
the first place, and edited it from beginning to
end, so I *might* be expected to know what it is
for. It is a standard for the identification of
the names of scripts. "Latin Phonetic" isn't the
name of a script. "Phonetic transcription" isn't
the name of a script. "IPA" doesn't trump the
hundreds of other phonetic transcriptions out
there and deserve its own script code while all
of the others do not. (The Spanish example is not
analogous.)
This is not only my opinion. The RA rejected a
proposal already for "Ipaa": "The IPA is a set of
Latin letters, and can be represented by Latn. It
is an orthography of Latin, not a script of its
own."
>Again, you've got users saying that they have a
>need -- including in lexicography and
>linguistics -- to code Latin-based phonetic
>transcriptions as a script variant.
I recognize that there is a need to identify runs
of text as IPA orthography. I do not accept that
the distinction is one of *script*; it is indeed
a distinction of *orthography*.
>The intent of the standard is to code just such
>things, and to provide usage guidance. Please
>encode "Latp", or please provide guidance as to
>how the practical need can be better met.
Your *wanting* them to be a script variant does
not *make* them a script variant. You have not
convinced me. You seem to want "Latp" to be some
sort of macro-script that would encompass Webster
and UPA and IPA and the rest in one family. But
"Latn" does this already, and Webster and UPA and
IPA are just orthographies.
>>What script is this in?
>>
>> crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.
>>
>>It's Latin, isn't it?
>
>Yes; and note the complete inappropriateness of
>
> Crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.
This is inappropriate in what way? It is natural.
It happens in African and North American
languages quite regularly, and you and I both
have often proposed to encode missing capital
letters to support such development.
>The capitalization has just turned this content
>into some completely different "orthography"
>with no known usage. Clearly this is Latin, but
>with exceptional rules -- i.e. a distinct
>variant of Latin.
So "crdiloetis kari da mza" is ambiguous as to
being Latp or Latn but "Crdiloetis kari da mza"
is not? This... forgive me... is preposterous, in
my view.
>>I comprehend what you are describing. I don't
>>think that ISO standards should be, hm, abused
>>in this way.
>
>This is not an abuse but a very reasonable and
>practical IT application. It can only be seen as
>an abuse if you insist on thinking of the intent
>of the standard as being to provide academic
>documentation of scripts, or if you find a much
>better way to engineer solutions to the IT
>needs. Again, the RA has not done the latter, so
>I must assume the RA is doing the former, which
>is deviating from the intent of the standard.
The RA has said that the distinction is one of
orthography and not one of script. I have
endeavoured to address the requirement by
proposing orthography tags. I am quite confident
that if you
>>*Latp is no different than, say, an ISO 639 tag
>>*enc, taken to be a variety of "eng" 'English'
>>designed for use by speakers of varieties of
>>"Commonwealth English" (en-GB, en-IE, en-ZA,
>>en-AU, en-NZ) which may share many features and
>>be difficult for speakers of other varieties of
>>English to understand. It would make your
>>filter much easier, but it would be the wrong
>>thing to do.
>
>I think a much closer analogy would be an ISO
>639 ID zh that encompasses yue, cmn, etc. And
>ISO 639 does encode zh.
I do not think you have understood what I wrote, but perhaps it is moot.
For my part, I do try to do my job with due
diligence, and I have proposed a set of
appropriate orthography subtags. Please
investigate the possibilities of making software
which is able to make use of such tags in order
to identify phonetic orthographies.
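
One way such software could select phonetic runs is sketched below, in Python. It assumes prefix-style matching of a language range against a tag (in the manner of basic filtering in RFC 4647) plus a check for the "fonipa" subtag. The function names and the sample runs are invented for illustration only.

```python
def matches_range(language_range, tag):
    """Prefix match: the range equals the tag or is a prefix ending at a hyphen."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def phonetic_runs(runs, language_range="*", variant="fonipa"):
    """Return the text of runs that match the range and carry the variant subtag."""
    return [
        text
        for tag, text in runs
        if matches_range(language_range, tag)
        and variant in tag.lower().split("-")[1:]
    ]


# Hypothetical tagged runs of text; the script is Latin in every case.
SAMPLE_RUNS = [
    ("en-GB", "north wind"),
    ("en-GB-fonipa", "nɔːθ wɪnd"),
    ("en-US-fonipa", "nɔɹθ wɪnd"),
]

print(phonetic_runs(SAMPLE_RUNS, "en-GB"))
print(phonetic_runs(SAMPLE_RUNS, "en"))
```

A filter for "en-GB" keeps matching ordinary en-GB text as before; only the variant subtag is needed to pick out the phonetic run, and no script code is involved.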
--
Michael Everson * http://www.evertype.com
More information about the Ietf-languages
mailing list