Phonetic orthographies

Michael Everson everson at
Fri Nov 10 20:52:19 CET 2006

At 10:49 -0800 2006-11-10, Peter Constable wrote:

>  > The ISO 15924 RA has already received and rejected an application for
>>  a script tag for IPA. It does not meet the criteria established in
>>  ISO 15924. ... This is not my view only. It was the view of the RA.
>>  It is of course recognized that a tagging mechanism is needed, but
>>  ISO 15924 script codes are not the way to do it.
>Perhaps the ISO 15924 RA would like to suggest an alternative solution to
>its user community in view of the request for a solution?

It's not the RA's job to do that, really. 
However, I (for my part) did suggest that the 
following might be used:

fonipa International Phonetic Alphabet
fonupa Uralic Phonetic Alphabet
fonweb Webster's phonetic respelling (i-macron = [aj] etc)
fonami Americanist phonetic tradition
fonlep Lepsius' Standard Alphabet
fonmal Landsmaalsalfabetet
fondan Danish dialect alphabet
fonnor Norwegian dialect alphabet

All of these are Latin-script orthographies which 
may be used to write any number of languages.

As an aside, consider too the following orthography tags:

monoton Greek Monotonic orthography
polyton Greek Polytonic orthography

>They may differ greatly from one another 
>formally, but in terms of function they clearly 
>form a group that unites them with one another 
>but differentiate them from Latin practical 
>orthographies in common use.

ISO 15924 is based on form. I understand that 
phonetic orthographies have a similar function, 
but they are still orthographies.

ISO 15924 distinguishes Latf and Latg from Latn 
because of form. ISO 15924 does not distinguish 
between Monotonic and Polytonic Greek because the 
former is just a subset of the latter, and the 
latter a superset of the former. IPA is a Latin 
alphabet with a lot of letters. Czech and 
Icelandic are Latin alphabets with lots of 
letters too.

>  > Personally I think this is bogus. Yes, there may be some unfamiliar
>>  letters in the extended alphabet. That depends greatly on the
>>  language. Look at the Finnish and Estonian examples in the 1949 IPA
>>  handbook. They hardly differ from standard orthography!
>But the functionality of phonetic transcriptions 
>is clearly distinct, and the desirability for a 
>user of getting content in phonetic 
>transcription vs. common practical orthography 
>is in general very real.

That still does not mean that IPA, or UPA, or 
Landsmålsalfabetet, or Webster's spelling, are 
scripts other than Latin. Nor does it mean that 
they belong to some collective variant of Latin, 
because to these can be added Cut Spelling and 
Leet and eni sort uv fonetik reespeling dhat 
igzists. I have a Cornish-German phrasebook here 
on my desk. It uses a Webster-like German-based 
respelling. Clearly this is a phonetic 
orthography. It's Latin.

>  > Latn-fonipa is no different from *Latp-fonipa in that case.
>True. However, having the ability to make a 
>distinction in the script subtag allows 
>processes to filter at a higher level. An 
>implementer might reasonably decide that, for 
>their application, variant-level distinctions 
>are unimportant (e.g. German in 1904 vs. 1996 
>orthography), while script-level distinctions 
>are (e.g. Azeri in Latin vs. Cyrillic 
>orthography). In those scenarios, phonetic 
>transcriptions would matter just as much as the 
>script-level distinctions, and assuming that the 
>implementer must treat them as variant-level 
>distinctions is forcing them to create 
>mechanisms to parse and process parts of the tag 
>that they only need consider for this one kind 
>of content and could otherwise ignore.
>It *does* solve something and would be very useful.
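
To make the parsing question concrete, here is a minimal sketch in Python of filtering on variant subtags rather than on a script code. It is an illustration only: the subtag splitting is deliberately simplified (it ignores extended language subtags, extensions and private-use subtags), the function names split_tag and is_phonetic are hypothetical, and the set holds just a few of the variant subtags suggested earlier in this message, which are proposals rather than registry data.

```python
# A few of the variant subtags suggested in this thread.
# Assumption: these are proposals, not entries from any registry.
PHONETIC_VARIANTS = {"fonipa", "fonupa", "fonami", "fonlep"}


def split_tag(tag):
    """Split a language tag into (language, script, variants).

    Simplified on purpose: handles language[-script][-region][-variant...]
    and nothing else.
    """
    parts = tag.lower().split("-")
    language = parts[0]
    rest = parts[1:]

    # A script subtag is exactly four letters (e.g. "latn", "cyrl").
    script = None
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        script = rest.pop(0)

    # Skip a region subtag: two letters or three digits.
    if rest and (
        (len(rest[0]) == 2 and rest[0].isalpha())
        or (len(rest[0]) == 3 and rest[0].isdigit())
    ):
        rest.pop(0)

    # Whatever remains is treated as variant subtags.
    return language, script, rest


def is_phonetic(tag):
    """True if any variant subtag is one of the phonetic orthographies."""
    _, _, variants = split_tag(tag)
    return any(v in PHONETIC_VARIANTS for v in variants)
```

Under these assumptions, ka-Latn-fonipa and ka-fonipa both match, while de-1996 and az-Cyrl do not. The cost is the one described in the quoted text: the set has to be extended whenever another phonetic orthography is tagged.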

I understand that you have a problem because of 
the way that your parsing taxonomy works. I don't 
see how that translates into changing the intent 
of ISO 15924 into

What script is this in?

	crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.

It's Latin, isn't it?

The language is Georgian. It's taken from the 
1949 IPA handbook; the first line of The North 
Wind and the Sun. I cheated once, by suppressing 
one character (because I can't put it in my 
e-mail). In the final word, I have written <dz>, 
but in the IPA handbook the old U+01BB is used.

With U+01BB or with dz (which is now recommended 
by the IPA anyway), it's just plain old Latin. 
This is not even remotely like Latf or Latg.

>  > And of course there are many more. Each of these orthographies
>>  is Latn, though.
>Think of it like creating a filter for your email 
>inbox, and suppose these were tags in the 
>subject field: you'd be creating a bunch of 
>rules, one for each of these, with a need to 
>keep adding rules as you discovered more and 
>more cases; but the alternative would be having 
>one tag, Latp, that was getting used with all of 
>these, allowing you to write your rule once and 
>never need to update it. That analogy should 
>give you a partial picture of how #2 could be 
>useful and solve a need.

I comprehend what you are describing. I don't 
think that ISO standards should be, hm, abused in 
this way. *Latp is no different than, say, an ISO 
639 tag *enc, taken to be a variety of "eng" 
'English' designed for use by speakers of 
varieties of "Commonwealth English" (en-GB, 
en-IE, en-ZA, en-AU, en-NZ) which may share many 
features and be difficult for speakers of other 
varieties of English to understand. It would make 
your filter much easier, but it would be the 
wrong thing to do.

I get to be the whipping boy for this, but the 
other members of the RA agreed with this 
assessment of a script tag for IPA. I'm sorry if 
you don't find it convenient. I offered a set of 
actual tags which could be used to describe a 
range of Latin phonetic orthographies, in the 
hope that it may assist you to solve the problem.

Michael Everson

More information about the Ietf-languages mailing list