Phonetic orthographies

Fri Nov 10 22:17:51 CET 2006

From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Michael Everson

>>Perhaps the ISO 15924 RA would like to suggest an alternative solution to
>>its user community in view of the request for a solution?
>
> It's not the RA's job to do that, really. 

It *is* the RA's job to register tags that users want to use, and to service the user needs for which ISO 15924 was created. If the RA doesn't feel a particular user need should be met using the standard when users are suggesting that it should, then IMO the RA should be prepared to suggest where an alternative solution might lie, just as the ISO 639 JAC needs to be prepared to do.

> However, I (for my part) did suggest that the 
> following might be used:

Yes, but users are saying that these alone are not sufficient for their needs, and you have not provided a solution that covers the gap.

>>They may differ greatly from one another 
>>formally, but in terms of function they clearly 
>>form a group that unites them with one another 
>>but differentiates them from Latin practical 
>>orthographies in common use.
>
> ISO 15924 is based on form.

Well, let's consider this. Is Fraser a subset of Latin or a separate script? In terms of form, it is very clearly a subset of Latin, yet I believe I've heard you say it must be considered a separate script because of its unicameral behaviour. Phonetic transcriptions -- certainly those I'm familiar with -- are absolutely unicameral. (E.g. in Americanist, "a" and "A" represent distinct sounds.) So, by that line of reasoning, you ought equally to consider phonetic transcriptions separate scripts. I think we'd all agree that that's not where we want to go. But I suggest to you it ought to be enough to say that phonetic transcriptions based on Latin have some distinctive behaviour that warrants considering them a script variant.

>>But the functionality of phonetic transcriptions 
>>is clearly distinct, and the desirability for a 
>>user of getting content in phonetic 
>>transcription vs. common practical orthography 
>>is in general very real.
>
> That still does not mean that IPA, or UPA, or 
> Landsmålsalfabetet, or Webster's spelling, are 
> scripts other than Latin. Nor does it mean that 
> they belong to some collective variant of Latin

I think you are too swayed by an academic, graphological perspective and have lost sight of the fact that ISO 15924 exists NOT as a form of academic documentation but rather to serve practical IT purposes. (I find this very reminiscent of the es-americas issue: you opposed it because it didn't fit your understanding of dialectology, while missing the very real practical IT need.)

> I understand that you have a problem because of 
> the way that your parsing taxonomy works. I don't 
> see how that translates into changing the intent 
> of ISO 15924 into

So, let's revisit the intent:

"The codes were devised for use in terminology, lexicography, bibliography, and linguistics, but they may be used for any application requiring the expression of scripts in coded form. This International Standard also includes guidance on the use of script codes in some of these applications."

Again, you've got users saying that they have a need -- including in lexicography and linguistics -- to code Latin-based phonetic transcriptions as a script variant. The intent of the standard is to code just such things, and to provide usage guidance. Please encode "Latp", or please provide guidance as to how the practical need can be better met.

> What script is this in?
>
>	crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.
>
> It's Latin, isn't it?

Yes; and note the complete inappropriateness of

	Crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri.

The capitalization has just turned this content into some completely different "orthography" with no known usage. Clearly this is Latin, but with exceptional rules -- i.e. a distinct variant of Latin.
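
To make the point concrete, here is a minimal Python sketch. It assumes only what is said above, namely that case is distinctive in this kind of transcription; the helper names `sentence_case` and `same_ignoring_case` are made up for illustration and do not come from any standard or library. It shows how two routine operations that are harmless for ordinary Latin text either alter the content or conflate distinct content.

```python
# Illustration only: in a case-distinctive transcription, "c" and "C" are
# different symbols, so routine case handling is not content-neutral.
transcription = "crdiloetis kari da mza k'amatobden tu romeli iqo upro dzlieri."


def sentence_case(text: str) -> str:
    """Uppercase the first character, as a typical 'tidy-up' step might."""
    return text[:1].upper() + text[1:]


def same_ignoring_case(a: str, b: str) -> bool:
    """Case-insensitive comparison, as a typical search index might do."""
    return a.casefold() == b.casefold()


tidied = sentence_case(transcription)

# The tidy-up step has changed the content of the transcription...
print(tidied == transcription)

# ...while a case-insensitive match treats the two as identical.
print(same_ignoring_case(tidied, transcription))
```

Software can only avoid both steps if the content is labelled in a way that tells it the usual Latin case rules do not apply, which is the practical need being described.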

> I comprehend what you are describing. I don't 
> think that ISO standards should be, hm, abused in 
> this way.

This is not an abuse but a very reasonable and practical IT application. It can only be seen as an abuse if you insist on thinking of the intent of the standard as being to provide academic documentation of scripts, or if you find a much better way to engineer solutions to the IT needs. Again, the RA has not done the latter, so I must assume the RA is doing the former, which deviates from the intent of the standard.

> *Latp is no different than, say an ISO 
> 639 tag *enc, taken to be a variety of "eng" 
> 'English' designed for use by speakers of 
> varieties of "Commonwealth English" (en-GB, 
> en-IE, en-ZA, en-AU, en-NZ) which may share many 
> features and be difficult for speakers of other 
> varieties of English to understand. It would make 
> your filter much easier, but it would be the 
> wrong thing to do.

I think a much closer analogy would be the ISO 639 ID zh, which encompasses yue, cmn, etc. And ISO 639 does encode zh.

Peter