el-latn, ru-latn, and related possibilities

Tex Texin tex at yahoo-inc.com
Wed Oct 5 22:05:25 CEST 2005


Yes, you are right; I spaced out.
It should be zh-Latn-pinyin.
Thanks, Randy.

Understand what I mean, not what I write... (we need a code for that.)

Although, as I think about it, I am under the impression that phoneticians
often create their own characters. For some systems, the existing script
codes would be incorrect: such a system would require either a new script
code to accommodate the invented characters (or characters from more than
one script), or perhaps would be better served simply by a transliteration
code...

(But that is a question put forth as an assertion for you all to correct.)
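
To make the ordering question concrete, here is a rough Python sketch of
the language-script-region-variant layout as I understand it from the
RFC 3066bis draft. It is deliberately simplified (no extended-language
subtags, extensions, private-use subtags, or grandfathered tags), the
names parse_tag and fallbacks are made up for illustration, and "pinyin"
and "iso9r95" are only the hypothetical variants floated in this thread,
not values checked against any registry.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]


def parse_tag(tag: str) -> LangTag:
    """Split a tag into language[-script][-region][-variant...].

    Simplified on purpose: shape checks only, no registry lookup.
    """
    subtags = tag.split("-")
    language = subtags.pop(0).lower()
    if not re.fullmatch(r"[a-z]{2,3}", language):
        raise ValueError(f"unsupported language subtag in {tag!r}")

    script = region = None
    # A script subtag is exactly four letters, e.g. Latn or Hans.
    if subtags and re.fullmatch(r"[A-Za-z]{4}", subtags[0]):
        script = subtags.pop(0).title()
    # A region subtag is two letters or three digits, e.g. CN or 419.
    if subtags and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", subtags[0]):
        region = subtags.pop(0).upper()

    # Anything left must look like a variant: 5 to 8 alphanumerics,
    # or 4 characters starting with a digit (e.g. 1901).
    variants = []
    for s in subtags:
        if not re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", s):
            raise ValueError(f"subtag {s!r} is out of place in {tag!r}")
        variants.append(s.lower())
    return LangTag(language, script, region, tuple(variants))


def fallbacks(tag: str) -> List[str]:
    """List progressively less specific tags by dropping trailing subtags."""
    parts = tag.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]
```

Under these assumptions, parse_tag("zh-Latn-pinyin") and
parse_tag("zh-hans-cn-pinyin") both go through, while
parse_tag("zh-hans-pinyin-cn") raises ValueError because a region cannot
follow a variant. fallbacks("ru-Latn-iso9r95") walks down through
"ru-Latn" to "ru", which is the "degrees of specificity" idea in Peter's
message below.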


Tex Texin
Internationalization Architect,   Yahoo! Inc.
 
 


> -----Original Message-----
> From: Randy Presuhn [mailto:randy_presuhn at mindspring.com] 
> Sent: Wednesday, October 05, 2005 12:45 PM
> To: Tex Texin
> Subject: Re: el-latn, ru-latn, and related possibilities
> 
> 
> Hi -
> 
> pinyin is a latin-alphabet transcription, so zh-hans-pinyin
> (or zh-hant-pinyin) make no sense.  From the pinyin, there
> is no way to know whether something was written using hans
> or hant.
> 
> Randy
> 
> ----- Original Message ----- 
> From: "Tex Texin" <tex at yahoo-inc.com>
> To: "'Peter Constable'" <petercon at microsoft.com>;
>     "'IETF Languages Discussion'" <ietf-languages at iana.org>
> Sent: Wednesday, October 05, 2005 12:24 PM
> Subject: RE: el-latn, ru-latn, and related possibilities
> 
> 
> Guys, sorry to be the odd man out yet again, but we should 
> first run through all the use cases before deciding that 
> transliteration can be pushed down the stack. This argument 
> sounds to me more like a rationalization for continuing with 
> 3066bis than to really address the question.
> 
> Text to voice is important for accessibility. Identification 
> of the transliteration scheme would be a prominent 
> requirement and perhaps therefore ru-Latn is not sufficient 
> and should not be recommended as adequate.
> 
> Also, if we buy the argument that script was important enough 
> to break compatibility with lang-region, and to instead 
> associate script with language as lang-script-region, I would 
> think we would want transliteration to also be tied with 
> script and not go after region.
> 
> Something like zh-hans-pinyin-cn rather than zh-hans-cn-pinyin.
> 
> What am I missing?
> (yeah, yeah, marbles I know...)
> 
> 
> Tex Texin
> Internationalization Architect,   Yahoo! Inc.
> 
> 
> 
> 
> > -----Original Message-----
> > From: ietf-languages-bounces at alvestrand.no
> > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Peter 
> > Constable
> > Sent: Sunday, October 02, 2005 10:35 PM
> > To: IETF Languages Discussion
> > Subject: RE: el-latn, ru-latn, and related possibilities
> >
> >
> > > From: ietf-languages-bounces at alvestrand.no
> > > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of John.Cowan
> >
> >
> > > > The question that arises is as follows: just as de-de-1901 and all
> > > > the similar subtagging exists for German, might there be advantages
> > > > in registrations (or alternatively a mechanism which avoided the
> > > > need for registrations) which listed widely used transliterations
> > > > into Latin?
> > >
> > > We haven't yet settled whether (when we are fully in the RFC 3066bis
> > > regime) we should handle transliterations as simple variants (like
> > > historical orthographies, geographical and social dialects, and the
> > > like) or via the new extension machinery of RFC 3066bis, which is
> > > more work to set up but is more general-purpose.  In any case,
> > > someone would have to maintain a registry of transliterations,
> > > transcriptions, and orthographies: no one has taken on that job yet.
> >
> > IMO, these should be variants and not extensions. Treating them as
> > extensions would require a separate specification and entail that they
> > would not be supported in protocols and specifications that reference
> > RFC 3066bis unless specifically revised to reference the extension RFC
> > as well -- which would be an incredible pain. So, for instance, while
> > there's a reasonable likelihood that XML would be updated to reference
> > RFC 3066bis, so that those tags could be used for xml:lang, it's far
> > less likely that an extension RFC would be referenced by XML, meaning
> > that transliterations could not be distinguished in xml:lang.
> >
> > I'm completely convinced that transliterated text in some language
> > such as Russian can simply be treated as Russian-language data that is
> > written in an alternate written form. For some purposes, tagging it as
> > "ru" may be sufficient; for others, tagging it as "ru-Latn" may be
> > needed, and for some purposes tagging it as (say)
> > "ru-Latn-iso9r95" (assuming ISO 9:1995) may be appropriate.
> > This nicely allows for useful degrees of specificity, and is
> > completely adequate for indicating the distinguishing
> > characteristics of the language and written form of the data.
> >
> > There's absolutely no reason I can see for complicating this either
> > syntactically with an extension and its corresponding singleton, or
> > procedurally by requiring a completely separate registration process
> > to be set up. The only possible benefit of an extension would be the
> > possibility of creating some means for generatively creating tags
> > that allow for transliteration standards from some defined group of
> > sources, such as ISO, but I think the potential benefits are limited
> > and the cost high.
> >
> > My vote, therefore, would be to simply treat specific transliteration
> > schemes via the variant subtag as defined in RFC 3066bis.
> >
> >
> > Peter Constable
> >
> 