el-latn, ru-latn, and related possibilities
tex at yahoo-inc.com
Wed Oct 5 22:05:25 CEST 2005
Yes, you are right; I spaced out.
It should be zh-Latn-pinyin.
Understand what I mean, not what I write... (we need a code for that.)
Although, as I think about it, I am under the impression that phoneticists
often create their own characters. For some systems, script codes would be
incorrect and would require either a new script code to accommodate the
invented characters (or characters from more than one script) or perhaps
would be better served simply by a transliteration code...
(but that is a question put forth as an assertion for you all to correct)
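
To make the subtag ordering discussed in this thread concrete, here is a minimal sketch in Python. It assumes the simplified shape language[-script][-region][-variant...] from the RFC 3066bis drafts and ignores extlang, extensions, private use, and grandfathered tags. The names Tag, parse_tag, and prefixes are hypothetical helpers invented for illustration, and "pinyin" and "iso9r95" are used only because they come up in this thread, not because they are registered variants.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Tag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]


def parse_tag(tag: str) -> Tag:
    """Split a tag into language, script, region, and variant subtags.

    Simplified sketch: only language[-script][-region][-variant...] is
    handled, and subtags are checked by shape, not against a registry.
    """
    subtags = tag.split("-")
    if not re.fullmatch(r"[A-Za-z]{2,3}", subtags[0]):
        raise ValueError(f"bad language subtag in {tag!r}")
    language = subtags[0].lower()
    rest = subtags[1:]

    script = None
    region = None
    # Script: exactly four letters, directly after the language.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        script = rest.pop(0).title()
    # Region: two letters or three digits, after the optional script.
    if rest and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", rest[0]):
        region = rest.pop(0).upper()
    # Variants: 5-8 alphanumerics, or 4 characters starting with a digit.
    for subtag in rest:
        if not re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", subtag):
            raise ValueError(f"unexpected subtag {subtag!r} in {tag!r}")
    return Tag(language, script, region, tuple(s.lower() for s in rest))


def prefixes(tag: str) -> list:
    """List the tag and its progressively less specific prefixes."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
```

Under these assumptions, parse_tag("zh-Latn-pinyin") yields language "zh", script "Latn", and the single variant "pinyin", while parse_tag("zh-hans-pinyin-cn") raises ValueError because a region cannot follow a variant in this shape. prefixes("ru-Latn-iso9r95") gives the tag itself, then "ru-Latn", then "ru", which is one way to picture the degrees of specificity mentioned in the quoted message below.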
Internationalization Architect, Yahoo! Inc.
> -----Original Message-----
> From: Randy Presuhn [mailto:randy_presuhn at mindspring.com]
> Sent: Wednesday, October 05, 2005 12:45 PM
> To: Tex Texin
> Subject: Re: el-latn, ru-latn, and related possibilities
> Hi -
> pinyin is a latin-alphabet transcription, so zh-hans-pinyin
> (or zh-hant-pinyin) make no sense. From the pinyin, there
> is no way to know whether something was written using hans
> or hant.
> ----- Original Message -----
> From: "Tex Texin" <tex at yahoo-inc.com>
> To: "'Peter Constable'" <petercon at microsoft.com>; "'IETF
> Languages Discussion'" <ietf-languages at iana.org>
> Sent: Wednesday, October 05, 2005 12:24 PM
> Subject: RE: el-latn, ru-latn, and related possibilities
> Guys, sorry to be the odd man out yet again, but we should
> first run through all the use cases before deciding that
> transliteration can be pushed down the stack. This argument
> sounds to me more like a rationalization for continuing with
> 3066bis than to really address the question.
> Text to voice is important for accessibility. Identification
> of the transliteration scheme would be a prominent
> requirement and perhaps therefore ru-Latn is not sufficient
> and should not be recommended as adequate.
> Also, if we buy the argument that script was important enough
> to break compatibility with lang-region, and to instead
> associate script with language as lang-script-region, I would
> think we would want transliteration to also be tied with
> script and not go after region.
> Something like zh-hans-pinyin-cn rather than zh-hans-cn-pinyin.
> What am I missing?
> (yeah, yeah, marbles I know...)
> Tex Texin
> Internationalization Architect, Yahoo! Inc.
> > -----Original Message-----
> > From: ietf-languages-bounces at alvestrand.no
> > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Peter
> > Constable
> > Sent: Sunday, October 02, 2005 10:35 PM
> > To: IETF Languages Discussion
> > Subject: RE: el-latn, ru-latn, and related possibilities
> > > From: ietf-languages-bounces at alvestrand.no
> > > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of John.Cowan
> > > > The question that arises is as follows: just as de-de-1901 and all
> > > > the similar subtagging exists for German, might there be
> > > > advantages in registrations (or alternatively a mechanism which
> > > > avoided the need for registrations) which listed widely used
> > > > transliterations into Latin?
> > >
> > > We haven't yet settled whether (when we are fully in the RFC 3066bis
> > > regime) we should handle transliterations as simple variants (like
> > > historical orthographies, geographical and social dialects, and the
> > > like) or via the new extension machinery of RFC 3066bis, which is
> > > more work to set up but is more general-purpose. In any case,
> > > someone would have to maintain a registry of transliterations,
> > > transcriptions, and orthographies: no one has taken on that job yet.
> > IMO, these should be variants and not extensions. Treating them as
> > extensions would require a separate specification and entail that they
> > would not be supported in protocols and specifications that reference
> > RFC 3066bis unless specifically revised to reference the extension RFC
> > as well -- which would be an incredible pain. So, for instance, while
> > there's a reasonable likelihood that XML would be updated to
> > reference RFC 3066bis, so that those tags could be used for
> > xml:lang, it's far less likely that an extension RFC would be
> > referenced by XML, meaning that transliterations could not be
> > distinguished in XML lang.
> > I'm completely convinced that transliterated text in some language
> > such as Russian can simply be treated as Russian-language data that is
> > written in an alternate written form. For some purposes, tagging it as
> > "ru" may be sufficient; for others, tagging it as "ru-Latn" may be
> > needed, and for some purposes tagging it as (say)
> > "ru-Latn-iso9r95" (assuming ISO 9:1995) may be appropriate.
> > This nicely allows for useful degrees of specificity, and is
> > completely adequate for indicating the distinguishing
> > characteristics of the language and written form of the data.
> > There's absolutely no reason I can see for complicating this either
> > syntactically with an extension and its corresponding singleton, or
> > procedurally by requiring a completely separate registration process
> > to be set up. The only possible benefit of an extension would be the
> > possibility of creating some means for generative creation of tags
> > that allow for transliteration standards from some defined group of
> > sources, such as ISO, but I think the potential benefits are limited
> > and the cost high.
> > My vote, therefore, would be to simply treat specific
> > schemes via the variant subtag as defined in RFC 3066bis.
> > Peter Constable