OT: JFC's language identifier ideas (was Distinguishing Greek, etc.)
addison.phillips at quest.com
Thu Mar 17 18:39:14 CET 2005
This discussion should be on the LTRU list and not here. I have not copied LTRU because I don't wish to start a two-list thread.
I still don't know what you're going on about, JFC, regarding "five attributes" in language identification. In my copy of MS Word 2003 I find that languages are identified by names that correspond to a language-country or language-script pair. When language tags are emitted in RFC 3066 format (such as when you save a document as HTML), the language-script tags lose their script part. That is because these tags were not registered until Mark Davis proposed them last year, and Word 2003 was clearly released before that (hooray for a vendor that pays attention to standards).
Internally, MS Windows and the .NET platform use identifiers that *do* include both country and script for those cases where distinction is necessary. Witness Microsoft's support for the registration of various other language-script-country combinations...
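To make the tag behaviour described above concrete, here is a minimal Python sketch of what "losing the script part" means for a tag. It is purely illustrative and is not how Word or Windows implements anything: the function name strip_script is hypothetical, and the rule that any four-letter alphabetic subtag is a script code is an assumption made for the example.

```python
def strip_script(tag: str) -> str:
    """Drop script subtags (e.g. 'Latn') from a hyphen-separated language tag.

    Hypothetical helper for illustration only.
    """
    subtags = tag.split("-")
    kept = [subtags[0]]  # the primary language subtag is always kept
    for sub in subtags[1:]:
        # Assumption: a four-letter alphabetic subtag is a script code.
        if len(sub) == 4 and sub.isalpha():
            continue
        kept.append(sub)
    return "-".join(kept)


for tag in ("sr-Latn", "zh-Hant-TW", "en-US"):
    print(tag, "->", strip_script(tag))
```

Under that assumption, a language-script tag collapses to the bare language, a language-script-country tag keeps its country, and a plain language-country tag passes through unchanged.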
As for dictionaries and orthographies, my copy of Word has a few minor tweaks that can be applied to my default dictionary, and I can select a specific dictionary to use for spell checking (identified by language-country or language-script again, and occasionally by country, which I take to mean language-country). But then, I also have tabs for "Asian Typography", "Japanese Find" and "Complex Scripts"--which is great for me and meaningless to many of my fellow Americans (mostly a monolingual lot). I fail to see the attributes you wish to cite in the dialogs you reference.
So instead of citing five attributes possibly used by someone else's product and expecting us to guess what they are, why don't you enumerate what you think the five attributes actually are (or should be) so that we can debate them (on the LTRU list, as Martin asked, in the manner that Martin asked it)?
In other words, if you have a constructive proposal, make it.
PS: If you write to me off list, be warned that I will copy the appropriate list with my response, assuming I do not ignore you.
Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group
Internationalization is not a feature.
It is an architecture.
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> bounces at alvestrand.no] On Behalf Of JFC (Jefsey) Morfin
> Sent: Thursday, March 17, 2005 9:11 AM
> To: Peter Constable; IETF Languages Discussion
> Subject: RE: Distinguishing Greek and Greek
> At 15:24 17/03/2005, Peter Constable wrote:
> > > From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> > > bounces at alvestrand.no] On Behalf Of JFC (Jefsey) Morfin
> >First, I didn't realize that Word was typically used as a Web browser.
> >Secondly, I would guess that it could display 99% of all the text on the
> >Web encoded in Unicode or other industry standard encodings.
> Dear Peter,
> I am not sure of what you mean here. Why do you talk of web browser? And
> yes, Word can read pages on the web, but?
> Anyway, my version of Word supports a list of possible 72 languages
> including 13 versions of English. "Possible" because they are supported
> only if I load them.
> > > Real life may be more complex than 3 descriptors language tags. (BTw
> > > is why we need 5 of them (which can default to 3 when we know 2, or to
> > > 1, when we know 4). But Word uses 5 and Microsoft has a good proven
> > > experience in the area).
> >Microsoft does not use 5-letter language identifiers.
> I do not know what you name a 5-letter language identifier?
> Word defines a language through 5 attributes you can find in:
> - Tools - languages: languages, script and country (however they do not
> have country and script at the same time)
> - Tools - options - grammar and orthography: dictionaries, rules of style
> (I translate for French, the terms on your copy may be different).
> Then you can customize the rules of style. If you find an other key
> attribute thank you to let me know.