[YES] The Linguasphere proposal is suited to RFC 3066 (or its successors) and its consuming protocols

Debbie Garside debbie at ictmarketing.co.uk
Mon Jun 7 18:51:19 CEST 2004


Off the cuff


Here is my response to a few of the questions/statements made during the
course of these discussions... and I apologise for the repetition in advance.

All that I say now is in rapid response, pending the return of David Dalby,
the architect of LS 639.

John Cowan sub-script

>The worst problem I see with the Linguasphere identifiers is the extreme
>difficulty of relating the more general to the less general, as must be
>done if requests are to be appropriately satisfied.  It may make sense to
>assign distinct 4-letter codes to such linguistic entities as:

	English
	Hiberno-English
	Hiberno-English, spoken
	Hiberno-English, spoken in Dublin
	Hiberno-English, spoken in Dublin on the North Circular Road
	Hiberno-English, spoken in Dublin on the North Circular Road (south side)

>but a supplier of information that has content tagged with the last
>code will not be able to reply to a request for simply "English" unless
>it grasps this particular branch of the entire system (which leads up
>to "Germanic" and "Indo-European" at higher levels, if I understand
>correctly).

>In order to do this, it must have the Linguasphere key (hierarchical
>identifier) corresponding to the 4-letter code, but this is (a) unstable
>and (b) brittle, with its fixed maximum hierarchical depth of 8 and its
>limited fanout of 10 to 26 siblings at each level.


The LS 639 referential scale would not be used for tagging data.  The
structure is as follows:

Each linguistic item within the Linguasphere is allocated a place within the
referential scale (Flexible).

Each linguistic item within the Linguasphere is allocated a category
number which denotes where within the database it fits: e.g. 40 for
Language, 41 for Language Variety, 42 for Component of Language Variety, 50
for Written Variety of Language, 51 for Component of Written Variety, etc.
The category number is usually fixed but can be flexible, as it would have
been with the Serbo-Croatian situation (which I am sure David Dalby will
explain if required).  I can see a very relevant use for this in cataloguing
for library purposes, so that the data inputter is not faced with thousands
of codes.

Each linguistic item is allocated an alpha4 “langtag” (COMPLETELY FIXED).

Each linguistic item is also allocated its PRECEDING alpha4 “langtag”
(fixed, but could possibly be changed, with any changes annotated), thus
forming the relationships between the “languages”.  It is this that makes
it such a simple hierarchical world language map.  It is a simple relational
database based on the Linguasphere Register 1999/2000, which is superb for
the purpose.

This system means that the referential scale can be changed, giving the
required flexibility for the purpose of linguistics, whilst providing a
fixed system (the alpha4 langtag) for coding purposes.

Each aspect of the Linguasphere can be viewed/used in its own right or as
part of a hierarchy.  The system can be used as a “bare bones” system with
just “Language name” and alpha4 code, or “Language Variety” and alpha4 code,
etc., or as the Linguasphere map using the preceding alpha4 langtag.  The
referential scale is not recommended for tagging purposes because, as John
quite rightly pointed out, it can change at any time, with a cascading
update, when other linguistic items are added (although such changes will
also be annotated within the system).

Sample Data (this is not all the fields obviously but merely the ones in
question here)

Layer  Referential Scale  Language Name          Alpha4  Preceding Alpha4
40     00AAAAa            Bamanan.kan (Bambara)  bmnk    bmnn

NB This sample data does not display the current mapping with other
standards.

For the purposes of tagging, just the static alpha4 langtag is required; the
Linguasphere system does the rest.

AND... the system will (and does already) quite easily map to other
standards.
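
As a rough illustration of how the fixed alpha4 langtag and the PRECEDING
alpha4 langtag could answer a request for a more general item, here is a
minimal sketch.  It is not the actual Linguasphere database: only the
bmnk/bmnn pair comes from the sample row above, "xxxa" is a made-up
placeholder, and the function names are hypothetical.  Note that the
referential scale is not consulted at all, so it can change freely.

```python
from typing import Dict, List, Optional

# alpha4 langtag -> preceding alpha4 langtag (None marks the top of a branch).
# Only the bmnk -> bmnn pair is taken from the sample row in the message;
# "xxxa" is a made-up placeholder so that the walk has more than one step.
PRECEDING: Dict[str, Optional[str]] = {
    "bmnk": "bmnn",
    "bmnn": "xxxa",
    "xxxa": None,
}


def ancestors(tag: str, preceding: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of preceding langtags, nearest first."""
    chain: List[str] = []
    seen = {tag}  # guards against an accidental cycle in the data
    current = preceding.get(tag)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = preceding.get(current)
    return chain


def satisfies(content_tag: str, requested_tag: str,
              preceding: Dict[str, Optional[str]]) -> bool:
    """True if content tagged content_tag can answer a request for requested_tag."""
    return (content_tag == requested_tag
            or requested_tag in ancestors(content_tag, preceding))


if __name__ == "__main__":
    print(ancestors("bmnk", PRECEDING))
    print(satisfies("bmnk", "bmnn", PRECEDING))
```

The lookup only ever follows the fixed langtags, which is the point being
made here: a supplier needs the alpha4 table, not the referential scale.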


The crux of the matter seems to be the question of USE for a system of such
detailed granularity.  We can discuss the various technicalities ad
infinitum, but the system does work.

One thing I can say before answering the question on use is: Given 4 billion
IP addresses, who would have predicted a need for a greater range?

So... to the question of USE... and detailed granularity.  Please see:
http://www.linguasphere.com/grassroots.asp

Peter Constable sub-script

>and I *really* would like to
>see better analysis justifying the need. In the absence of such
>analysis, I'm not sure I could recommend to the US TAG that they vote in
>favour of accepting a NWIP.

Michael Everson sub-script

>We have the same concern in Ireland.

I hope the issues of use and granularity are clearer.

Having received several personal emails... hate to do it again... but one or
two from "eminent professors"... and I have to say “thank you” to the United
Nations (cos even I was impressed by that)... I am beginning to understand
that some people are experiencing a certain amount of trepidation towards
entering this forum (can’t think why).  Therefore, I invite private
questions to be sent to my personal email address where they will be treated
in the strictest confidence and answered by the appropriate person within
the organisation.

One final thank you... to the 5 Corporate Directors and the Globalisation
team in Canada for “taking time” and teaching me the true “value” of the
Linguasphere, and to the European Vice-President and his staff for
facilitating the process.  The table is set and the Moules Mariniere are
cooking...



Re: crystal balls and Euro 2004... Sorry... I’m a tennis fan.

Crystal Ball says Henman for Wimbledon Champ... Champagne, strawberries and
cream... and the best of British sport... and... if my crystal ball
(currently residing on my bedside cabinet) proves correct, a ticket to the
men’s singles final would be gratefully received.


Debbie

-----Original Message-----
From: ietf-languages-bounces at alvestrand.no
[mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Debbie Garside
Sent: 07 June 2004 10:43
To: Clay Compton; ietf-languages at iana.org
Subject: RE: [YES] The Linguasphere proposal is suited to RFC 3066
(or its successors) and its consuming protocols


>"cy-cyde-prsl" is a perfectly valid tag in RFC 3066 today, it accurately
>reflects that the tagged language variety is related to Welsh (which makes
>it more aesthetically satisfying)

I agree... I'm not a programmer but that structure seems completely logical
to me...

I will show, later today, how the Linguasphere system can work in exactly
this way... I am compiling my response to the questions raised...

>If the implications of the proposal for RFC 3066 are to allow subtags based
>on the language varieties and communities in the LS Register, this is an
>occasion for wild celebration...

Be ready for wild celebration...

Debbie

-----Original Message-----
From: ietf-languages-bounces at alvestrand.no
[mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Clay Compton
Sent: 04 June 2004 22:17
To: ietf-languages at iana.org
Subject: [YES] The Linguasphere proposal is suited to RFC 3066 (or its
successors) and its consuming protocols


Comments:

What can I say; maybe I just enjoy being contrary.  However, I think adding
*parts* of the Linguasphere proposal to RFC 3066 can be beneficial.  For
one thing, it would cut back on the number of custom tags requested in this
forum, which most RFC 3066 implementers don't seem to notice, anyway.
My continued support depends on how RFC 3066 gets extended to support the LS
639 tags.  Clearly, "ineu" (Indo-European) is not a language and should
never be used for tagging content.  By the same token, neither is "prsl"
(Preseli Welsh).  However, "cy-cyde-prsl" is a perfectly valid tag in RFC
3066 today, it accurately reflects that the tagged language variety is
related to Welsh (which makes it more aesthetically satisfying), and legacy
systems that parse the subtags in the tag (which they shouldn't do, but do
anyway) would correctly fall back to "cy".  If the implications of the
proposal for RFC 3066 are to allow subtags based on the language varieties
and communities in the LS Register, this is an occasion for wild
celebration.  Of course, I'd like to hear the Linguasphere folks pledge that
they'll avoid any tag name collisions with ISO 15924.
It's true that there would be a lot of tags in LS 639, but I'm not
complaining.  I think they (we) can handle the change as long as RFC 3066's
hypothetical successor has an
"LS639-tags-as-subtags-for-language-varieties-only" rule that generates tags
like the one I suggest above.
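
To make the fallback behaviour concrete, here is a minimal sketch of how a
consumer might truncate subtags from the right and apply RFC 3066 style
prefix matching (a range matches a tag if it equals the tag, or equals a
prefix of it that is followed by "-").  The function names are hypothetical,
and "cy-cyde-prsl" is the illustrative tag from the message, not a
registered one.

```python
from typing import Iterable, List, Optional


def fallback_chain(tag: str) -> List[str]:
    """List the tag and its progressively shorter prefixes, longest first."""
    parts = tag.lower().split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def range_matches(language_range: str, tag: str) -> bool:
    """Prefix matching: the range equals the tag or a prefix ending at '-'."""
    r, t = language_range.lower(), tag.lower()
    return t == r or t.startswith(r + "-")


def best_supported(tag: str, supported: Iterable[str]) -> Optional[str]:
    """Pick the most specific supported tag that the content tag falls back to."""
    available = {s.lower() for s in supported}
    for candidate in fallback_chain(tag):
        if candidate in available:
            return candidate
    return None


if __name__ == "__main__":
    print(fallback_chain("cy-cyde-prsl"))
    print(best_supported("cy-cyde-prsl", ["en", "cy"]))
```

A system that only knows "cy" and "en" would therefore still serve the
content as Welsh, without needing to understand the longer subtags.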

Clay Compton

-----Original Message-----
From: ietf-languages-bounces at alvestrand.no
[mailto:ietf-languages-bounces at alvestrand.no]On Behalf Of Misha Wolf
Sent: Friday, June 04, 2004 12:24 PM
To: ietf-languages at iana.org
Subject: [YES/NO] The Linguasphere proposal is suited to RFC 3066 (or its
successors) and its consuming protocols

Ooops.  This version is better :-)

Misha


-----Original Message-----
From: Misha Wolf
Sent: 04 June 2004 20:22
To: 'ietf-languages at iana.org'
Subject: The Linguasphere proposal is suited to RFC 3066 (or its
successors) and its consuming protocols -- [YES/NO]


I'd like to carry out an experiment and hope the list moderator
doesn't object.  This is based on a system Michael Sperberg-McQueen
used with the W3C XML Schema WG.  The WG had a vast number of
members and lots of decisions to make.  Sometimes email ballots
were used, with the question and the vote both placed in the
Subject line for automated processing.  I seem to recall that the
idea was that there was no need to read the mail itself, as the
only relevant information was in the Subject line.

If you agree with this experiment and have an opinion, please reply
to this mail, deleting either the "YES" or the "NO" from the Subject
line.

If you agree with this experiment and do not have an opinion, please
skip to the next mail in your Inbox.

If you do not agree with this experiment and want to write a mail
saying that it is a load of nonsense, please leave both the "YES"
and the "NO" in place.
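
For anyone curious about the automated processing, a tally of Subject lines
under the rules above could look something like this.  It is only a sketch:
the marker pattern and the category labels are assumptions for
illustration, not the tooling used by the W3C XML Schema WG.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the ballot marker wherever it sits in the Subject line.
# "YES/NO" is listed first so the combined marker is not read as a plain vote.
MARKER = re.compile(r"\[(YES/NO|YES|NO)\]")

LABELS = {
    "YES": "yes",
    "NO": "no",
    "YES/NO": "objects to the experiment",  # both options left in place
}


def classify(subject: str) -> str:
    """Map one Subject line to a vote category."""
    match = MARKER.search(subject)
    if match is None:
        return "unmarked"
    return LABELS[match.group(1)]


def tally(subjects: Iterable[str]) -> Counter:
    """Count the votes found in a collection of Subject lines."""
    return Counter(classify(s) for s in subjects)


if __name__ == "__main__":
    sample = [
        "[YES] The Linguasphere proposal is suited to RFC 3066",
        "RE: [NO] The Linguasphere proposal is suited to RFC 3066",
        "[YES/NO] The Linguasphere proposal is suited to RFC 3066",
    ]
    print(tally(sample))
```

Only the Subject line is inspected, in keeping with the idea that there is
no need to read the mail itself.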

Thanks

Misha Wolf
Standards Manager
Product and Platform Architecture Group
Reuters Limited


-----Original Message-----
From: Misha Wolf
Sent: 04 June 2004 19:47
To: ietf-languages at iana.org
Subject: RE: Linguasphere -- An appeal for clarity


Can we have a straw poll re Q2 ...?

   Does anyone here consider the Linguasphere stuff to be suited
   to RFC 3066* and its consuming protocols?

* or its successors

Misha Wolf
Standards Manager
Product and Platform Architecture Group
Reuters Limited


-----Original Message-----
From: ietf-languages-bounces at alvestrand.no
[mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Peter
Constable
Sent: 04 June 2004 19:41
To: ietf-languages at iana.org
Subject: RE: Linguasphere -- An appeal for clarity


> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Misha Wolf


> Please can we keep separate the discussions...

[in a subsequent message]

> Reading the various mails, I feel that people are
> arguing at cross-purposes.

Debbie has made comments on this list suggesting positive answers for
both questions. As I'm concerned about what happens re Q2 but also about
how this community perceives what's happening in the ISO arena (Q1 --
e.g. Harald's response to DG's message expressing concern by *too much*
activity related to ISO 639), I felt it was appropriate to put both
issues into appropriate context.

Re Q1, I have said that, at this time, the project Debbie is referring
to is not an ISO project, and that needs analysis has not been provided.

Re Q2, I have said that needs analysis has not been provided, and that I
am inclined to think a huge codeset at the level of granularity proposed
would not be a good thing for a successor of RFC 3066 and its consuming
protocols.


Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
_______________________________________________
Ietf-languages mailing list
Ietf-languages at alvestrand.no
http://www.alvestrand.no/mailman/listinfo/ietf-languages

