Variant tags for sl-rozaj: History and preliminaries Draft of message

CE Whitehead cewcathar at hotmail.com
Thu Jun 14 16:06:15 CEST 2007


Hi, +1 to Hans' request.

han.steenwijk at unipd.it wrote
(Tue Jun 12 05:06:51 CEST 2007):

>The Prefix fields enumerate
>exhaustively which language tags can be formed, which is not so very
>generative. Even the order of the variant tags within the language tag is
>determined by the Prefix fields.

. . .

Variants formed from dates alone (1901 and 1996) are already in use for 
German, so I see no reason to object to the 1994 variant[s], provided these 
subtags are unlikely to be misused, for example by other languages whose 
orthographies also date from 1994. I do not know of any such cases, 
however.

Thanks for spelling the prefixes out clearly.

>With this information in the Prefix fields, the implementer can still
>generate "sl-IT-rozaj-1994" and "sl-IT-rozaj-biske-1994". However, this is
>highly redundant, as the whole of the Resian speech territory is contained
>within Italy.

Right; as I understand it, the IT subtag is redundant and so would not 
normally be used to form tags. If it were used, I had understood that search 
engines would normally match up sl-rozaj-1994 and sl-IT-rozaj-1994?

(However, Doug Ewell noted to me that the two tags above, the first 
without [IT] and the second with [IT], might in fact not be quite

"identical, from the standpoint of matching" because  "[t]here is no such 
thing as 'Suppress-Region' to tell a matching process that 'sl-rozaj-1994' 
and 'sl-IT-rozaj-1994' can be assumed to be equivalent."


There's more information on matching (from a draft, not a final document, 
http://tools.ietf.org/html/draft-ietf-ltru-matching-07 )
Apparently,

a request for documents tagged,

sl-rozaj-1994

will normally also turn up all documents tagged

sl-IT-rozaj-1994

However, a request for documents tagged,

sl-IT-rozaj-1994

will not turn up all documents tagged,

sl-rozaj-1994
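
The asymmetry above can be illustrated with a minimal sketch. It assumes the
"extended filtering" rule as I understand it from RFC 4647 (section 3.3.2),
the published successor to this draft, in which non-singleton subtags of the
tag may be skipped while walking the range. The function name
extended_filter is hypothetical and is not taken from the draft.

```python
def extended_filter(language_range, tag):
    """Return True if `tag` matches `language_range` under a sketch of
    extended filtering (assumed rule: subtags in the tag may be skipped,
    but never past a single-character 'singleton' subtag)."""
    r = language_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match, unless the range starts with a wildcard.
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1              # a wildcard in the range matches anything
        elif ti >= len(t):
            return False         # range still has subtags, tag has none left
        elif r[ri] == t[ti]:
            ri += 1              # subtags agree: advance both
            ti += 1
        elif len(t[ti]) == 1:
            return False         # do not skip over a singleton subtag
        else:
            ti += 1              # skip an extra subtag in the tag (e.g. "IT")
    return True


print(extended_filter("sl-rozaj-1994", "sl-IT-rozaj-1994"))  # True
print(extended_filter("sl-IT-rozaj-1994", "sl-rozaj-1994"))  # False
```

Under this rule the shorter range finds the tag carrying the extra region
subtag, but the range that names IT does not find the tag that omits it.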

"3.2. Filtering


   "Filtering is used to select the set of content that matches a given
   prefix.  It is called 'filtering' because this set of content may
   contain no items at all or it may return an arbitrary number of
   matching items--as many as match the language range used to specify
   the items . . .

   ". . . For example, if the language range
   is "de-CH", one might see matching content with the tag "de-CH-1996"
   but one will never see a match with the tag 'de'."
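
The prefix rule quoted above can be sketched as follows. This is only an
illustration of the quoted text, assuming case-insensitive comparison and
that a prefix must end on a subtag boundary; the function name
prefix_filter is hypothetical.

```python
def prefix_filter(language_range, tag):
    """Return True if `language_range` equals `tag`, or is a prefix of it
    that ends exactly at a '-' subtag boundary (case-insensitive)."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


print(prefix_filter("de-CH", "de-CH-1996"))               # True
print(prefix_filter("de-CH", "de"))                       # False
print(prefix_filter("sl-rozaj-1994", "sl-IT-rozaj-1994")) # False
```

Note that under this plain prefix rule, neither of the two Resian tags
matches the other, because the IT subtag sits in the middle of the tag
rather than at the end.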

However, there is apparently also something called "Scored Filtering", 
which I do not completely understand, and which would still allow the tags 
to receive a scored match even if an exact match were not possible.

But this is a draft document, so I do not know whether these filtering 
methods are normally implemented.

)

--C. E. Whitehead
cewcathar at hotmail.com
