Question on ISO-639:1988

Thu Jun 3 16:12:42 CEST 2004

Lee Gillam wrote:

> anthropology, e.g. studies of speech. Jeremy Carroll's identification 
> of the need for use-cases, and how they would assist specific communities
> of users is one that should be taken up - perhaps by all parties involved in
> the definition of 639? Towards some kind of roadmap for adoption
> or the like?

RFC 3066bis:

http://www.ietf.org/internet-drafts/draft-phillips-langtags-02.txt

FYI, I have a report in preparation (co-authored with Addison) that 
identifies some use cases for RFC 3066bis in relation to the Semantic 
Web (my main area of work).

Here they are:

[[
2.1	Appropriate Display of Labels
This use case is described in the OWL Requirements [2]: a Semantic Web 
application has data to be displayed to an end user. Many of the 
resources in the knowledge base are to be displayed using one of the 
values of rdfs:label, selected to make a good match between the 
linguistic capabilities of the end-user and the language tag associated 
with that particular text string. A simplified example of such labels 
taken from the OWL Test Cases [6] is:
  <owl:Class rdf:ID="ShakespearePlay">
    <rdfs:label xml:lang="it">Opere di Shakespeare</rdfs:label>
    <rdfs:label rdf:parseType="Literal"><span
      xml:lang="ja">????????<ruby>
        <rbc><rb>?</rb><rb>?</rb></rbc>
        <rtc><rt>??</rt><rt>??</rt></rtc>
      </ruby></span></rdfs:label>
  </owl:Class>
We consider a specific end user: Brian, a native English speaker with a 
good knowledge of Japanese. He can read Kanji, and hence can make some 
sense of any language written in traditional Chinese characters; he 
also still remembers some of his schoolboy French.

2.2	Finding all Klingon text in a knowledge base

A Star Trek fan wishes to search an RDF knowledge base for all the 
Klingon text in it, and then to explore the knowledge base from these 
resources.

2.3	Multilingual Knowledge Base Construction

An open source Semantic Web knowledge base is developed. The project 
started in the US, and all the natural language text strings in it have 
been tagged as “en-US” (following RDF Concepts [1] and RFC3066bis [3]). 
Other plain literals, whose text is not intended as natural language, 
are marked up with the empty language tag. Gradually, groups of Chinese 
developers (some from the mainland and other groups from Taiwan) become 
involved. Their typical interest is in using some subset of the 
knowledge base, possibly with some additional axioms. Moreover, each 
group has a specific application in mind, involving specific queries 
which return literal values for end-user presentation. Also, depending 
on the intended users of the application, the presented text must be 
available in English or traditional Chinese or simplified Chinese or 
some combination. Clearly, some of the developers will need to add 
traditional Chinese literals corresponding to the original US English 
literals; others will need to add simplified Chinese; but precisely when 
it is necessary to translate which literals can only really be 
determined by asking an OWL reasoner the relevant queries and comparing 
the results for the various natural languages.
]]

The first and the third are intended to go beyond the granularity 
available in simple systems like 639-1, but clearly do not use the 
granularity envisaged by the Linguasphere, and I think they would be 
significantly harder to implement with a Linguasphere-type approach over 
and above RFC 3066bis.
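
To make the first use case a little more concrete, here is a rough 
Python sketch of choosing a label by prefix matching on language tags, 
in the style of RFC 3066 language ranges. The function name, Brian's 
preference list, and the French sample label are assumptions made up for 
illustration; they are not taken from the report.

```python
def pick_label(labels, preferences):
    """Return (tag, text) for the first preference that matches a label.

    labels: list of (language_tag, text) pairs.
    preferences: language ranges, most preferred first.
    A range matches a tag if the two are equal ignoring case, or if the
    range is a prefix of the tag immediately followed by "-".
    """
    for wanted in preferences:
        w = wanted.lower()
        for tag, text in labels:
            t = tag.lower()
            if t == w or t.startswith(w + "-"):
                return tag, text
    return None


# Hypothetical data: the Italian label from the example above plus an
# invented Canadian French one.
labels = [
    ("it", "Opere di Shakespeare"),
    ("fr-CA", "Pièces de Shakespeare"),
]

# Hypothetical preference order for Brian.
brian = ["en", "ja", "zh-Hant", "fr"]

print(pick_label(labels, brian))
```

With only two-letter 639-1 codes there is no way to say "zh-Hant" at 
all, which is the kind of granularity the use case is meant to exercise.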

The second is just a joke really, trying to make the work more 
intelligible to monolinguals who do not care about internationalization 
issues. (It's there to illustrate how the concept of grandfathering in 
RFC 3066bis maps onto deprecation in OWL.)
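
A minimal sketch of what that search might look like in Python, 
assuming the knowledge base has been flattened to (subject, predicate, 
text, language tag) tuples. It assumes that Klingon text may carry 
either the grandfathered tag "i-klingon" or the ISO 639-2 code "tlh"; 
the sample statements and all the names are hypothetical.

```python
# Assumption: the older grandfathered tag and the newer 639-2 based tag
# both identify Klingon and should be treated as equivalent.
KLINGON_TAGS = {"i-klingon", "tlh"}


def is_klingon(tag):
    """True if the language tag denotes Klingon (case-insensitive)."""
    t = tag.lower()
    return t in KLINGON_TAGS or t.startswith("tlh-")


def klingon_subjects(statements):
    """Return the sorted subjects that have at least one Klingon literal."""
    return sorted({s for s, _p, _text, tag in statements if is_klingon(tag)})


# Hypothetical statements: (subject, predicate, text, language tag).
statements = [
    ("ex:greeting", "rdfs:label", "nuqneH", "i-klingon"),
    ("ex:farewell", "rdfs:label", "Qapla'", "tlh"),
    ("ex:farewell", "rdfs:label", "Success", "en-US"),
    ("ex:play", "rdfs:label", "Opere di Shakespeare", "it"),
]

print(klingon_subjects(statements))
```

The fan's exploration would then start from the returned subjects.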

Jeremy