Question on ISO-639:1988
Jeremy Carroll
jjc at hplb.hpl.hp.com
Thu Jun 3 16:12:42 CEST 2004
Lee Gillam wrote:
> anthropology, e.g. studies of speech. Jeremy Carroll's identification
> of the need for use-cases, and how they would assist specific communities
> of users is one that should be taken up - perhaps by all parties involved in
> the definition of 639? Towards some kind of roadmap for adoption
> or the like?
RFC 3066bis:
http://www.ietf.org/internet-drafts/draft-phillips-langtags-02.txt
FYI, I have a report in preparation (co-authored with Addison) that
identifies some use cases for RFC 3066bis in relation to the Semantic
Web (my main area of work).
Here they are:
[[
2.1 Appropriate Display of Labels
This use case is described in the OWL Requirements [2]: a Semantic Web
application has data to be displayed to an end user. Many of the
resources in the knowledge base are to be displayed using one of the
values of rdfs:label, selected to make a good match between the
linguistic capabilities of the end-user and the language tag associated
with that particular text string. A simplified example of such labels
taken from the OWL Test Cases [6] is:
<owl:Class rdf:ID="ShakespearePlay">
<rdfs:label xml:lang="it">Opere di Shakespeare</rdfs:label>
<rdfs:label rdf:parseType="Literal"><span
xml:lang="ja">????????<ruby>
<rbc><rb>?</rb><rb>?</rb></rbc>
<rtc><rt>??</rt><rt>??</rt></rtc>
</ruby></span></rdfs:label>
</owl:Class>
We consider a specific end user: Brian, who is a mother-tongue English
speaker with a good knowledge of Japanese, can read Kanji, and hence
can make some sense of any language written in traditional Chinese
characters; he still remembers some of his schoolboy French.
2.2 Finding all Klingon text in a knowledge base
A Star Trek fan wishes to search an RDF knowledge base for all the
Klingon text in it, and then to explore the knowledge base from these
resources.
2.3 Multilingual Knowledge Base Construction
An open source Semantic Web knowledge base is developed. The project
started in the US, and all the natural language text strings in it have
been tagged as en-US (following RDF Concepts [1] and RFC3066bis [3]).
Other plain literals, with text that is not intended as natural language,
are marked up with the empty language tag. Gradually, groups of Chinese
developers (some from the mainland and other groups from Taiwan) become
involved. Their typical interest is in using some subset of the
knowledge base, possibly with some additional axioms. Moreover, each
group has a specific application in mind, involving specific queries
which return literal values for end-user presentation. Also, depending
on the intended users of the application, the presented text must be
available in English or traditional Chinese or simplified Chinese or
some combination. Clearly, some of the developers will need to add
traditional Chinese literals corresponding to the original US English
literals; others will need to add simplified Chinese; but precisely when
it is necessary to translate which literals can only really be
determined by asking an OWL reasoner the relevant queries and comparing
the results for the various natural languages.
]]
The first and the third are intended to go beyond the granularity
available in simple systems like ISO 639-1, but clearly do not use the
granularity envisaged by the Linguasphere; I think they would be
significantly harder to implement with a Linguasphere-type approach
than with RFC 3066bis.
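As a rough illustration of the first use case (not taken from the
report), label selection could be sketched in Python using simple
prefix matching of language ranges against language tags. Brian's
preference list, the helper names, and the placeholder standing in for
the Japanese label are all assumptions made for the example.

```python
def matches(language_range, tag):
    """Prefix matching: a range matches a tag if they are equal, or if the
    range is a prefix of the tag ending at a hyphen boundary."""
    r = language_range.lower()
    t = tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def pick_label(labels, preferences):
    """Return the first (tag, text) pair that matches the user's language
    ranges, tried in descending order of preference; None if nothing fits."""
    for language_range in preferences:
        for tag, text in labels:
            if matches(language_range, tag):
                return tag, text
    return None


# Labels from the example class; the Japanese text is a placeholder.
labels = [("it", "Opere di Shakespeare"), ("ja", "<Japanese label>")]

# Hypothetical preference list for Brian (an assumption, not from the report).
brian = ["en", "ja", "zh-Hant", "fr"]

print(pick_label(labels, brian))
```

Matching on hyphen boundaries means a range such as "en" accepts
"en-US" but not an unrelated tag that merely starts with the same
letters.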
The second is just a joke really, trying to make the work more
intelligible to monolinguals who do not care about internationalization
issues. (It's there to illustrate how the concept of grandfathering in
RFC 3066bis maps onto deprecation in OWL.)
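A minimal Python sketch of what the second use case involves, assuming
the knowledge base has already been flattened to (subject, language
tag, text) triples: the grandfathered tag i-klingon is treated as a
deprecated synonym for tlh before filtering. The one-entry mapping
table, the function names, and the sample data are hypothetical and
only for illustration; a real implementation would consult the
language tag registry.

```python
# Hand-written illustration: grandfathered tag -> preferred replacement.
PREFERRED = {"i-klingon": "tlh"}


def canonical(tag):
    """Lower-case a tag and replace a deprecated tag with its preferred form."""
    t = tag.lower()
    return PREFERRED.get(t, t)


def find_text(literals, wanted):
    """Return (subject, text) pairs whose tag equals the wanted language,
    or extends it at a hyphen boundary, after canonicalisation."""
    wanted = canonical(wanted)
    hits = []
    for subject, tag, text in literals:
        t = canonical(tag)
        if t == wanted or t.startswith(wanted + "-"):
            hits.append((subject, text))
    return hits


# Hypothetical sample data.
literals = [
    ("ex:greeting", "i-klingon", "nuqneH"),
    ("ex:farewell", "tlh", "Qapla'"),
    ("ex:hello", "en-US", "hello"),
]

print(find_text(literals, "tlh"))
```

Searching for either "tlh" or "i-klingon" finds both Klingon literals,
since old and new tags are normalised to the same value first.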
Jeremy