Revised I-D: draft-alvestrand-content-language-03
Harald Tveit Alvestrand
harald@alvestrand.no
Fri, 15 Feb 2002 08:36:25 -0800
Internet-drafts,
please publish this.
ietf-languages and Bruce Lilly: Please review.
The main change is in the ABNF of the headers; all are now specified in RFC
2234 ABNF, and use the "obs-" formalism from RFC 2822 to specify accept and
generate grammars.
Thanks!
Harald
Internet-Draft                                             H. Alvestrand
draft-alvestrand-content-language-03.txt                   Cisco Systems
Target Category: Standards Track                           February 2002
Updates: RFC 1766                                   Expires: August 2002


                        Content Language Headers

Status of this Memo

The file name of this memo is draft-alvestrand-content-language-03.txt

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Comments on this draft should be sent to the mailing list
<ietf-languages@iana.org>

Abstract

This document defines a "Content-language:" header, for use in the
case where one desires to indicate the language of something that has
RFC-822-like headers, like MIME body parts or Web documents, and an
"Accept-Language:" header for use in the case where one wishes to
indicate one's preferences with regard to languages.

1. Introduction

There are a number of languages presently or previously used by human
beings in this world.

A great number of these people would prefer to have information
presented in a language which they understand.

In some contexts, it is possible to have information available in more
than one language, or it might be possible to provide tools (such as
dictionaries) to assist in the understanding of a language.

In other cases, it may be desirable to use a computer program to
convert information from one format (such as plaintext) into another
(such as computer-synthesized speech, or Braille, or high-quality print
renderings).

A prerequisite for any such function is a means of labelling the
information content with an identifier for the language that is used in
this information content, such as is defined by [TAGS].

This document specifies a protocol element for use with protocols that
use RFC-822 like headers for carrying language tags as defined in
[TAGS].

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC 2119].

2. The Content-language header

The "Content-Language" header is intended for use in the case where one
desires to indicate the language(s) of something that has RFC-822-like
headers, such as MIME body parts or Web documents.

The RFC-822 EBNF of the Content-Language header is:

   Content-Language = "Content-Language" ":" 1#Language-tag

In the more strict RFC 2234 ABNF:

   Content-Language = "Content-Language" ":" [CFWS] Language-List
   Language-List = Language-Tag [CFWS] *("," [CFWS] Language-Tag [CFWS])

The Content-Language header may list several languages in a
comma-separated list.

The CFWS construct is intended to function like the whitespace
convention in RFC 822, which also means that one can place
parenthesized comments anywhere in the language sequence, or use
continuation lines. A formal definition is given in RFC 2822
[RFC 2822].
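
As an illustration of the comment and continuation-line handling described
above, the following Python sketch extracts the language tags from a
Content-Language field value. It is not part of this specification; the
function names strip_comments and parse_content_language are hypothetical,
and the code is a simplification of the full RFC 2822 CFWS rules (it does
not validate the tags against [TAGS]).

```python
def strip_comments(value: str) -> str:
    """Remove parenthesized comments, honouring nesting and
    backslash-quoted characters inside comments (simplified CFWS)."""
    out = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if depth and ch == "\\" and i + 1 < len(value):
            i += 2  # skip a quoted pair inside a comment
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_content_language(value: str) -> list[str]:
    """Return the language tags listed in a Content-Language value."""
    # Unfold continuation lines by dropping the line breaks; the
    # whitespace that starts the continuation line is kept.
    unfolded = value.replace("\r\n", "").replace("\n", "")
    tags = (tag.strip() for tag in strip_comments(unfolded).split(","))
    return [tag for tag in tags if tag]


print(parse_content_language("en, fr (This is a dictionary)"))
```

With the dictionary example from section 2.1 as input, the sketch yields
the two tags "en" and "fr" and discards the comment.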

In keeping with the tradition of RFC 2822, a more liberal "obsolete"
grammar is also given:

   obs-content-language = "Content-Language" *WSP ":" [CFWS] Language-List

Like RFC 2822, this specification says that conforming implementations
MUST accept the obs-content-language syntax, but MUST NOT generate it;
all generated headers MUST conform to the Content-Language syntax.
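
The accept/generate asymmetry can be sketched in Python as follows. This
is an illustration under stated assumptions, not a reference
implementation: the names read_content_language and
write_content_language are hypothetical, comments and folding are ignored
for brevity, and the tags are not checked against [TAGS].

```python
import re

# Reader side: tolerate whitespace between the field name and the colon,
# as obs-content-language allows, and match the name case-insensitively.
_FIELD = re.compile(r"^content-language[ \t]*:(.*)$", re.IGNORECASE | re.DOTALL)


def read_content_language(line: str):
    """Return the listed tags, or None if this is another header."""
    match = _FIELD.match(line)
    if match is None:
        return None
    return [tag.strip() for tag in match.group(1).split(",") if tag.strip()]


def write_content_language(tags):
    """Emit only the strict form: no whitespace before the colon."""
    if not tags:
        raise ValueError("at least one language tag is required")
    return "Content-Language: " + ", ".join(tags)


tags = read_content_language("Content-Language : en, fr")
print(write_content_language(tags))
```

A header read in the obsolete form is thus re-emitted in the strict form
whenever it is regenerated.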

2.1 Examples of Content-language values

Voice recording from Liverpool downtown

   Content-type: audio/basic
   Content-Language: en-scouse

Document in Mingo, an American Indian language which does not have an
ISO 639 code:

   Content-type: text/plain
   Content-Language: i-mingo

An English-French dictionary

   Content-type: application/dictionary
   Content-Language: en, fr (This is a dictionary)

An official European Commission document (in a few of its official
languages)

   Content-type: multipart/alternative
   Content-Language: da, de, el, en, fr, it

An excerpt from Star Trek

   Content-type: video/mpeg
   Content-Language: i-klingon

3. The Accept-Language header

The "Accept-Language" header is intended for use in the case where a
user or a process desires to identify the preferred language(s) when
RFC-822-like headers, such as MIME body parts or Web documents, are
used.

The RFC-822 EBNF of the Accept-Language header is:

   Accept-Language = "Accept-Language" ":"
                     1#( language-range [ ";" "q" "=" qvalue ] )

A slightly more restrictive RFC-2234 ABNF definition is:

   Accept-Language = "Accept-Language:" [CFWS] language-q
                     *( "," [CFWS] language-q )
   language-q = language-range [";" [CFWS] "q=" qvalue ] [CFWS]
   qvalue = ( "0" [ "." 0*3DIGIT ] )
          / ( "1" [ "." 0*3("0") ] )

A more liberal RFC-2234 ABNF definition is:

   Obs-accept-language = "Accept-Language" *WSP ":" [CFWS] obs-language-q
                         *( "," [CFWS] obs-language-q ) [CFWS]
   obs-language-q = language-range
                    [ [CFWS] ";" [CFWS] "q" [CFWS] "=" qvalue ]

Like RFC 2822, this specification says that conforming implementations
MUST accept the obs-accept-language syntax, but MUST NOT generate it;
all generated messages MUST conform to the Accept-Language syntax.
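
The qvalue rule in the grammar above translates directly into a regular
expression. The Python sketch below is one possible transcription; the
helper name is_qvalue is hypothetical and not defined by this document.

```python
import re

# qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
_QVALUE = re.compile(r"(?:0(?:\.[0-9]{0,3})?|1(?:\.0{0,3})?)")


def is_qvalue(text: str) -> bool:
    """True if text matches the qvalue rule exactly."""
    return _QVALUE.fullmatch(text) is not None


for candidate in ("0.8", "1.000", "1.5", "0.1234"):
    print(candidate, is_qvalue(candidate))
```

Note that the rule allows at most three digits after the decimal point,
and that the only values starting with "1" are "1", "1.", "1.0", "1.00"
and "1.000".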

The syntax and semantics of language-range are defined in [TAGS].

(Note that RFC-822 EBNF rather than ABNF is used here, in order to
ensure that the syntax is identical with that specified in [RFC 2616]).

The Accept-Language header may list several language-ranges in a
comma-separated list, and each may include a quality value Q.

If no Q values are given, the language-ranges are given in priority
order, with the leftmost language-range being the most preferred
language; this is an extension to the HTTP/1.1 rules, but matches
current practice.

If Q values are given, refer to HTTP/1.1 [RFC 2616] for the details on
how to evaluate them.
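
The ordering rule can be sketched in Python as follows. This is an
illustration only: the function name rank_language_ranges is
hypothetical, comments and obsolete syntax are not handled, and
malformed Q values are not rejected gracefully. A missing Q value is
treated as 1 and a Q value of 0 as "not acceptable", following
HTTP/1.1 [RFC 2616]. Breaking ties between equal explicit Q values by
position is an assumption of this sketch; this document only states the
leftmost-first rule for the case where no Q values are given.

```python
def rank_language_ranges(value: str) -> list[str]:
    """Order the language-ranges of an Accept-Language value,
    most preferred first."""
    entries = []
    for position, item in enumerate(value.split(",")):
        item = item.strip()
        if not item:
            continue
        language_range, _, params = item.partition(";")
        q = 1.0  # default when no q parameter is present
        name, _, number = params.partition("=")
        if name.strip().lower() == "q" and number.strip():
            q = float(number.strip())
        if q == 0:
            continue  # q=0 marks the range as not acceptable
        entries.append((-q, position, language_range.strip()))
    # Highest q first; equal q values keep their left-to-right order.
    entries.sort()
    return [language_range for _, _, language_range in entries]


print(rank_language_ranges("da, en-gb;q=0.8, en;q=0.7"))
print(rank_language_ranges("fr, de, en"))
```

For a value without Q values, such as "fr, de, en", the result is simply
the list in its original order.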

4. Security Considerations

The only security issue that has been raised with language tags since
the publication of RFC 1766, which stated that "Security issues are
believed to be irrelevant to this memo", is a concern with language
ranges used in content negotiation - that they may be used to infer the
nationality of the sender, and thus identify potential targets for
surveillance.

This is a special case of the general problem that anything you send is
visible to the receiving party; it is useful to be aware that such
concerns can exist in some cases.

The exact magnitude of the threat, and any possible countermeasures, is
left to each application protocol.

5. Character set considerations

This document adds no new considerations beyond what is mentioned in
[TAGS].

6. Acknowledgements

This document has benefited from many rounds of review and comments in
various fora of the IETF and the Internet working groups.

Any list of contributors is bound to be incomplete; please regard the
following as only a selection from the group of people who have
contributed to make this document what it is today.

In alphabetical order:

Tim Berners-Lee, Nathaniel Borenstein, Sean M. Burke, John Clews, Jim
Conklin, John Cowan, Dave Crocker, Martin Duerst, Michael Everson, Ned
Freed, Tim Goodwin, Dirk-Willem van Gulik, Marion Gunn, Paul Hoffman,
Olle Jarnefors, John Klensin, Bruce Lilly, Keith Moore, Chris Newman,
Masataka Ohta, Keld Jorn Simonsen, Rhys Weatherley, Misha Wolf,
Francois Yergeau and many, many others.

Special thanks must go to Michael Everson, who has served as language
tag reviewer for almost the complete period since the publication of
RFC 1766, and has provided a great deal of input to this revision.

Bruce Lilly did a special job of reading and commenting on my ABNF
definitions.

7. Author's Address

Harald Tveit Alvestrand
Cisco Systems
Weidemanns vei 27
7043 Trondheim
NORWAY
EMail: Harald@Alvestrand.no
Phone: +47 73 50 33 52

8. References

[TAGS]
   Alvestrand, H., "Tags for the identification of languages",
   RFC 3066

[ISO 639]
   ISO 639:1988 (E/F) - Code for the representation of names of
   languages - The International Organization for Standardization,
   1st edition, 1988-04-01 Prepared by ISO/TC 37 - Terminology
   (principles and coordination).
   Note that a new version (ISO 639-1:2000) is in preparation at the
   time of this writing.

[ISO 639-2]
   ISO 639-2:1998 - Codes for the representation of names of
   languages -- Part 2: Alpha-3 code - edition 1, 1998-11-01, 66
   pages, prepared by ISO/TC 37/SC 2

[ISO 3166]
   ISO 3166:1988 (E/F) - Codes for the representation of names of
   countries - The International Organization for Standardization,
   3rd edition, 1988-08-15.

[ISO 15924]
   ISO/DIS 15924 - Codes for the representation of names of scripts
   (under development by ISO TC46/SC2)

[RFC 1521]
   Borenstein, N., and N. Freed, "MIME Part One: Mechanisms for
   Specifying and Describing the Format of Internet Message Bodies",
   RFC 1521, Bellcore, Innosoft, September 1993.

[RFC 2119]
   Key words for use in RFCs to Indicate Requirement Levels. S.
   Bradner. March 1997.

[RFC 2234]
   Augmented BNF for Syntax Specifications: ABNF. D. Crocker, Ed., P.
   Overell, November 1997.

[RFC 2616]
   Hypertext Transfer Protocol -- HTTP/1.1. R. Fielding, J. Gettys,
   J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. June
   1999.

[RFC 2822]
   Internet Message Format. P. Resnick, Editor. April 2001.

Appendix A: Changes from RFC 1766

The definition of the language tags has been split, and is now
RFC 3066.

The differences parameter to multipart/alternative is no longer part of
this standard, because no implementations of the function were ever
found. Consult RFC 1766 if you need the information.

The ABNF for content-language has been updated to use the RFC 2234
ABNF.