IDNA, IRI, HTML5 coordination
masinter at adobe.com
Wed Sep 16 17:43:17 CEST 2009
Goal: bring together and coordinate the definitions
of what is used for resource identification in the web and elsewhere
(IRIs as the evolution of URL, URI, IRI, HREF, Web Address, etc.)
within W3C, IETF and their specifications. See "design goals" below.
Goal of this message: lay out the concerned groups, start discussion
of process for coordination.
I've bcc'd everyone except the public-iri at w3.org mailing list
(archive: http://lists.w3.org/Archives/Public/public-iri/), which is the
list proposed for discussion.
My suggestion for how to get all of these groups to coordinate
is to start an IETF working group with a charter to bring these
specifications into alignment. I can't think of any other process
which can accomplish the goal.
PLEASE, PLEASE: if you're going to post an opinion, please at least
cc public-iri at w3.org and try to keep discussion there.
PLEASE: Separate 'process' issues (should there be a working group?
Who else needs to be involved? What's the timing and when?) from
(Incomplete) list of specifications, groups, chairs, editors:
[HTTPBIS-URI] HTTP URI scheme def in HTTPBIS draft:
[HTTP-RFC] current HTTP URI scheme definition
in RFC 2616 http://tools.ietf.org/html/rfc2616#section-3.2.2
[HTTPBIS-WG] IETF HTTPBIS working group
mailing list: ietf-http-wg at w3.org,
chair: Mark Nottingham <mnot at mnot.net>
editors: Roy Fielding <fielding at gbiv.com>,
Julian Reschke <julian.reschke at greenbytes.de>, (others)
[IDNABIS-*] definitions, policies, standards for how Internationalized
Domain Names should be handled in Internet applications
[IDNABIS-WG] IETF IDNABIS working group
chair: Vint Cerf <vint at google.com>
editor: John C Klensin <klensin at jck.com>
[IRIBIS-6] Revision under preparation:
[IRIBIS-LMM] ("Experimental" draft attempting to satisfy IDNABIS and HTML requirements)
discussion on: public-iri at w3.org (among others)
(other)editors: Martin Dürst <duerst at it.aoyama.ac.jp>
Michel SUIGNARD <Michel at suignard.com>
[MAILTO-RFC] Mailto: URI scheme
[MAILTO-BIS] In preparation
(other) editors (including) Martin Dürst (duerst at it.aoyama.ac.jp)
discussion on: uri at w3.org
[URI-RFC] URI spec
mailing list: uri at w3.org
(other) editors: Roy Fielding <fielding at gbiv.com>, Tim Berners-Lee <timbl at w3.org>
URI guidelines: policies and procedures for registering new URI schemes
editors: Tony Hansen <tony at att.com>
mailing list for URI review: uri-review at ietf.org
[HTML5-CURRENT] HTML5 definition of "URLs"
[WEBADDRESS] Attempt to split out "Web Address" component:
[HTML-WG] W3C Working Group
URL/IRI issue: http://www.w3.org/html/wg/tracker/issues/56
chairs: Paul Cotton <paul.cotton at microsoft.com>
Maciej Stachowiak <mjs at apple.com>
Sam Ruby <rubys at intertwingly.net>
editor: Ian Hickson <ian at hixie.ch>
Other interested groups:
IETF Applications area
mailing list: apps-discuss at ietf.org
area directors: Lisa Dusseault <lisa.dusseault at gmail.com>;
Alexey Melnikov <alexey.melnikov at isode.com>
W3C TAG (architectural issue around URIs in W3C specs)
mailing list: www-tag at w3.org
chair: Noah Mendelsohn <noah_mendelsohn at us.ibm.com>
(Have I missed any groups, specs? I'll update this list
and set it up somewhere)
Some design goals:
I’ve tried to write down some of the design goals which I think are important; these may be in conflict, but I've tried to propose priorities which make sense to me. Does anyone disagree with any of these? Think some are missing?
Consistent Terminology: Multiple definitions of the same terms in different documents are to be avoided; even consistent definitions are problematic. Where possible, newer documents should reference older specs.
Security: Avoiding security problems (e.g., difficulties due to spoofing, renaming, misuse of DNS) is a high priority; avoiding security problems is a higher priority than being consistent with existing applications.
Uniform behavior: Optional interpretation rules for resource identifiers which give different results depending on the processing model chosen are to be avoided.
Consistency of web and other Internet applications: Interoperability between web applications (browsers, proxies, spiders, etc.) and other Internet applications which use resource identifiers (email, directory services) is important, and should be given the same (or nearly the same) priority as interoperability between web browsers. Recommended practice for web applications and other Internet applications should be the same – those creating web content should not be encouraged to create Resource Identifiers (whether called URLs, URIs, IRIs, Web Addresses) which would not function in other applications.
Consistency of specifications with implementations: When existing specifications do not match the common practice of existing applications, it is appropriate to update the existing specification, even if long standing.
Improve interoperability: When existing implementations disagree, document existing practice, but recommend (normatively) the behavior that will best lead to improved interoperability.
Separate “specification of what a conservative producer should send” from “advice for what a liberal consumer should accept”: for robustness, the specification of what a “conforming” producer of resource identifiers should send can be (if necessary) more restrictive than the specification of what some common applications accept.
Minimize options and specifications: The split between URI and IRI as separate protocol elements was unfortunate: it left two separate normative terms, “URI” and “IRI”, to describe two variations of “resource identifiers”, and having unnecessary multiple non-terminals and terms is harmful. Adding additional terms such as “LEIRI”, “Web Address” or HREF should be avoided, if at all possible. (In some ways, “URI” was the term used to unify “URL” and “URN”.)
Unless necessary for other reasons above, avoid making existing, conforming, and widely implemented behavior non-conforming: Applications which accept URIs but not IRIs should not be made “non-conforming” by a redefinition of terms.
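As a rough illustration of the producer/consumer separation listed among the goals above, here is a minimal Python sketch. The function names (is_conservative, accept_liberally) and the character-level rule are assumptions made for this example only; it checks individual characters against what RFC 3986 permits and is not a full grammar check, nor the behavior of any particular specification or browser.

```python
import re
from urllib.parse import quote

# Strict (producer) rule, assumed for this sketch: only characters that
# RFC 3986 permits in a URI, with "%" always followed by two hex digits.
_STRICT = re.compile(r"^(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*$")

# Characters the liberal path leaves untouched; "%" is included so that
# escapes already present in the input are not encoded a second time.
_SAFE = "!#$%&'()*+,/:;=?@[]~-._"


def is_conservative(identifier: str) -> bool:
    """Return True if the string uses only characters a strict producer may send."""
    return bool(_STRICT.match(identifier))


def accept_liberally(identifier: str) -> str:
    """Tolerate common sloppiness: trim surrounding whitespace, then
    percent-encode spaces and non-ASCII characters as UTF-8."""
    cleaned = identifier.strip()
    return quote(cleaned, safe=_SAFE, encoding="utf-8")


if __name__ == "__main__":
    raw = "  http://example.com/a b\n"
    print(is_conservative(raw))      # what a producer should not send
    print(accept_liberally(raw))     # what a consumer might make of it
```

The point of keeping two functions is that the strict rule can stay narrow while the tolerant path absorbs what is commonly seen in content, without the tolerant behavior redefining what counts as conforming.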
Some issues (I'm sure there are many more)
• Can IRI -> URI transformation be scheme independent? ((optional processing would allow
non-uniform behavior and would not meet IDNA requirements))
• Use of term “URL” ((ambiguous terms))
• handling of extra processing rules for XML vs HTML5 vs. IRI document
• Whether HTML5 references anything other than IRIbis
• Updating the URI scheme registry to be clear that "URI schemes" are the same as “URL schemes”
and “IRI schemes”
• Can different URI schemes allow different I18N processing rules?
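To make the first and last issues above concrete, the following Python sketch contrasts two possible IRI -> URI mappings: one that percent-encodes every component as UTF-8 without looking at the scheme, and one that treats the host specially by applying IDNA. Both function names are hypothetical, userinfo is ignored, and Python's built-in "idna" codec implements IDNA 2003 (RFC 3490), not whatever IDNABIS produces, so this is only an approximation of the question being asked.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters left untouched; "%" is included so existing escapes survive.
_SAFE = "!#$%&'()*+,/:;=?@[]~-._"


def _percent_encode(component: str) -> str:
    """Percent-encode non-ASCII (and disallowed ASCII) characters as UTF-8."""
    return quote(component, safe=_SAFE, encoding="utf-8")


def iri_to_uri_generic(iri: str) -> str:
    """Scheme-independent mapping: every component, including the host,
    is percent-encoded the same way."""
    return _percent_encode(iri)


def iri_to_uri_idna(iri: str) -> str:
    """Host-aware mapping: the host goes through IDNA (2003, via the
    standard codec); path, query and fragment are percent-encoded.
    Assumption for this sketch: no userinfo in the authority."""
    parts = urlsplit(iri)
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((
        parts.scheme,
        host,
        _percent_encode(parts.path),
        _percent_encode(parts.query),
        _percent_encode(parts.fragment),
    ))


if __name__ == "__main__":
    iri = "http://bücher.example/straße?q=ü"
    print(iri_to_uri_generic(iri))
    print(iri_to_uri_idna(iri))
```

For the sample IRI the two results differ only in the host: the generic mapping yields b%C3%BCcher.example, the host-aware one yields xn--bcher-kva.example. That difference is exactly the non-uniform behavior the issue list is concerned about.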