IDNA, IRI, HTML5 coordination

Larry Masinter masinter at
Wed Sep 16 17:43:17 CEST 2009

Goal: bring together and coordinate the definitions
of what is used for resource identification in the web and elsewhere
(IRIs as the evolution of URL, URI, IRI, HREF, Web Address, etc.)
within W3C, IETF and their specifications. See "design goals" below.

Goal of this message: lay out the concerned groups, start discussion
of process for coordination.

I've bcc'd everyone except the public-iri at mailing list,
archive as the
list proposed for discussion:

My suggestion for how to get all of these groups to coordinate
is to start an IETF working group with a charter to bring these
specifications into alignment. I can't think of any other process
which can accomplish the goal.

PLEASE, PLEASE: if you're going to post an opinion, please at least
cc public-iri at and try to keep discussion there.

PLEASE: Separate 'process' issues (should there be a working group?
Who else needs to be involved? What's the timing and when?) from
technical issues.



(Incomplete) list of specifications, groups, chairs, editors: 


[HTTPBIS-URI] HTTP URI scheme def in HTTPBIS draft:
[HTTP-RFC] current HTTP URI scheme definition
      in RFC 2616
    mailing list: ietf-http-wg at,
    chair:   Mark Nottingham <mnot at>
    editors: Roy Fielding <fielding at>,
             Julian Reschke <julian.reschke at>, (others)
[IDNABIS-*] definitions, policies, standards for how Internationalized 
      Domain Names should be handled in Internet applications 
[IDNABIS-WG]  IETF IDNABIS working group
      chair: Vint Cerf <vint at>
      editor: John C Klensin <klensin at>

[IRIBIS-6] Revision under preparation:
[IRIBIS-LMM] ("Experimental" draft attempting to satisfy IDNABIS and HTML requirements)
     discussion on: public-iri at (among others)
     (other)editors: Martin Dürst <duerst at>
                     Michel SUIGNARD <Michel at>

Mailto URI:

[MAILTO-RFC] Mailto: URI scheme
[MAILTO-BIS] In preparation
   (other) editors (including) Martin Dürst (duerst at
   discussion on: uri at


[URI-RFC] URI spec
   mailing list: uri at 
   (other) editors: Roy Fielding <fielding at>, Tim Berners-Lee <timbl at>
      URI guidelines: policies and procedures for registering new URI schemes
     editors: Tony Hansen <tony at>
     mailing list for URI review: uri-review at 

[HTML5-CURRENT]   HTML5 definition of "URLs"
[WEBADDRESS] Attempt to split out "Web Address" component:
[HTML-WG] W3C Working Group
     URL/IRI issue:
     chairs: Paul Cotton <paul.cotton at>
             Maciej Stachowiak <mjs at>
             Sam Ruby <rubys at>
     editor: Ian Hickson <ian at>

Other interested groups:

IETF Applications area
      mailing list: apps-discuss at
      area directors: Lisa Dusseault <lisa.dusseault at>; 
          Alexey Melnikov <alexey.melnikov at>

W3C TAG (architectural issue around URIs in W3C specs)
     mailing list: www-tag at
     chair: Noah Mendelsohn <noah_mendelsohn at>


(Have I missed any groups, specs? I'll update this list
and set it up somewhere)

Some design goals:

I’ve tried to write down some of the design goals which I think are important; these may be in conflict, but I've tried to propose priorities which make sense to me. Does anyone disagree with any of these? Think some are missing?

Consistent Terminology: Multiple definitions of the same terms in different documents are to be avoided; even consistent definitions are problematic. Where possible, newer documents should reference older specs.

Security: Avoiding security problems (e.g., difficulties due to spoofing, renaming, misuse of DNS) is a high priority; avoiding security problems is a higher priority than being consistent with existing applications.

Uniform behavior: Optional interpretation rules for resource identifiers which give different results depending on the processing model chosen are to be avoided.

Consistency of web and other Internet applications: Interoperability between web applications (browsers, proxies, spiders, etc.) and other Internet applications which use resource identifiers (email, directory services) is important, and should be given priority equal (or nearly equal) to that of interoperability between web browsers. Recommended practice for web applications and other Internet applications should be the same – those creating web content should not be encouraged to create Resource Identifiers (whether called URLs, URIs, IRIs, Web Addresses) which would not function in other applications.

Consistency of specifications with implementations:  When existing specifications do not match the common practice of existing applications, it is appropriate to update the existing specification, even if long standing.

Improve interoperability: When existing implementations disagree, document existing practice, but recommend (normatively) the behavior that will best lead to improved interoperability.

Separate “specification of what a conservative producer should send” from “advice for what a liberal consumer should accept”: for robustness, the specification of what a “conforming” producer of resource identifiers should produce can be (if necessary) more restrictive than the specification of what some common applications accept.
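
To make the producer/consumer split concrete, here is a minimal sketch in Python. The function names (is_conforming, accept_liberally) are made up for illustration, and the character set is my reading of the RFC 3986 unreserved and reserved characters plus "%"; it is not taken from any of the drafts listed above, and it checks characters only, not the full URI grammar.

```python
import re
from urllib.parse import quote

# Assumption: unreserved + reserved characters of RFC 3986, plus "%" for escapes.
_URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")
# A "%" that is not followed by two hex digits is a malformed escape.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def is_conforming(uri: str) -> bool:
    """Strict, producer-side check: allowed characters and well-formed escapes only."""
    if _URI_CHARS.fullmatch(uri) is None:
        return False
    return _BAD_ESCAPE.search(uri) is None


def accept_liberally(raw: str) -> str:
    """Lenient, consumer-side repair: trim whitespace, percent-encode the rest as UTF-8."""
    return quote(raw.strip(), safe="-._~:/?#[]@!$&'()*+,;=%")


if __name__ == "__main__":
    sloppy = "  http://example.org/a b  "
    print(is_conforming(sloppy))            # the strict check rejects it
    print(accept_liberally(sloppy))         # the lenient path repairs it
```

The point of keeping two functions is that the strict one can be what a specification calls "conforming", while the lenient one is only advice for consumers and does not widen the definition.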

Minimize options and specifications: The split between URI and IRI as separate protocol elements was unfortunate: it left two separate normative terms, “URI” and “IRI”, to describe two variations of “resource identifiers”, and having unnecessary multiple non-terminals and terms is harmful. Adding additional terms such as “LEIRI”, “Web Address” or HREF should be avoided, if at all possible. (In some ways, “URI” was the term used to unify “URL” and “URN”.)

Unless necessary for other reasons above, avoid making existing, conforming, and widely implemented behavior non-conforming: Applications which accept URIs but not IRIs should not be made “non-conforming” by a redefinition of terms.


Some issues (I'm sure there are many more)

•	Can IRI -> URI transformation be scheme independent? ((optional processing would allow
      non-uniform behavior, not meet IDNA requirements))
•	Use of term “URL”  ((ambiguous terms))
•	handling of extra processing rules for XML vs HTML5 vs. IRI document
•	Whether HTML5 references anything other than IRIbis
•	Updating the URI scheme registry to be clear that "URI schemes" are the same as “URL schemes” 
      and “IRI schemes”
•	Can different URI schemes allow different I18N processing rules?
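
To illustrate the first issue in the list above, here is a rough Python sketch of two ways an IRI might be turned into a URI. Both function names are hypothetical. The first applies one rule everywhere (percent-encode non-ASCII characters as UTF-8); the second treats the host specially and runs it through Python's built-in "idna" codec, which implements the older IDNA 2003 rules, not IDNABIS. The second function also assumes the authority is a bare host name with no userinfo or port.

```python
from urllib.parse import quote, urlsplit


def iri_to_uri_generic(iri: str) -> str:
    """Scheme-independent mapping: percent-encode every non-ASCII character as UTF-8."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in iri)


def iri_to_uri_idna_host(iri: str) -> str:
    """Component-aware mapping: host via IDNA ToASCII, everything else percent-encoded.

    Assumption: the authority is only a host name (no userinfo, no port).
    """
    parts = urlsplit(iri)
    # Python's "idna" codec follows IDNA 2003; used here only as a stand-in.
    ascii_host = parts.netloc.encode("idna").decode("ascii")
    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    if parts.fragment:
        rest += "#" + parts.fragment
    return f"{parts.scheme}://{ascii_host}{iri_to_uri_generic(rest)}"


if __name__ == "__main__":
    iri = "http://bücher.example/straße"
    print(iri_to_uri_generic(iri))    # http://b%C3%BCcher.example/stra%C3%9Fe
    print(iri_to_uri_idna_host(iri))  # http://xn--bcher-kva.example/stra%C3%9Fe
```

The two outputs differ only in the host, which is exactly the non-uniform behavior the issue is worried about: the result depends on which processing model the application picks.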
