HTML requires URIs (Re: IDNAbis compatibility)

Harald Alvestrand harald at alvestrand.no
Wed Apr 4 16:31:30 CEST 2007


Mark Davis wrote:
>
>
> On 3/15/07, John C Klensin <klensin at jck.com> wrote:
>
> [snip]
>
>     I'm trying to understand this experiment.  Normally, an href
>     that "uses IDNA" would have Punycode labels (A-labels) in its
>     domain names. 
>
>
> I don't know the basis for saying that this would be the "normal" 
> usage. There isn't anything in IDNA2003, unless I'm missing something, 
> that requires or even suggests that it is not perfectly fine to have:
>
> <a href="http://ÖBB.at">Österreichische Bundesbahn</a>
The text that requires it is in the HTML spec.

All versions of HTML that claim that the stuff inside an "href=" is a 
URI (rather than an IRI) implicitly claim that the domain name is in 
A-label form.
I'm not up to date on HTML updates, but RFC 3987 was published in 
January 2005, so all versions older than that (including HTML 4.01) 
referenced URIs.
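
To make the two forms concrete, here is a minimal sketch using Python's
built-in "idna" codec, which implements IDNA2003 (RFC 3490). The function
names are made up for this illustration and are not taken from any spec.

```python
def to_a_label(host: str) -> str:
    """Return the ASCII (A-label) form of a host name per IDNA2003."""
    # The codec applies Nameprep and Punycode to each non-ASCII label.
    return host.encode("idna").decode("ascii")


def to_u_label(host: str) -> str:
    """Return the Unicode form of a host name given in A-label form."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_host = to_a_label("ÖBB.at")
    print(ascii_host)              # the form a URI-only href would carry
    print(to_u_label(ascii_host))  # Unicode form, lowercased by Nameprep
```

Note that the round trip does not give back "ÖBB.at" exactly: Nameprep
case-folds the label, so the Unicode form comes back in lowercase.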

After a little searching, I found that details on the recommended way of 
handling those errors are in 
<http://www.w3.org/TR/html401/appendix/notes.html#non-ascii-chars> - but 
they're still errors.
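
As I read that appendix, the recommended recovery is to represent each
non-ASCII character as UTF-8 and escape the resulting bytes as %HH. A rough
sketch of that in Python follows; the function name is hypothetical, and
this only illustrates the escaping step, not what any particular browser
does.

```python
from urllib.parse import quote


def escape_non_ascii(value: str) -> str:
    """Percent-escape the UTF-8 bytes of every non-ASCII character."""
    # ASCII characters are passed through untouched; only characters
    # outside ASCII are converted to UTF-8 and escaped as %HH.
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")
        for ch in value
    )


if __name__ == "__main__":
    print(escape_non_ascii("http://ÖBB.at"))
```

The result is an ASCII-only string, but the host in it is neither an
A-label nor something DNS will resolve as written.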

Do you have stats on how many of the 831,000 cases you identified were 
in A-label form rather than "possibly conformant U-label" form? That 
would tell us something about how closely the standards are adhered to.
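
The kind of count I have in mind could be produced along these lines. This
is only a sketch under my own assumptions: the category names and the
function are invented here, the check looks at nothing but the host part
of the href, and a percent-escaped host ends up in the "ascii" bucket.

```python
from collections import Counter
from urllib.parse import urlsplit


def classify_host(href: str) -> str:
    """Classify the host of an href as 'a-label', 'non-ascii' or 'ascii'."""
    host = urlsplit(href).hostname or ""
    # Any label carrying the ACE prefix counts as A-label form.
    if any(label.startswith("xn--") for label in host.split(".")):
        return "a-label"
    # Raw non-ASCII characters: possibly a U-label, not valid in a URI.
    if any(ord(ch) > 127 for ch in host):
        return "non-ascii"
    return "ascii"


def tally(hrefs):
    """Count how many hrefs fall into each category."""
    return Counter(classify_host(h) for h in hrefs)


if __name__ == "__main__":
    sample = ["http://xn--bb-eka.at/", "http://ÖBB.at", "http://example.com/"]
    print(tally(sample))
```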

                      Harald



