HTML requires URIs (Re: IDNAbis compatibility)
Harald Alvestrand
harald at alvestrand.no
Wed Apr 4 16:31:30 CEST 2007
Mark Davis wrote:
>
>
> On 3/15/07, John C Klensin <klensin at jck.com> wrote:
>
> [snip]
>
> I'm trying to understand this experiment. Normally, an href
> that "uses IDNA" would have Punycode labels (A-labels) in its
> domain names.
>
>
> I don't know the basis for saying that this would be the "normal"
> usage. There isn't anything in IDNA2003, unless I'm missing something,
> that requires or even suggests that it is not perfectly fine to have:
>
> <a href="http://ÖBB.at">Österreichische Bundesbahn</a>
That text is in the HTML spec.
All versions of HTML that claim that the stuff inside an "href=" is a
URI (rather than an IRI) implicitly claim that the domain name is in
A-label form.
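To make "A-label form" concrete, here is a minimal sketch using Python's built-in "idna" codec, which follows the IDNA2003 rules. The helper name to_a_label is made up for illustration; it is not part of any spec or library.

```python
def to_a_label(hostname: str) -> str:
    """Return the ASCII (A-label) form of a hostname under IDNA2003 rules."""
    # The stdlib "idna" codec runs nameprep and Punycode on each label
    # and leaves labels that are already plain ASCII unchanged.
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # The domain from the quoted example, as it would appear in a URI.
    print(to_a_label("ÖBB.at"))  # xn--bb-eka.at
```

With this, the href in the quoted example would read http://xn--bb-eka.at when written as a URI.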
I'm not up to date on HTML updates, but RFC 3987 (the IRI spec) was
published in January 2005, so all versions older than that (including
HTML 4.01) referenced URIs.
After a little searching, I found that details on the recommended way of
handling such errors (non-ASCII characters where a URI is expected) are in
<http://www.w3.org/TR/html401/appendix/notes.html#non-ascii-chars> - but
they're still errors.
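The convention recommended in that appendix is to represent each non-ASCII character in UTF-8 and escape the resulting bytes as %HH. A rough sketch of that recovery step follows; the function name escape_non_ascii is an assumption for illustration, and the code deliberately leaves all ASCII characters alone.

```python
from urllib.parse import quote


def escape_non_ascii(uri: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 bytes; keep ASCII as-is."""
    return "".join(
        ch if ord(ch) < 128 else quote(ch, safe="")  # quote() defaults to UTF-8
        for ch in uri
    )


if __name__ == "__main__":
    print(escape_non_ascii("http://ÖBB.at"))  # http://%C3%96BB.at
```

Note that the result is a percent-escaped string, not an A-label, so this is error recovery rather than a conformant way to write the domain name.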
Do you have stats on how many of the 831,000 cases you identified were
in A-label form rather than "possibly conformant U-label" form? That
would tell us something about how closely standards are adhered to.
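A count of that kind could be produced along these lines. This is only a sketch under assumptions: the function name host_form, the three category names, and the sample hrefs are all invented for illustration, and a non-ASCII host is merely flagged, not validated as a U-label.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_form(href: str) -> str:
    """Roughly classify the host of an href by how its labels are written."""
    host = urlsplit(href).hostname or ""
    if any(ord(ch) > 127 for ch in host):
        return "non-ascii"  # possibly a U-label; not checked for validity
    if any(label.startswith("xn--") for label in host.split(".")):
        return "a-label"
    return "ascii"


if __name__ == "__main__":
    # Hypothetical sample input, not data from the survey discussed above.
    hrefs = ["http://ÖBB.at", "http://xn--bb-eka.at", "http://example.com/"]
    print(Counter(host_form(h) for h in hrefs))
```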
Harald