Implementation questions (digressing from...)

Shawn Steele Shawn.Steele at microsoft.com
Wed Dec 24 03:26:10 CET 2008


Oh, and if we could figure out a way to provide a preferred "display" name, that might also help resolve the native digits display problem.

- shawn

-----Original Message-----
From: Shawn Steele
Sent: Tuesday, December 23, 2008 18:20
To: 'idna-update at alvestrand.no'
Cc: 'erikv at google.com'
Subject: Re: Implementation questions (digressing from...)

I think that if the A-Label is in the HTML, then the browser would need to be able to round-trip it through Unicode.  Additionally, it seems that if an HTML author could make it look like ß, then it would be likely that a user would enter a ß, so I don't think that'd help much.

--

I'm not a DNS expert, but I can think of a couple of ways to return a "display" name, although I don't think they're great.  The PTR record would be obvious, and a CNAME could point to an A record of a Punycode display form.  The problem is that you end up returning (or resolving) a name that is effectively illegal under strict IDNA2008, and you expect someone else to use the same mapping to resolve it.  There might be better ways, but they'll still have the mapping problem, since the mapping itself is what's causing the problem.

For ASCII domains, case is preserved, so ASCII domains can get this behavior.  Punycode also allows preserving a casing bit, but round tripping is problematic for things like the eszett, and other mappings.
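To make the round-trip point concrete, here is a minimal sketch using Python's built-in "idna" codec, which follows the IDNA2003 rules (it is not an IDNA2008 implementation). The helper names to_ascii_2003 and to_unicode_2003 are made up for this illustration, and "xn--strae-oqa" is the Punycode form of a label that keeps the eszett.

```python
from typing import Optional

# Sketch only: Python's built-in "idna" codec applies IDNA2003 (RFC 3490)
# with nameprep. Helper names are hypothetical, chosen for illustration.


def to_ascii_2003(name: str) -> str:
    """Convert a domain name to ASCII form using the IDNA2003 mapping."""
    return name.encode("idna").decode("ascii")


def to_unicode_2003(name: str) -> Optional[str]:
    """Convert an ASCII domain name to Unicode form.

    Returns None when the codec rejects the name, for example because a
    label does not survive the IDNA2003 round-trip check.
    """
    try:
        return name.encode("ascii").decode("idna")
    except UnicodeError:
        return None


if __name__ == "__main__":
    # Pure ASCII labels pass through unchanged, so their case survives.
    print(to_ascii_2003("Example.COM"))

    # The eszett is mapped to "ss", so the original spelling is lost.
    print(to_ascii_2003("Straße.de"))
    print(to_unicode_2003("strasse.de"))

    # An A-label that encodes the eszett directly fails the IDNA2003
    # round-trip check, so this helper returns None for it.
    print(to_unicode_2003("xn--strae-oqa.de"))
```

Once the mapping has been applied, nothing in the resulting ASCII name records which spelling the owner preferred, which is why a separate "display" form keeps coming up.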

- Shawn


Excerpted from Erik's Mail:

> From a browser perspective, it seems that if I encountered an
> eszett I'd have to use the 2008 rules, and if those don't succeed,
> fall back to the 2003 rules.

There is another alternative, and I wonder what you think about it. If
a browser will continue to "pre-process" even after adopting IDNA2008,
then eszett is always mapped to ss and the only way for an HTML author
to actually include an eszett is to insert it into a U-label and then
compute the IDNA2008 A-label from it, inserting that A-label into the
HTML. One rather important advantage of this approach is that neither
browsers nor crawlers need to ever do two DNS lookups for a single
domain name. Of course, one of the disadvantages is that one cannot
have a raw eszett in the HTML and expect the browser to refrain from
mapping to ss. At the moment, my opinion is that the stated advantage
outweighs the stated disadvantage. (But there may be other advantages
and disadvantages.)
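As a rough illustration of that alternative, the sketch below contrasts the two paths an HTML author would have: a raw eszett, which an IDNA2003-style pre-processing step maps to "ss", and a hand-computed A-label, which passes through pre-processing untouched. It assumes Python's built-in "idna" (IDNA2003) and "punycode" codecs; the helper names are hypothetical, and the A-label helper performs none of the IDNA2008 validity checks a real implementation would need.

```python
# Sketch only, under the assumptions stated above. Helper names are
# hypothetical and exist only for this illustration.

ACE_PREFIX = "xn--"


def preprocess_2003(name: str) -> str:
    """IDNA2003-style pre-processing of a whole domain name.

    Non-ASCII labels are mapped and encoded; ASCII labels, including
    labels that are already A-labels, are passed through unchanged.
    """
    return name.encode("idna").decode("ascii")


def a_label_from_u_label(u_label: str) -> str:
    """Punycode-encode a single U-label and add the ACE prefix.

    No IDNA2008 validity checks are performed here.
    """
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    # Raw eszett in the HTML: pre-processing maps it to "ss".
    print(preprocess_2003("straße.de"))

    # Author computes the A-label up front and puts that in the HTML.
    a_label = a_label_from_u_label("straße")
    print(a_label)

    # The A-label is plain ASCII, so pre-processing leaves it alone and
    # the client has exactly one name to look up.
    print(preprocess_2003(a_label + ".de"))
```

Either way the client ends up with a single ASCII name per link, which is the "one DNS lookup" property described above.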

> Just to randomize the conversation:  I can see where ß and ss
> can differ linguistically, but in practice I can't see how they can
> resolve to different domains.  <CrazyIdea>So it seems like I (as
> a domain owner) need the distinction primarily for display, not
> for resolution.  In other words: how about allowing PTR records
> or maybe a special CNAME or something that resolves a name
> to its preferred display form, undoing any mappings that were
> encoded?</CrazyIdea>

I kinda like where this might be headed. I would hope that the display
preference could be returned together with the IP address, and that it
would use a mechanism that is guaranteed to be passed all the way back
to the client. Would either of your suggestions (PTR, CNAME) provide
this?

Erik


