looking up domain names with unassigned code points
Shawn.Steele at microsoft.com
Mon May 12 21:40:36 CEST 2008
> There is no way to require a client to do that (display Unicode form).
I wasn't very clear :). IE is "training" users that xn-- forms in the display are suspect and shouldn't be trusted. If IE were asked to display an xn-- name in the user's script that contained "new" characters, it would try to display it in Unicode, if it could resolve the name (which it can't for new characters). If the intent is for "new" characters to be usable without being discriminated against, then conversion to Unicode is necessary.
FWIW: In practice we're discussing the difference between Unicode 3.2 and Unicode 5 characters. IE "knows" the scripts and character properties of the newer characters, so its phishing logic will still behave as expected. It doesn't know how to handle IDN labels that contain characters newer than Unicode 3.2, though. Were IDN more current, the disparity wouldn't be very visible. If the Unicode repertoires of IDN and IE were similar, then only truly "new" characters would fall into this bucket (Unicode 6 or whatever), and then IE would start failing the phishing checks, since it couldn't find Unicode data and script data for the Unicode 6 code points, regardless of how it handled IDN names.
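To make the version gap concrete, here is a rough Python sketch of the kind of decision I'm describing. It is not IE's actual logic, just an illustration under stated assumptions: the function names are made up, it uses Python's built-in Unicode 3.2 table (unicodedata.ucd_3_2_0) as a stand-in for "what an IDNA2003 implementation knows", and it ignores nameprep, script mixing and all the real phishing heuristics.

```python
import unicodedata

ACE_PREFIX = "xn--"

# Python ships a frozen copy of the Unicode 3.2 database, the version
# that IDNA2003 is tied to. It stands in here for the client's IDN tables.
ucd32 = unicodedata.ucd_3_2_0


def is_assigned_in_unicode_3_2(ch):
    """True if the code point was assigned in Unicode 3.2 (category is not Cn)."""
    return ucd32.category(ch) != "Cn"


def display_form(label):
    """Hypothetical helper: pick the form of one label to show the user.

    Labels without the ACE prefix are returned unchanged. ACE labels are
    shown in Unicode only if every decoded code point is assigned in
    Unicode 3.2; otherwise the xn-- form is kept.
    """
    if not label.lower().startswith(ACE_PREFIX):
        return label
    try:
        decoded = label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except (UnicodeError, ValueError):
        # Not valid Punycode: leave the label as it was given.
        return label
    if all(is_assigned_in_unicode_3_2(ch) for ch in decoded):
        return decoded
    return label
```

With this sketch, display_form("xn--bcher-kva") comes back as the Unicode label "bücher", while a label containing U+0242 (added after Unicode 3.2) stays in its xn-- form, which is exactly the form users are being taught to distrust.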