looking up domain names with unassigned code points
Vint Cerf
vint at google.com
Sun May 11 16:36:17 CEST 2008
I think we should say nothing about display. John's focus is on
whether and how to do the lookup.
I agree with what I understand his two positions to be:
1. Just put the Punycode string into the DNS query opaquely.
OR
2. Do the conversion and handle the result as if the resulting Unicode
had been submitted.
Technical question:
If someone generates an arbitrary string of the form "xn--<random
sequence of lowercase a-z, 0-9 and hyphen>", does the decoding
algorithm ALWAYS produce a sequence of Unicode code points?
Note that I did not say a PVALID set of code points, or even ASSIGNED ones.
I am asking because I am wondering how a relatively simple-minded
implementation might look from the UI perspective.
If we always get a sequence of code points regardless of the sequence
of LDH characters, a simple-minded implementation could easily produce
gibberish when it attempts to convert a random sequence of LDH
characters (confining the letters to lowercase) back to Unicode.
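One way to probe this is to hand a label to an existing decoder and see
whether it yields code points or gives up. The sketch below assumes
Python's built-in "punycode" codec as a stand-in for the decoding step;
try_decode is a hypothetical helper name, and other implementations may
differ in which inputs they reject. The decoding procedure in RFC 3492
does include failure cases, so not every LDH string need decode.

```python
def try_decode(label):
    """Return the decoded code points of an "xn--" label, or None on failure.

    Hypothetical helper: it relies on Python's built-in "punycode" codec,
    which is only one implementation of the decoding step.
    """
    prefix = "xn--"
    if not label.startswith(prefix):
        return None
    try:
        text = label[len(prefix):].encode("ascii").decode("punycode")
    except UnicodeError:
        # The codec rejected the input, for example because a
        # variable-length integer was cut short or the label was not ASCII.
        return None
    return [ord(ch) for ch in text]


print(try_decode("xn--bcher-kva"))  # a well-known label that decodes cleanly
print(try_decode("xn--bcher-kv"))   # the same label with its last digit removed
```

A None result here says only that this particular codec refused the
input; a list of code points says nothing about whether they are PVALID
or even assigned.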
Is the following correct?
Let s be a random string of <lowercase a-z, 0-9, hyphen> prefixed by
"xn--".
Let ToUnicode be a function that maps s into Unicode.
Let ToASCII be a function that maps Unicode into Punycode.
Then s is valid Punycode if and only if s = ToASCII(ToUnicode(s)).
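The round-trip condition can be sketched as follows. This is a minimal
illustration, again assuming Python's built-in "punycode" codec for both
directions; is_round_trip is a hypothetical name, and the sketch applies
no nameprep, PVALID, or assignment checks, so it tests only the raw
encoding.

```python
def is_round_trip(s):
    """Check s == ToASCII(ToUnicode(s)) for a lowercase "xn--" label.

    Hypothetical helper: both directions use Python's "punycode" codec,
    with the "xn--" prefix stripped before decoding and restored after.
    """
    prefix = "xn--"
    if not s.startswith(prefix):
        return False
    try:
        # "ToUnicode": decode the part after the prefix.
        decoded = s[len(prefix):].encode("ascii").decode("punycode")
        # "ToASCII": re-encode and put the prefix back.
        reencoded = prefix + decoded.encode("punycode").decode("ascii")
    except UnicodeError:
        # Decoding failed, so there is nothing to compare against.
        return False
    return reencoded == s


print(is_round_trip("xn--bcher-kva"))
print(is_round_trip("xn--bcher-kv"))
```

A string that fails to decode is treated as invalid here, which is an
assumption about how the "if and only if" should read when ToUnicode is
not defined for s.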
I hope I haven't mangled the question too badly.
v