Mapping other Digits to 0-9

Shawn Steele Shawn.Steele at microsoft.com
Fri Dec 5 19:56:31 CET 2008


> Those concerns, especially the former, lead me to believe that
> imposing a protocol restriction to ensure homogeneous digits in
> combination with Arabic script is probably a good idea

It isn't just the Arabic script.  We can replace 0-9 with "native digits" during rendering for any set of native digits, depending on user settings.

I could see where normalizing all digits to 0-9 would help lookup, but what about reverse lookup?  Presumably anyone using native digits would want them displayed in their domain name, but there's no way to programmatically figure out whether 0-9 should be replaced with native digits for display within a URL.  Given the multilingual nature of the internet, even if the rendering engine did decide that native digits were appropriate, how would it know that it was picking the correct set of native digits?
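To make the lookup/reverse-lookup asymmetry concrete, here is a rough Python sketch.  It assumes "digit" means any character in Unicode general category Nd; the helper name to_ascii_digits and the sample labels are made up for illustration and are not taken from any IDNA draft.

```python
import unicodedata

def to_ascii_digits(label: str) -> str:
    """Replace every Unicode decimal digit (category Nd) with its 0-9 equivalent."""
    out = []
    for ch in label:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal() returns the digit's integer value (0..9).
            out.append(str(unicodedata.decimal(ch)))
        else:
            out.append(ch)
    return "".join(out)

# Hypothetical labels: the same name written with two different digit sets.
arabic_indic = "shop\u0661\u0662\u0663"  # ARABIC-INDIC DIGITS ONE, TWO, THREE
devanagari = "shop\u0967\u0968\u0969"    # DEVANAGARI DIGITS ONE, TWO, THREE

# Both labels collapse to the same ASCII form, so the original digit set is lost.
assert to_ascii_digits(arabic_indic) == to_ascii_digits(devanagari) == "shop123"
```

The forward mapping is trivial, but nothing in "shop123" says which digit set, if any, the registrant originally used.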

Just brainstorming, only two ideas come to mind:

* Allow people to register both variations (native & 0-9), and either expect registrars to reserve both or expect people to pay twice to register both.

* Force mapping all digits to 0-9, but provide a mechanism for round-tripping back to the original script.  I can't think of a way to do this, other than allowing reverse DNS to return a different Punycode label than the input label, kind of like how you might ask for example.com, but DNS may return Example.Com.  (The bit used for remembering casing when generating the Punycode might be useful, but it's only one bit and there are too many digit variations.)
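As a rough illustration of why both ideas are awkward, the Python sketch below generates native-digit variants of one label and Punycode-encodes each.  The NATIVE_ZERO table, the helper names, and the label "shop123" are assumptions made up for this example; the table covers only three of the many digit sets, and the "xn--" prefix is added by hand rather than by a full IDNA implementation.

```python
# Hypothetical table of DIGIT ZERO code points for a few scripts (far from exhaustive).
NATIVE_ZERO = {
    "arabic-indic": 0x0660,
    "extended-arabic-indic": 0x06F0,
    "devanagari": 0x0966,
}

def native_variant(label: str, script: str) -> str:
    """Rewrite the ASCII digits in label using the chosen script's digits."""
    zero = NATIVE_ZERO[script]
    return "".join(
        chr(zero + ord(ch) - ord("0")) if "0" <= ch <= "9" else ch
        for ch in label
    )

def ace(label: str) -> str:
    """Punycode-encode one label; pure-ASCII labels are returned unchanged."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

# One ASCII label plus one variant per script in the table.
labels = {"ascii": "shop123"}
labels.update({script: native_variant("shop123", script) for script in NATIVE_ZERO})

# Every variant encodes to a different label on the wire.
ace_labels = {name: ace(text) for name, text in labels.items()}
assert len(set(ace_labels.values())) == len(labels)
```

Each variant is a distinct label in the DNS, so the first idea means one registration per digit set, and the second needs some way to pick one of several variants, which a single case bit per character cannot express.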

- Shawn



More information about the Idna-update mailing list