Preparation for IDNABIS Stockholm

Vint Cerf vint at google.com
Sat Jul 18 21:02:41 CEST 2009


Folks,



We must achieve consensus on our documents and submit them to the AD  
at or shortly after our WG meeting in Stockholm.



Here is where I think we are:



We have essential consensus on the five primary documents (rationale,
defs, bidi, tables and protocol). There are ongoing discussions about some
specifics in Tables. We are also discussing the language of the sixth
document (mapping). Bidi and Tables have expired as I-Ds and need to
be re-issued.



"Internationalized Domain Names for Applications (IDNA): Background,

  Explanation, and Rationale", John Klensin, 18-Jun-09,
  <draft-ietf-idnabis-rationale-10.txt>



  "Internationalized Domain Names in Applications (IDNA): Protocol",  
  John Klensin, 13-Jul-09, <draft-ietf-idnabis-protocol-13.txt>



  "Internationalized Domain Names for Applications (IDNA): Definitions  
  and Document Framework", John Klensin, 22-Jun-09,
  <draft-ietf-idnabis-defs-09.txt>



"An updated IDNA criterion for right-to-left scripts", H. Alvestrand,  
C. Karp, 30-Nov-2008,
  <draft-ietf-idnabis-bidi-03.txt>



"The Unicode code points and IDNA", Patrik Faltstrom, 22-Dec-2008,

  <draft-ietf-idnabis-tables-05.txt>





  "Mapping Characters in IDNA", Pete Resnick, Paul Hoffman, 3-Jul-09,

  <draft-ietf-idnabis-mappings-01.txt>







Assuming we want to make reference to the mapping document in
rationale (and perhaps in protocol), we need to decide how to do that.



We also need to modify the mapping document so that it performs the
sequence of mappings in an order that puts NFC mapping last and yet
still reduces user surprise. Editors Hoffman and Resnick have this
responsibility after the WG decides which steps are to be used and in
which order.



We have already concluded (at IETF 74 and subsequently) that the
strings that are actually registered must be in U-Label form (or
A-Label form or both, assuming equivalence under the bidirectional
punycoding mechanism). However these strings are arrived at, the
registrant should have a clear understanding of the actual domain
names that will be registered. I think this implies that there should
not be implicit mappings undertaken by the zone registrar (using the
term zone here in its most general sense, not just at TLD or SLD
levels) that might make it ambiguous exactly what domain name will be
found in the name server.
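
To illustrate the U-Label/A-Label equivalence mentioned above, here is a minimal sketch using only Python's built-in "punycode" codec. The helper names to_a_label and to_u_label are hypothetical, and the sketch performs none of the validity checks (Tables, bidi, contextual rules) that the protocol document calls for; it only shows the round trip.

```python
ACE_PREFIX = "xn--"  # prefix that marks an A-label


def to_a_label(u_label: str) -> str:
    """Encode a U-label as an A-label (illustrative; no validity checks)."""
    if u_label.isascii():
        return u_label  # pure-ASCII labels need no encoding
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an A-label back to its U-label (illustrative only)."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label  # not an A-label; return unchanged
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


label = "b\u00fccher"  # an already-normalized, lowercase label
a_label = to_a_label(label)
# The two forms are equivalent: decoding the A-label yields the U-label.
assert to_u_label(a_label) == label
```

If the registrant is shown both forms produced this way, there is no ambiguity about which name ends up in the zone.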



The procedure outlined in the mapping document is intended to provide
a means of reducing user surprise, in part by emulating, prior to
lookup, the case-insensitive behavior of the pure ASCII domain name
regime of the past. It is also intended to ensure that the labels
used in the lookup process have characters expressed in Unicode
Normalization Form C.
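
As a rough sketch of what such a pre-lookup mapping might look like, assuming only two steps (lowercasing, then NFC last): the function name map_label_for_lookup is hypothetical, and Python's str.lower() merely stands in for whatever case mapping the WG settles on. The actual steps and their order are still to be decided.

```python
import unicodedata


def map_label_for_lookup(label: str) -> str:
    """Illustrative two-step mapping: lowercase first, NFC last."""
    # Step 1 (assumption): emulate ASCII-style case insensitivity
    # by lowercasing the label.
    lowered = label.lower()
    # Step 2: normalize to Unicode Normalization Form C as the final step.
    return unicodedata.normalize("NFC", lowered)


# "U" followed by COMBINING DIAERESIS is lowercased and then composed
# into the single code point U+00FC.
mapped = map_label_for_lookup("BU\u0308CHER")
assert mapped == "b\u00fccher"
```

Note that str.lower() leaves U+00DF (sharp S) unchanged, whereas str.casefold() would map it to "ss"; which behavior is wanted is exactly the kind of choice the mapping document has to make.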



I would like to propose the following ideas for discussion:



1. Mapping should NOT be a MUST on lookup, to allow for the fact that  
a substantial range of transformations may take place from the time a  
possible DNS reference is input into an application to the point where  
a DNS query is made. This suggests that RFC 2119 conformant language  
could read “mapping MAY be performed on lookup.”



2. There has been discussion whether the mapping document should be
silent with regard to RFC 2119 language. There is a range of opinions.
Unless the WG comes to consensus on wording that meets the RFC 2119
requirements, the document will remain the same. There remains a
question of how the mapping document should be referred to from the
protocol document. The WG needs to come to consensus on that as well,
subject to the same requirement that the wording meet RFC 2119.



3. We need to finalize the Tables document. The WG mailing list has
been very active with discussions on this point. We must come to
closure at this meeting on the content of Tables. It would be
appreciated, and the chair will attempt to enforce, that we do not
re-open matters upon which the WG has already reached consensus.



Specifically, Eszett (sharp S) is PVALID, ZWJ and ZWNJ are CONTEXTJ,
TATWEEL is DISALLOWED, Hangul Jamo are DISALLOWED, and Final Sigma is
PVALID.



Regarding Tables, Mark Davis appears to have captured section F  
(exceptions) as follows:





PVALID: // would otherwise have been DISALLOWED

  00DF; PVALID     # LATIN SMALL LETTER SHARP S
  03C2; PVALID     # GREEK SMALL LETTER FINAL SIGMA
  06FD; PVALID     # ARABIC SIGN SINDHI AMPERSAND
  06FE; PVALID     # ARABIC SIGN SINDHI POSTPOSITION MEN
  0F0B; PVALID     # TIBETAN MARK INTERSYLLABIC TSHEG
  3007; PVALID     # IDEOGRAPHIC NUMBER ZERO



CONTEXTO: // would otherwise have been DISALLOWED

  00B7; CONTEXTO   # MIDDLE DOT
  0375; CONTEXTO   # GREEK LOWER NUMERAL SIGN (KERAIA)
  05F3; CONTEXTO   # HEBREW PUNCTUATION GERESH
  05F4; CONTEXTO   # HEBREW PUNCTUATION GERSHAYIM
  30FB; CONTEXTO   # KATAKANA MIDDLE DOT



CONTEXTO: // would otherwise have been PVALID

  U+002D; CONTEXTO   # HYPHEN-MINUS
  U+02B9; CONTEXTO   # MODIFIER LETTER PRIME
  U+0660; CONTEXTO   # ARABIC-INDIC DIGIT ZERO
  U+0661; CONTEXTO   # ARABIC-INDIC DIGIT ONE
  U+0662; CONTEXTO   # ARABIC-INDIC DIGIT TWO
  U+0663; CONTEXTO   # ARABIC-INDIC DIGIT THREE
  U+0664; CONTEXTO   # ARABIC-INDIC DIGIT FOUR
  U+0665; CONTEXTO   # ARABIC-INDIC DIGIT FIVE
  U+0666; CONTEXTO   # ARABIC-INDIC DIGIT SIX
  U+0667; CONTEXTO   # ARABIC-INDIC DIGIT SEVEN
  U+0668; CONTEXTO   # ARABIC-INDIC DIGIT EIGHT
  U+0669; CONTEXTO   # ARABIC-INDIC DIGIT NINE
  U+06F0; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ZERO
  U+06F1; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT ONE
  U+06F2; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT TWO
  U+06F3; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT THREE
  U+06F4; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FOUR
  U+06F5; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT FIVE
  U+06F6; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SIX
  U+06F7; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT SEVEN
  U+06F8; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT EIGHT
  U+06F9; CONTEXTO   # EXTENDED ARABIC-INDIC DIGIT NINE
  U+0483; CONTEXTO   # COMBINING CYRILLIC TITLO
  U+3005; CONTEXTO   # IDEOGRAPHIC ITERATION MARK



DISALLOWED: // would otherwise have been PVALID

  U+302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK
  U+302F; DISALLOWED # HANGUL DOUBLE DOT TONE MARK
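
For anyone who wants to check the exceptions mechanically, here is a small sketch that parses lines in the format used above into a lookup table. The function names parse_exceptions and exception_status are hypothetical, the SAMPLE text is just a few rows copied from the lists above, and the parser accepts code points with or without the "U+" prefix since both appear.

```python
from typing import Dict, Optional


def parse_exceptions(text: str) -> Dict[int, str]:
    """Parse 'CODEPOINT; STATUS # NAME' lines into {code point: status}."""
    table: Dict[int, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if ";" not in line:
            continue  # skip blank lines and group headings
        cp_field, status = (part.strip() for part in line.split(";", 1))
        if cp_field.upper().startswith("U+"):
            cp_field = cp_field[2:]  # tolerate the optional U+ prefix
        table[int(cp_field, 16)] = status
    return table


# A few rows copied from the lists above (not the complete section).
SAMPLE = """\
PVALID: // would otherwise have been DISALLOWED
  00DF; PVALID     # LATIN SMALL LETTER SHARP S
  00B7; CONTEXTO   # MIDDLE DOT
  U+002D; CONTEXTO   # HYPHEN-MINUS
  U+302E; DISALLOWED # HANGUL SINGLE DOT TONE MARK
"""

EXCEPTIONS = parse_exceptions(SAMPLE)


def exception_status(ch: str) -> Optional[str]:
    """Return the exception status for one character, or None if not listed."""
    return EXCEPTIONS.get(ord(ch))
```

A character that returns None here is not an exception and would take whatever status the general rules in Tables give it.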



In addition it has been proposed to DISALLOW the following vertical  
formatting characters:



U+3031: Lm: VERTICAL KANA REPEAT MARK
U+3032: Lm: VERTICAL KANA REPEAT WITH VOICED SOUND MARK
U+3033: Lm: VERTICAL KANA REPEAT MARK UPPER HALF
U+3034: Lm: VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
U+3035: Lm: VERTICAL KANA REPEAT MARK LOWER HALF
U+303B: Lm: VERTICAL IDEOGRAPHIC ITERATION MARK
U+07FA: Lm: NKO LAJANYALAN

I propose that we come prepared to resolve all of these matters on the  
first day of the IETF meeting (Monday). If there are other matters  
that WG members believe need to be addressed, please speak up!



See you in Stockholm!



Vint