IDNA comments
Frank Ellermann
hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com
Mon Jul 7 23:47:13 CEST 2008
Mark Davis wrote:
> "The string is converted from the local character set into Unicode,
> if it is not already Unicode.
I'd strike "if it is not already Unicode"; that this conversion can
be a no-op isn't the interesting point...
> The exact nature of this conversion is beyond the scope of this
> document, but may involve normalization, as described in Section
> 4.2."
s/may involve/involves/; that is the really interesting part, and it
also applies to Unicode input.
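
To illustrate why normalization matters even when the input is already
Unicode, here is a minimal sketch. It assumes NFC purely for
illustration (this message does not say which form Section 4.2 of the
draft requires), and the helper name to_normalized is hypothetical.

```python
import unicodedata

# Two Unicode spellings of the same visible character "ä".
precomposed = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS
combining = "a\u0308"    # "a" followed by COMBINING DIAERESIS

def to_normalized(label: str, form: str = "NFC") -> str:
    """Hypothetical helper: normalize a label that is already Unicode.

    NFC is an assumption made for this sketch, not a claim about the draft.
    """
    return unicodedata.normalize(form, label)

# The raw strings differ code point by code point...
same_before = precomposed == combining
# ...but compare equal once both have been normalized.
same_after = to_normalized(precomposed) == to_normalized(combining)

print(same_before, same_after)  # False True
```

Both inputs are "already Unicode", yet they only match after the
normalization step, so the step is not a no-op in general.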
> "IDNA uses the Unicode character repertoire, which avoids the
> significant delays that would be inherent in waiting for a
> different and specific character sets to be defined for IDN
> purposes, presumably by some other standards developing
> organization. " This is a very strange rationale
It is a summary of RFC 5242, but readers likely won't get this
subtle point. Maybe it could be trimmed to "because no
alternative exists for IDNA purposes".
> "Applications MAY allow the display and user input of A-labels,
> but are encouraged to not do so except as an interface for
> special purposes, possibly for debugging, or to cope with
> display limitations." There is widespread use of the A-Label
> to signal a possible spoof -- while you discuss that later,
> I think it's swimming against the tide not to mention it here.
Strong ACK: an application that shows me U-labels in a script I
can't read, without my prior consent, is broken. s/MAY/MUST/ or
similar. Sometimes applications have no idea which code points
can be displayed at all, and we should not invent new ?-attacks.
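
A rough sketch of such a display policy follows. It assumes Python's
built-in "idna" codec, which implements the older RFC 3490 rules and
not the draft under discussion; the function names and the
user_opted_in flag are hypothetical.

```python
def to_a_label(u_label: str) -> str:
    """Convert one U-label to its A-label form.

    Assumption: uses Python's built-in "idna" codec (RFC 3490 rules).
    """
    return u_label.encode("idna").decode("ascii")

def label_for_display(u_label: str, user_opted_in: bool) -> str:
    """Hypothetical policy: show the U-label only with prior consent.

    Without consent the A-label is shown, which any ASCII-capable
    display can render.
    """
    return u_label if user_opted_in else to_a_label(u_label)

print(label_for_display("bücher", user_opted_in=False))  # xn--bcher-kva
print(label_for_display("bücher", user_opted_in=True))   # bücher
```

A plain ASCII label passes through unchanged either way, so the policy
only affects labels the user might not be able to read or display.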
> "the two-character sequence "ae" is usually treated as a fully
> acceptable alternate orthography." Add: "for the "umlauted a"
> character".
s/usually/under certain conditions/, e.g., in US-ASCII RFCs,
s/a fully acceptable/an exceptionally acceptable/
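
Whatever readers think of "ae", the protocol side sees two unrelated
names. A small sketch, again assuming Python's built-in "idna" codec
(RFC 3490 rules, not the draft); the labels are made-up examples.

```python
import unicodedata

umlaut_label = "b\u00e4r"   # "bär", written with the umlauted a
digraph_label = "baer"      # the "ae" alternate orthography

# Assumption: Python's built-in "idna" codec (RFC 3490 rules).
umlaut_ascii = umlaut_label.encode("idna")
digraph_ascii = digraph_label.encode("idna")

# The umlauted label becomes an "xn--" A-label; the ASCII one is unchanged.
print(umlaut_ascii.startswith(b"xn--"))
print(digraph_ascii == b"baer")

# Neither the conversion nor Unicode normalization maps one onto the other.
labels_differ = umlaut_ascii != digraph_ascii
nfkc_keeps_umlaut = unicodedata.normalize("NFKC", umlaut_label) == umlaut_label
print(labels_differ, nfkc_keeps_umlaut)
```

So any "ae" equivalence would have to come from registration policy or
from the user, not from the label conversion.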
> Add, to show that we are not playing favorites, "Even the very
> common words in English like "can't" and "don't" are not allowed."
...won't work for me, I'm not sure that "don't" is a "word".
Of course English is favoured; pretending that it's not is a
waste of everybody's time. And it starts to get surreal if
folks say "Roman alphabet" when they mean "US-ASCII letters"
for political correctness; nothing is wrong with "w" and "u".
Frank