Unicode 7.0.0, (combining) Hamza Above, and normalization
asmusf at ix.netcom.com
Sat Aug 9 02:49:04 CEST 2014
On 8/8/2014 11:35 AM, Shawn Steele wrote:
>> Computers are dumb, homographs are confusing for all the reasons we know, so our least bad solution is to forbid them even in places where they'd be linguistically harmless.
> (you confused me with "linguistically harmless". I read that as not damaging my language, yet forbidding characters is sort of linguistically damaging by definition).
First, to augment that:
The draft not only proposes to outlaw a correctly encoded spelling, but
also suggests that an incorrectly encoded string should be used as a
workaround, irrespective of whether users have access to that string on
their keyboards.
This is rather different from a fallback spelling, such as using "ss"
for "ß", or many similar cases. Established fallback spellings typically
use more basic letters, and often have an established history in the
user community. They are thus less "linguistically damaging" than the
case under discussion.
Second, I note that implicit in the wording "least bad" is the
acknowledgement that there is in fact a range of options. I am far from
convinced that "least bad" is the correct evaluation. Certainly it's the
"most restrictive" solution, but that alone doesn't make the most
Third, the degree of "context" available in a domain label is not fixed
at zero. That may be true for the root, but not for all other zones. For
zones where the labels are expected to be in the Fula language, a
blanket prohibition of the new code point is not necessarily "less bad"
than allowing a zone specific policy of prohibiting the other
More on that in a separate post.
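For concreteness, the difference between a blanket prohibition and a
zone-specific policy can be sketched roughly as follows. This is only an
illustration under stated assumptions: it assumes the two spellings at
issue are the atomic code point U+08A1 and the sequence U+0628 U+0654,
and the policy names and the function label_allowed are hypothetical,
not taken from the draft or from any registry's actual rules.

```python
import unicodedata

# Assumption: these are the two competing spellings under discussion.
ATOMIC = "\u08A1"          # ARABIC LETTER BEH WITH HAMZA ABOVE (Unicode 7.0.0)
SEQUENCE = "\u0628\u0654"  # ARABIC LETTER BEH + ARABIC HAMZA ABOVE


def same_after_nfc(a: str, b: str) -> bool:
    """Return True if the two strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def label_allowed(label: str, policy: str) -> bool:
    """Check a label against one of two hypothetical policies.

    "blanket":       forbid the new atomic code point everywhere.
    "zone-specific": allow the atomic code point, forbid the sequence.
    """
    nfc = unicodedata.normalize("NFC", label)
    if policy == "blanket":
        return ATOMIC not in nfc
    if policy == "zone-specific":
        # Simple substring test; a real check would also have to handle
        # other combining marks between the base letter and the hamza.
        return SEQUENCE not in nfc
    raise ValueError(f"unknown policy: {policy!r}")


# Normalization does not unify the two spellings, so a policy has to
# pick one of them explicitly.
print(same_after_nfc(ATOMIC, SEQUENCE))
print(label_allowed(ATOMIC, "zone-specific"), label_allowed(ATOMIC, "blanket"))
```

Either policy leaves exactly one spelling registrable in the zone; the
two differ only in which spelling that is.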