Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft)
JFC Morfin
jefsey at jefsey.com
Tue Jan 22 23:36:34 CET 2008
At 19:59 22/01/2008, John C Klensin wrote:
>When we can avoid it, I find it helpful to avoid thinking about
>and debating individual characters. Instead, let's focus on
>principles,
Dear John,
your analysis seems to be correct except on one point that Michael
raised. You talk of "characters" but do not define what a
"character" is. It seems it can be:
- either a visual item (Michael),
- or a registered DNS item (you),
- or a Unicode code point (IDNA).
If you do not say:
- what a character is,
- at what layer language (and therefore semantic) issues are dealt with,
we will remain confused, with different forms of layer violation
depending on who is speaking.
As far as I am concerned:
1. a "character" is a set of visual glyphs that are registered in
the DNS in the same way.
- The way they are displayed (as initial, middle or last character,
in upper case, small capitals or lower case) is irrelevant.
- The script they belong to is irrelevant.
2. language-related issues are semantic and do not belong to the
layer of IETF responsibility. However, nothing should prevent them from
being restored at the application layer, so that the distinctions made by
Michael can be respected (Word is able to restore upper case at the
beginning of a sentence, etc.). The IETF does not deal with artists,
graphic designers, lawyers, etc. but with computers, which in turn deal
with them.
3. because ccTLD tables can include characters that use the same sign as
other tables, they are a working basis, but the semiologic sign code
is not their concatenation (we would meet the same problem as with Unicode).
4. there is a possibility of retaining most of Unicode at the price of
complexity. It is to use classes (which can be IDNA classes), to be
identified in one way or another, with class-local rules. This makes
IDNA more complex, but possibly faster to implement.
5. every solution must be fool-proof and phishing-proof at every DNS level.
This means that the way people and word processors write, print or display
the characters is orthogonal to domain name labels.
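
To illustrate point 1, here is a minimal Python sketch of what "registered
in the same DNS way" could mean for the sigma case in the subject line. It is
only an approximation under my own assumptions: the helper name
registered_form is hypothetical, and combining Python's casefold() with NFKC
normalization is a stand-in, not the mapping IDNA defines.

```python
import unicodedata

def registered_form(label: str) -> str:
    """Hypothetical helper: reduce display variants of a label to one key."""
    # casefold() removes case distinctions (final sigma folds to sigma);
    # NFKC then folds presentation forms such as Arabic positional shapes.
    return unicodedata.normalize("NFKC", label.casefold())

variants = [
    "\u03a3",  # GREEK CAPITAL LETTER SIGMA
    "\u03c3",  # GREEK SMALL LETTER SIGMA
    "\u03c2",  # GREEK SMALL LETTER FINAL SIGMA
]
keys = {registered_form(v) for v in variants}
print(len(keys))  # all three display forms share one registered form

# Initial, medial and final presentation forms of ARABIC LETTER BEH
positional = ["\ufe91", "\ufe92", "\ufe90"]
print({registered_form(p) for p in positional} == {"\u0628"})
```

The display layer stays free to choose capital, small or final sigma; only
the folded key would matter for registration and lookup.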
jfc