M-Label or MVALID, and dangers with mappings?

Patrik Fältström patrik at frobbit.se
Sat Apr 11 08:26:34 CEST 2009


Let me just throw out some thoughts that are in my head. I just
cannot get rid of them.

1. Stability / Storage

I think it is good that we talk about M-Label, as we have been talking
about A-Labels and U-Labels, and we (specifically myself) *REALLY* like
the fact that A-Label and U-Label are defined terms, that we have a 1:1
mapping between them, and that they are stable. It makes it possible to
reference them explicitly in other specifications.
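
To make the 1:1 relationship concrete, here is a minimal sketch using
the "punycode" codec from the Python standard library. The helper names
to_a_label and to_u_label are made up for this illustration, and the
code assumes its input is already a valid U-label with at least one
non-ASCII character (no mapping or validation is attempted).

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Encode a U-label as an A-label (assumes a valid, non-ASCII U-label)."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Decode an A-label back to its U-label."""
    if not a_label.lower().startswith(ACE_PREFIX):
        raise ValueError("not an A-label: " + a_label)
    # Strip the ACE prefix and decode the remaining Punycode.
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


u = "bücher"
a = to_a_label(u)
print(a)                   # xn--bcher-kva
print(to_u_label(a) == u)  # True
```

Because each form can be recovered from the other, another
specification can refer to either one without ambiguity.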

Now, I think, as always, that we will always have mappings. Applications
will (and should) _ALWAYS_ try to "help" the user by trying to
understand what the user wants to "type". We have different keyboards,
different input mechanisms etc, so mappings will exist on different
abstraction layers, as Pete Resnick said some weeks ago.

Anyway, I think we have to say that what is stored, regardless
of where and how it is stored, MUST be the A-label/U-label. This is
because I think the "further away" from the A-Label/U-Label we come in
the abstraction, the more divergence we will get regarding support for
mappings. This is, by the way, where I have seen many problems with
IDNA2003. Some people have in fact stored what could be used as
"input" to the IDNA algorithm, and not the "output".
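
A small sketch of how that divergence can show up. Assume two
hypothetical applications that both try to help the user, but with
different mappings: one uses Unicode case folding, the other plain
lowercasing (Python's str.casefold and str.lower stand in for them
here). The function names are invented for the example, and the
encoding step is a simplification that does no validation.

```python
def encode_label(label: str) -> str:
    """Return the label as it would go on the wire (simplified)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def app_one(user_input: str) -> str:
    # Hypothetical application that maps with full case folding.
    return encode_label(user_input.casefold())


def app_two(user_input: str) -> str:
    # Hypothetical application that maps with lowercasing only.
    return encode_label(user_input.lower())


stored_input = "Straße"  # what was typed, stored as-is

# The same stored input resolves to two different names.
print(app_one(stored_input))  # strasse
print(app_two(stored_input))
print(app_one(stored_input) == app_two(stored_input))  # False
```

Had the output of one mapping been stored instead of the typed input,
both applications would be looking at the same label.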

Now you might say that if people did not read the IDNA2003 spec, why
would they read the IDNA2008 spec, but I am not prepared to give up
because of that.

2. M-Label

I have seen the discussions regarding M-Label, and I must say that, as
an editor of the tables document, I think it might be "more interesting"
to define MVALID as a property that is calculated in the tables
document.

Suggestion:

MVALID would be a codepoint that is mapped, according to the  
standardized mapping function, to something that is not DISALLOWED.
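
As a rough sketch only, the suggestion could be computed like this. The
mapping is passed in as a parameter because it is still an open
question. The is_disallowed predicate is a toy stand-in (a code point
counts as DISALLOWED if it is not stable under NFKC and case folding),
not the real derived property from the tables document, and the
requirement that the mapping actually changes the code point is one
possible reading of the definition. Both function names are made up.

```python
import unicodedata
from typing import Callable


def is_disallowed(ch: str) -> bool:
    """Toy stand-in: unstable under NFKC + case folding means DISALLOWED."""
    nfkc = unicodedata.normalize("NFKC", ch)
    stable = unicodedata.normalize("NFKC", nfkc.casefold())
    return stable != ch


def is_mvalid(ch: str, mapping: Callable[[str], str]) -> bool:
    """True if the mapping changes ch into code points that are all allowed."""
    mapped = mapping(ch)
    if mapped == ch:
        return False  # assumption: unmapped code points are not MVALID
    return not any(is_disallowed(c) for c in mapped)


print(is_mvalid("A", str.casefold))  # True: maps to "a"
print(is_mvalid("a", str.casefold))  # False: not mapped at all
print(is_mvalid("ß", str.casefold))  # True: maps to "ss"
print(is_mvalid("ß", str.lower))     # False: lowercasing leaves it alone
```

The last two lines show that the set of MVALID code points depends
directly on which mapping function is chosen.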

Next question would then be, what does the mapping function look like?

I have seen roughly the following suggestions, but I might have
misunderstood this, so please help, and my apologies if I have missed
something:

1. Casefold (C+F)
2. Lowercase (C+S)
3. "IDNA2003 valid _input_ codepoints that are not mapped to  
DISALLOWED and themselves DISALLOWED"

Questions:

A. Should we define MVALID?
B. What should the general rule be that defines it (we can have
codepoints in Exceptions as before)?

    Patrik
