U-labels, NFC, and symmetry

J-F C. Morfin jfc at morfin.org
Fri Apr 15 21:23:30 CEST 2011


  Yours is an excellent example of the IUI role and of the ML-DNS.
These two notions are not documented by RFCs yet (due to personal
delays), but they are the normal user-side continuation of what
IDNA2008 does on the Internet side.

As you may know, from the very beginning of the WG/IDNABis I raised
the question of its intended final purpose:

  - was it to satisfy the Internet's needs,
  - or those of innovation and of the (lead) users? Lead users like you and me.

  The response was clear enough: it was to satisfy the needs of the
Internet layers only. Given this clear target, I promised to document
and build the response to the users as a user-application front-end
called the ML-DNS, where ML stands for "multi-layer". In a multilingual
environment it was to support every language at the same level as
English. The solution was to support a "domain name pile" where the
Internet domain name corresponds to the two bottom layers (A-label
and U-label), but where many layers can be added on top to support
things like:

- metadata that Unicode does not support (such as what is needed for
French majuscules),
- encrypted domain names,
- various normalization forms (such as the one you are
investigating), command lines, etc.,

each being supported by its own layer's library or service. You could
conceptually understand it as "punyplus code", which will convert
different layers (such as yours) back and forth into A-labels.
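
As a minimal sketch of such a layer conversion, assume a hypothetical
NFD layer sitting above the U-label. The function names below are
illustrative only, Python's standard "punycode" codec stands in for the
real conversion, and none of the IDNA2008 validity checks (permitted
characters, Bidi, contextual rules) are performed.

```python
import unicodedata

ACE_PREFIX = "xn--"  # the ACE prefix named in the text


def layer_to_a_label(label: str) -> str:
    """Convert a label from the (hypothetical) NFD layer to an A-label.

    The label is first brought to NFC, the form a U-label must have,
    and is then Punycode-encoded. No IDNA validity checks are done.
    """
    u_label = unicodedata.normalize("NFC", label)
    if u_label.isascii():
        return u_label  # plain ASCII label: nothing to encode
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def a_label_to_layer(a_label: str) -> str:
    """Convert an A-label back to the (hypothetical) NFD layer."""
    if not a_label.lower().startswith(ACE_PREFIX):
        return a_label  # not an ACE label: return unchanged
    u_label = a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return unicodedata.normalize("NFD", u_label)


nfd_label = "cafe\u0301"                 # "café" in NFD: e + combining acute
a_label = layer_to_a_label(nfd_label)    # Punycode of the NFC form
round_trip = a_label_to_layer(a_label)   # back to the NFD layer

# Encoding the NFD string directly yields a different ASCII string, which
# is why this sketch passes through NFC before encoding.
direct = ACE_PREFIX + nfd_label.encode("punycode").decode("ascii")
print(a_label, round_trip == nfd_label, direct != a_label)
```

The A-label stays the pivot: the NFD layer never reaches the DNS, and
the application still gets its NFD string back on the return path.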

  NB. I believe that written network exchanges need a dedicated,
extended, network-oriented normalization form based on graphical
premises and (optionally) on the language being used.

1. That way A-labels can be considered as pivotal among the different
UDN (User Domain Name) formats, including the Internet Class Domain
Names (IDN) made of the A-label/U-label duality. Just remember that a
class is a consistent particular vision of network protocols. If you
can use the basic Internet through converters like punycode it will be
simpler for you. However, what the Internet can deliver, as
demonstrated by IDNA2008, is quite powerful and gradual. Before having
to consider a different class, you can consider your own
presentation. There was no presentation layer in the Internet
technology; yet, to support the language diversity that one needs, a
presentation layer was introduced through the "xn--" prefix, which
represents the "xn" presentation.

So, it is up to you to decide what you need. This is just 
acknowledging that the Internet is able to support the principle of 
subsidiarity, which is necessary to support the users' diversity, 
such as yours.

However, please remember RFC 3439: very large systems (and the
Internet is a very large system, multiplied into a very, very large
one by subsidiarity) must respect the principle of simplicity.
Keep it as simple as you can.

2. Now there is another problem. The Internet supports subsidiarity
without changing a single one of its bits: it is just a matter of
reading the RFCs from a different perspective. However, this is not the
case for IDNinA, the very initial IDN support concept, which does not
scale. If you have two applications on your machine that use two
different IDNA libraries, nothing guarantees that they will behave the
same way and resolve the same U-label to the same IP address.
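
A minimal sketch of that divergence, using only the Python standard
library: the built-in "idna" codec implements IDNA2003 (RFC 3490) with
its nameprep mapping, while the second function is a hypothetical
stand-in for a library that applies NFC and Punycode without any
mapping. Both function names are illustrative assumptions.

```python
import unicodedata


def to_ascii_with_idna2003(label: str) -> str:
    """First 'library': Python's built-in IDNA2003 codec (with nameprep)."""
    return label.encode("idna").decode("ascii")


def to_ascii_without_mapping(label: str) -> str:
    """Second, hypothetical 'library': NFC then Punycode, no mapping step."""
    nfc = unicodedata.normalize("NFC", label)
    if nfc.isascii():
        return nfc
    return "xn--" + nfc.encode("punycode").decode("ascii")


label = "stra\u00dfe"                      # "straße"
first = to_ascii_with_idna2003(label)      # nameprep maps the sharp s to "ss"
second = to_ascii_without_mapping(label)   # the sharp s is kept and encoded

# The two results name different DNS labels for the same user input.
print(first, second, first == second)
```

Two applications built on these two functions would query different
names for the same typed label, which is the case for a single shared
management layer.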

The consequence is that you need to have the ML-DNS in a single IDN
(A-label/U-label) management position, i.e. in a distinct layer that
all the applications will call. Actually, when you consider that new
middle layer, you quickly see that it can be the place for other
services to the user and that it is a new area of network
architecture: an Internet Use Interface. This IUI is actually made of
the Internet's extended intelligence, which we do not want on a robust
end-to-end network and which we have always placed on the fringe. The
IUI is at the Internet fringe. Its addition leads to a fringe-to-fringe
smart extended use of the end-to-end stratum, which itself is a
smart use of the plug-to-plug service offered by the Telecom stratum.

Then you can quickly see that this IUI can be an extension of the
Internet, and also a location for intelligent user front-ends, as
the ML-DNS is, with front-ends to the Internet and to other
technologies that may share their ML-DNS, for example. Then it becomes
the Intelligent Use Interface, serving an entire new world (different
technologies) and stratum (brain-to-brain semantics, what I call the
Intersem). From the issue I raised with the IESG and IAB (you can see
my appeals for clarification on their sites), you can see that this is
implicitly the emergence of an entirely new networking area.
Therefore, one needs a special SDO level to document it and experiment
with it.

I called it the "IUse" emergent community, in turn proposing an IUTF
liaising with the IETF through the iucg at ietf.org (internet/IETF users
contributing group) mailing list, supported by the
http://iucg.org/wiki site (I am currently rather behind schedule for
various personal and health reasons). This community needs to use
the Internet as a basic common central commodity and to extend it to
suit its members' individual needs in their own areas, at a minimum
architectural and developmental cost based on incremental simplicity.



At 23:59 07/04/2011, Peter Saint-Andre wrote:

>RFC 5890 states:
>    o  A "U-label" is an IDNA-valid string of Unicode characters, in
>       Normalization Form C (NFC) and including at least one non-ASCII
>       character, expressed in a standard Unicode Encoding Form (such as
>       UTF-8).  It is also subject to the constraints about permitted
>       characters that are specified in Section 4.2 of the Protocol
>       document and the rules in the Sections 2 and 3 of the Tables
>       document, the Bidi constraints in that document if it contains any
>       character from scripts that are written right to left, and the
>       symmetry constraint described immediately below.  Conversions
>       between U-labels and A-labels are performed according to the
>       "Punycode" specification [RFC3492], adding or removing the ACE
>       prefix as needed.
>    To be valid, U-labels and A-labels must obey an important symmetry
>    constraint.  While that constraint may be tested in any of several
>    ways, an A-label A1 must be capable of being produced by conversion
>    from a U-label U1, and that U-label U1 must be capable of being
>    produced by conversion from A-label A1.  Among other things, this
>    implies that both U-labels and A-labels must be strings in Unicode
>    NFC [Unicode-UAX15] normalized form.  These strings MUST contain only
>    characters specified elsewhere in this document series, and only in
>    the contexts indicated as appropriate.
>I'm updating the i18n handling in XMPP, and the XMPP community would
>like to use NFD on the wire for various reasons. Ideally we would like
>to do so without requiring a trip through NFC. However, it appears that
>we can do this only by using a term other than U-label, since that is
>tied to NFC. Indeed, it seems that a string in Unicode NFD normalized
>form is not an IDN label at all. This strikes me as unfortunate (I
>thought that normalization was handled only in RFC 5895 along with other
>such mapping issues), but probably because I do not understand how the
>symmetry requirement expressed in RFC 5890 necessitates the use of NFC.
>Would any of the i18n experts on this list care to enlighten me on the
>latter point?
>In the meantime, I shall pursue a way to specify XMPP domainparts
>independently of the term U-label.
>Peter Saint-Andre
>Idna-update mailing list
>Idna-update at alvestrand.no